Import Overview

Architecture Overview

In general, the import assumes the following setup:

  • a Camunda engine, from which the data is imported.
  • the Optimize back-end, where the data is transformed into an appropriate format for efficient data analysis.
  • Elasticsearch, which is the database of Optimize, where the formatted data is persisted.

The following diagram depicts the setup and how the components communicate with each other:

The main idea is that Optimize queries the engine data using the REST API and transforms the data in such a way that it can be queried easily and quickly by Optimize. To prevent the Optimize queries from putting too much load on the engine, while at the same time keeping the import fast, Optimize automatically adapts the number of queries to the engine's response time.
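To illustrate the idea, here is a minimal sketch of such an adaptive import loop, assuming Camunda's paged REST endpoint for historic activity instances; the page size, latency threshold, and backoff strategy are made-up illustration values, not Optimize's actual implementation:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class AdaptiveImportLoop {

        // Camunda 7 REST endpoint for historic activity instances (paged).
        private static final String ENGINE_URL =
                "http://localhost:8080/engine-rest/history/activity-instance";
        // Made-up threshold for an "acceptable" engine response time.
        private static final long TARGET_RESPONSE_MS = 500;

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            long backoffMs = 0;          // pause between queries, tuned at runtime
            int firstResult = 0;
            final int maxResults = 1000; // page size

            while (true) {
                Thread.sleep(backoffMs);
                HttpRequest request = HttpRequest.newBuilder(URI.create(
                        ENGINE_URL + "?firstResult=" + firstResult
                                + "&maxResults=" + maxResults))
                        .GET()
                        .build();

                long start = System.nanoTime();
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;

                // Adapt to the engine's response time: back off when the engine
                // is slow (under load), speed up again when it responds quickly.
                if (elapsedMs > TARGET_RESPONSE_MS) {
                    backoffMs = Math.min(backoffMs * 2 + 100, 60_000);
                } else {
                    backoffMs = backoffMs / 2;
                }

                if (response.body().equals("[]")) {
                    break; // empty page: everything has been imported
                }
                // transformAndIndex(response.body()); // persist to Elasticsearch
                firstResult += maxResults;
            }
        }
    }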

In addition, one should be aware of the following general properties of the data in Optimize:

  • Optimize does not own the data of the engine. The Optimize data set can always be removed and reimported or adapted to the needs of Optimize.
  • The data is only a near real-time representation of the engine database. That means Elasticsearch may not contain the data of the most recent time frame, e.g. the last two minutes, but all data before that should be synchronized.

If you are interested in the details of the import, have a look at the dedicated Import Procedure section.

Import Performance Overview

This document gives an overview of how fast Optimize imports certain data sets, to give you a feeling for the import speed of Optimize and whether it meets your demands. Note that the import speed presumably varies for different data sets and depends on how the involved components are set up. For example, if you deploy the Camunda platform on a different machine than Optimize and Elasticsearch, the import is likely to speed up.

Setup

The following components were used for the import:

Component          Version
Camunda Platform   7.7.0 (on Tomcat 8.0.24)
Elasticsearch      5.4.0
Optimize           1.0.0

For the Optimize configuration, the default settings were used; they are described in detail in the configuration overview.

All three components were running on a single laptop with the following specifications:

  • Processor: Intel® Core™ i5-6440HQ (6th generation), 4 x 2.60 GHz
  • Memory: 16 GB (DDR4)
  • Storage: 192 GB SSD

The time was then measured from the start of Optimize until the whole import of the data into Optimize was finished.
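One way to reproduce such a measurement, sketched below under the assumption of an optimize-process-instance index (the actual index names depend on your installation), is to record the start time and poll Elasticsearch's _count API until the document count stops growing:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class ImportDurationProbe {

        // Hypothetical Optimize index name; adjust to your installation.
        private static final String COUNT_URL =
                "http://localhost:9200/optimize-process-instance/_count";
        private static final Pattern COUNT = Pattern.compile("\"count\":(\\d+)");

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            long start = System.currentTimeMillis();
            long lastCount = -1;

            while (true) {
                Thread.sleep(10_000); // poll every ten seconds
                HttpRequest request =
                        HttpRequest.newBuilder(URI.create(COUNT_URL)).GET().build();
                String body = client
                        .send(request, HttpResponse.BodyHandlers.ofString())
                        .body();
                Matcher m = COUNT.matcher(body);
                if (!m.find()) {
                    throw new IllegalStateException("Unexpected response: " + body);
                }
                long count = Long.parseLong(m.group(1));
                if (count > 0 && count == lastCount) {
                    break; // count stopped growing: the import has caught up
                }
                lastCount = count;
            }
            System.out.printf("Import finished after ~%d seconds%n",
                    (System.currentTimeMillis() - start) / 1000);
        }
    }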

Medium size data set

The data set contains the following numbers of instances:

Number of Process Definitions   Number of Activity Instances   Number of Process Instances   Number of Variable Instances
46                              1 427 384                      261 106                       1 273 324

Here, you can see how the data is distributed over the different process definitions:

Results:

  • Duration of importing the whole data set: ~30 minutes
  • Speed of the import: 1 500-2 000 database rows per second during the import process
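As a sanity check on these numbers: the data set totals roughly 2 961 860 rows (46 + 1 427 384 + 261 106 + 1 273 324), so an import time of ~30 minutes corresponds to about 1 650 rows per second, which is consistent with the measured range.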

Small data set

The data set contains the following numbers of instances:

Number of Process Definitions   Number of Activity Instances   Number of Process Instances   Number of Variable Instances
10                              663 424                        62 897                        2 034 905

Here, you can see how the data is distributed over the different process definitions:

Results:

  • Duration of importing the whole data set: ~23 minutes
  • Speed of the import: 1 700-2 200 database rows per second during the import process
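As a sanity check here as well: the data set totals roughly 2 761 236 rows (10 + 663 424 + 62 897 + 2 034 905), so an import time of ~23 minutes corresponds to about 2 000 rows per second, again consistent with the measured range.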
