Data Warehouse Pipelines
Overview
Ember uses a highly efficient storage system called Ember Journal to store all trading messages, including order requests and order events. Ember can process more than 100,000 orders per second. At this rate, it accumulates roughly one gigabyte of trading history every minute.
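To put these figures in perspective, here is a back-of-the-envelope calculation. It uses only the two numbers quoted above and assumes a decimal gigabyte (10^9 bytes) and a peak rate sustained around the clock, which is an upper bound rather than a typical load.

```python
# Rough sizing from the figures quoted in the text.
ORDERS_PER_SECOND = 100_000        # "more than 100,000 orders per second"
BYTES_PER_MINUTE = 1_000_000_000   # "one gigabyte ... every minute" (decimal GB assumed)

orders_per_minute = ORDERS_PER_SECOND * 60

# Journal bytes attributable to one order, covering its request and all its events.
bytes_per_order = BYTES_PER_MINUTE / orders_per_minute

# Daily volume if the peak rate were sustained for 24 hours (an upper bound).
bytes_per_day = BYTES_PER_MINUTE * 60 * 24

print(f"{orders_per_minute:,} orders per minute")
print(f"~{bytes_per_order:.0f} bytes of journal data per order")
print(f"~{bytes_per_day / 1e12:.2f} TB per day at a sustained peak rate")
```

Under these assumptions the journal grows by about 167 bytes per order and up to 1.44 TB per day, which is why recent data stays in the journal while full history moves to a warehouse.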
While Ember is running, data warehouse pipelines stream or batch trading history to destinations suited to open data analytics and permanent storage. Ember supports the following data warehouses: TimeBase, ClickHouse, Kafka, S3, RedShift, and RDS SQL.
Data warehouses work in coordination with the Ember Journal Compactor to ensure that the operational storage size remains compact and all trading history is preserved.
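One way to picture this coordination is a "safe compaction point": the journal can drop only what every warehouse has already stored. The sketch below is illustrative and is not Ember's actual mechanism. The function name, the checkpoint dictionary, and the idea that each pipeline reports a last-stored sequence number are all assumptions made for the example.

```python
from typing import Dict


def safe_compaction_point(checkpoints: Dict[str, int]) -> int:
    """Return the highest journal sequence number that every warehouse has stored.

    Assumption (hypothetical): each pipeline reports the sequence number of the
    last journal message it has durably written to its destination.
    """
    if not checkpoints:
        # No warehouse configured: nothing is known to be preserved elsewhere,
        # so nothing is safe to compact.
        return 0
    # The slowest warehouse determines how much history can be dropped.
    return min(checkpoints.values())


# Hypothetical progress reported by three pipelines.
checkpoints = {"timebase": 1_250_000, "clickhouse": 1_180_000, "s3": 940_000}
print(safe_compaction_point(checkpoints))  # 940000
```

Taking the minimum means a slow or stalled warehouse delays compaction instead of causing data loss, which matches the stated goal of preserving all trading history.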
Ember retains an operational subset of data in memory, stores recent trading data in the journal, and streams all data to data warehouses, where it can be stored indefinitely. From this perspective, neither the Ember API nor the Ember Journal is the optimal place to retrieve information like "show me all trades for today." Instead, Ember delegates this task to data warehouses.
This design offers several advantages:
- The Ember Journal can be optimized for rapid sequential data insertion.
- The operational dataset can stay small.
- Reporting queries don't overwhelm Ember RPC channels.
Different warehouses can be set up to run in parallel.
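The parallel setup can be pictured as a fan-out: each journal message is handed to every configured pipeline, and each pipeline drains its own queue independently. This is a minimal sketch using Python threads and in-memory lists as stand-ins for real destinations. The pipeline names, `run_pipeline`, and the message strings are hypothetical and do not reflect Ember's implementation.

```python
import queue
import threading
from typing import Dict, List


def run_pipeline(name: str, inbox: "queue.Queue", delivered: Dict[str, List[str]]) -> None:
    """Drain one pipeline's inbox until the None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more messages
            break
        # Stand-in for writing to the real destination.
        delivered[name].append(message)


names = ["timebase", "clickhouse", "s3"]  # hypothetical pipeline names
inboxes = {n: queue.Queue() for n in names}
delivered: Dict[str, List[str]] = {n: [] for n in names}

workers = [
    threading.Thread(target=run_pipeline, args=(n, inboxes[n], delivered))
    for n in names
]
for worker in workers:
    worker.start()

# Hypothetical journal content: every message goes to every pipeline.
journal = ["NewOrder#1", "OrderAck#1", "Fill#1"]
for message in journal:
    for inbox in inboxes.values():
        inbox.put(message)

for inbox in inboxes.values():
    inbox.put(None)  # tell each pipeline to stop
for worker in workers:
    worker.join()

print({n: len(delivered[n]) for n in names})
```

Because each pipeline has its own queue, a slow destination does not hold back the faster ones, and each receives the complete history in journal order.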
Comparing Data Warehouses
Here's a comparison of the various data warehouses:
| | TimeBase | ClickHouse | Kafka | S3 | RedShift | RDS SQL |
|---|---|---|---|---|---|---|
| Max Sustained Rate (orders/sec) | 500K+ | 200K | TBD | 15K | 250 | 50 |
| Reports Performance | Very Good | Adequate | | | | |
| Query Language | QQL (Limited) | SQL subset Very Good | KQL | Athena uses Presto SQL | SQL subset | |
| Maintenance Effort | Medium-High | High | Medium | Very Low | Low | Low |
| Storage Cost | High | High | High | Low | High | High |
| GUI Client | TimeBase Admin | Tabbix | KafkaTool, etc. | AWS Athena Console | Any SQL client | Any SQL client |
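As a quick illustration of how the sustained-rate row can guide a choice, the sketch below filters warehouses by a required order rate. The rates are copied from the table. The helper name `warehouses_for_rate` is made up for this example, "500K+" is treated as a lower bound of 500,000, and Kafka's "TBD" is treated as unknown and therefore excluded.

```python
from typing import Dict, List, Optional

# Sustained max rate (orders/sec) from the comparison table; None means "TBD".
MAX_SUSTAINED_RATE: Dict[str, Optional[int]] = {
    "TimeBase": 500_000,   # listed as "500K+", used here as a lower bound
    "ClickHouse": 200_000,
    "Kafka": None,
    "S3": 15_000,
    "RedShift": 250,
    "RDS SQL": 50,
}


def warehouses_for_rate(orders_per_second: int) -> List[str]:
    """List warehouses whose documented sustained rate covers the given load."""
    return [
        name
        for name, rate in MAX_SUSTAINED_RATE.items()
        if rate is not None and rate >= orders_per_second
    ]


print(warehouses_for_rate(10_000))   # ['TimeBase', 'ClickHouse', 'S3']
print(warehouses_for_rate(100_000))  # ['TimeBase', 'ClickHouse']
```

Throughput is only one criterion; maintenance effort, storage cost, and query language from the same table usually matter as much.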
For a more detailed description of data warehouse configuration, visit the Ember Configuration Guide.