
Journal Recovery for Older Ember Versions

Ember provides a tool to recover a journal from the TimeBase data warehouse. You can read about it here.

Unfortunately, this tool is only available for Ember version 1.14.13+. This article describes the workaround steps used for a client who was running Ember version 1.11.X in PROD.

The recovery path was as follows:

  1. After a catastrophic power outage, an attempt to use the journal-recover tool was unsuccessful; the lost uncommitted portion of the journal contained an entire day of trading activity. HA/DR options were not deployed, and Ember was not configured to auto-flush journal data files.
  2. We decided to use the "EmberMessages" TimeBase data warehouse stream as the recovery source. This stream was copied from PROD to a working machine.
  3. We installed Ember 1.14 on the working machine.
  4. We launched the Ember service and created an empty journal (preparation step).
  5. We ran journal-import, which produced an Ember journal from the TimeBase message stream.
  6. We ran the journal-downgrade tool, which converts the journal version from 1.14 down to 1.11.
  7. We uploaded the resulting journal back to the client's PROD environment.
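
Steps 5 and 6 can be sketched as a small shell wrapper. The two commands and their arguments are the ones shown in the console outputs in this article; the `EMBER_DIR` variable (the directory that contains `bin/`), the `DRY_RUN` switch, the `TERM_ID` and `TARGET_VERSION` variables, and the `run` helper are assumptions made for illustration. Both tools ask for confirmation interactively, so the sketch does not try to answer the prompt. By default it only prints what it would execute.

```shell
#!/bin/sh
# Sketch of steps 5 and 6. Defaults to a dry run that only prints the commands.
# EMBER_DIR, DRY_RUN, TERM_ID and TARGET_VERSION are hypothetical names.
EMBER_DIR="${EMBER_DIR:-.}"
DRY_RUN="${DRY_RUN:-1}"
TERM_ID="${TERM_ID:-1673440967949}"      # journal term used in this article's run
TARGET_VERSION="${TARGET_VERSION:-24}"   # journal format version used in this article's run

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 5: build an Ember journal from the TimeBase warehouse stream.
# Step 6: convert the journal to the older format (runs only if step 5 succeeded).
run "$EMBER_DIR/bin/journal-import" -unit timebase -term "$TERM_ID" &&
run "$EMBER_DIR/bin/journal-downgrade" "$TARGET_VERSION"
```

Setting `DRY_RUN=0` executes the tools for real; the term and target format version should be replaced with the values that apply to your own journal.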

Output of Step 5 (journal-import):

$bin/journal-import -unit timebase -term 1673440967949

This tool will import journal from timebase warehouse.
Would you like to proceed? (y/n): y
2024-11-19 12:32:48.031 INFO [main] Selected time source: KeeperTimeSource (default, implicitly configured by TickDBClient)
2024-11-19 12:32:48.770 INFO [main] Importing journal with term 1673440967949 from timebase warehouse...
[6.588s][info][gc] GC(0) Pause Young (Concurrent Start) (Metadata GC Threshold) 139M->17M(4096M) 8.727ms
[6.588s][info][gc] GC(1) Concurrent Mark Cycle
[6.594s][info][gc] GC(1) Pause Remark 18M->18M(4096M) 1.714ms
[6.594s][info][gc] GC(1) Pause Cleanup 18M->18M(4096M) 0.003ms
[6.602s][info][gc] GC(1) Concurrent Mark Cycle 14.000ms
Processed: 0.0 %[8.470s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 203M->41M(4096M) 16.320ms
Processed: 0.6 %[9.275s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 219M->45M(4096M) 8.434ms
Processed: 15.1 %[11.541s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 239M->39M(4096M) 2.098ms
Processed: 35.2 %[43.419s][info][gc] GC(5) Pause Young (Normal) (G1 Evacuation Pause) 2493M->38M(4096M) 3.027ms
Processed: 58.7[68.268s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) 2492M->38M(4096M) 2.308ms
Processed: [92.043s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 2492M->38M(4096M) 1.804ms
Processed: [115.695s][info][gc] GC(8) Pause Young (Normal) (G1 Evacuation Pause) 2492M->38M(4096M) 2.680ms
Imported: 13842481 messages. Duration: 115 second
2024-11-19 12:34:44.754 INFO [main] Timebase connection closed by API call
2024-11-19 12:34:44.758 INFO [main] Successfully imported journal from timebase warehouse
2024-11-19 12:34:44.758 INFO [main] Backup of the Ember journal is stored in /deltix/emberhome/journal_backup_2

Output of Step 6 (journal-downgrade):

$bin/journal-downgrade 24

This tool will downgrade journal format to version 24.
Would you like to proceed? (y/n): y
Downgrading journal from version 26 to 24...
Processed: 4.1 GB of 4.1 GB. Duration: 16 seconds. Rate: 259.1 MB/s
Successfully downgraded journal to version 24
Backup of the journal is stored in /deltix/emberhome/journal_26_backup_1
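
Before moving the journal anywhere, it may be worth confirming that both tools reported success. The following is a minimal sketch, assuming the console output of each tool was captured to a file (for example by piping it through `tee`). The log file names and the `require_line` and `check_recovery_logs` helpers are hypothetical; the success messages are copied from the outputs above.

```shell
#!/bin/sh
# require_line FILE TEXT: succeed only if FILE contains TEXT as a fixed string.
require_line() {
    if grep -qF -- "$2" "$1"; then
        echo "ok: $2"
    else
        echo "MISSING in $1: $2" >&2
        return 1
    fi
}

# Hypothetical log files holding the captured console output of each tool.
IMPORT_LOG="${IMPORT_LOG:-journal-import.log}"
DOWNGRADE_LOG="${DOWNGRADE_LOG:-journal-downgrade.log}"

# Succeeds only if both success messages are present.
check_recovery_logs() {
    require_line "$IMPORT_LOG" "Successfully imported journal from timebase warehouse" &&
    require_line "$DOWNGRADE_LOG" "Successfully downgraded journal to version 24"
}
```

A call such as `check_recovery_logs && echo "journal ready to copy"` then acts as a simple gate before the journal is transferred. The version number in the second message should match the format version you downgraded to.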

At this point, /deltix/emberhome/journal/ was moved to the PROD environment. The client confirmed the presence of the most recent orders in PROD after restart.
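
This article does not say how the journal directory was transferred. One way to confirm that the files arrived intact is to compare checksums on both machines. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available on both sides; the helper names and the manifest file are hypothetical and are not part of Ember.

```shell
#!/bin/sh
# write_manifest DIR MANIFEST: record a SHA-256 checksum for every file under DIR.
# Keep MANIFEST outside DIR so that it is not included in its own listing.
write_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# verify_manifest DIR MANIFEST: recompute checksums under DIR and compare them
# with MANIFEST. Returns non-zero if any file is missing or differs.
verify_manifest() {
    ( cd "$1" && sha256sum -c --quiet ) < "$2"
}
```

For example, `write_manifest /deltix/emberhome/journal journal.sha256` could be run on the working machine, and `verify_manifest /deltix/emberhome/journal journal.sha256` on PROD after copying both the journal and the manifest, before Ember is restarted.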

note

The described procedure does not recover any risk limits that were defined in the system and kept in the original client journal. Check whether you need to re-define risk limits before going live.