journal-sequence-fix

Summary of the Problem

Ember has a generic journal repair tool that tries to recover from journal corruption caused by failed media or disk I/O. That tool is described in the [[journal-recover]] document. This document describes how to fix a different type of problem, one that was introduced in Ember version 1.6.96 and persisted up to version 1.7.22.

In version 1.7.29 we introduced a data integrity check that verifies that message sequence numbers contain no duplicates. Message sequence numbers are used as order identifiers by some Ember algorithms and trading connectors. This check detected out-of-sequence situations in journals that were upgraded from Ember 1.6. Although we found and fixed the cause of these situations, you may need to run the journal-fix utility to repair your journal data.
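
The idea behind such a check can be illustrated on a plain list of numbers. The sketch below is conceptual only: it is not Ember's implementation and does not read the journal format. `find_sequence_regressions` is a hypothetical helper that flags any number that is not greater than the highest one seen before it.

```shell
# Conceptual sketch only (hypothetical helper, not part of Ember).
# Reads one sequence number per line on stdin and reports every number
# that is not greater than the highest number seen so far.
# Exit status is 1 if at least one such number was found, 0 otherwise.
find_sequence_regressions() {
    awk '
        NR > 1 && $1 <= max {
            printf "line %d: sequence %d after %d\n", NR, $1, max
            bad = 1
        }
        NR == 1 || $1 > max { max = $1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example, `printf '%s\n' 294166 294167 294168 294165 294169 | find_sequence_regressions` prints `line 4: sequence 294165 after 294168` and returns a non-zero exit status.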

Sample error message that you may see during Ember startup when a journal message sequence number is broken:

java.lang.IllegalArgumentException: Sequence: 294168 more than in message: 294165
    at deltix.ember.service.EmberInvoker.updateSequence(EmberInvoker.java:372)
    at deltix.ember.service.EmberInvoker.invoke(EmberInvoker.java:210)
    at deltix.ember.service.EmberInvoker.invoke(EmberInvoker.java:19)
    at deltix.anvil.service.proxy.ServiceInvokerBridge.invoke(ServiceInvokerBridge.java:21)
    at deltix.anvil.service.proxy.journaled.JournaledServiceWorker$1.onMessage(…)
    at deltix.anvil.journal.JournalReader.read(JournalReader.java:65)

This problem corresponds to internal defect #615. The rest of this document describes the repair procedure.

Prerequisites

  • The Ember process itself and ALL Ember satellite processes, such as Ember Monitor, Ember Data Warehouse, and Ember Journal Compactor, as well as the Deltix Trading node, must be STOPPED while the journal repair takes place.
  • Make sure you have enough disk space. Measure the size of the $EMBER_WORK/journal directory and ensure that you have at least that much free space (the tool will make a backup copy of your journal).
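
The disk space prerequisite can be checked with standard tools. The following is a minimal sketch, assuming a POSIX shell with `du`, `df` and `awk` available. `has_room_for_backup` is a hypothetical helper, not part of Ember, and it assumes the tool writes its copies on the same filesystem as the work directory (as the paths in the sample output in this document suggest).

```shell
# Sketch: succeed only if the filesystem holding the work directory has at
# least as much free space as the journal currently occupies.
# has_room_for_backup is a hypothetical helper, not part of Ember.
has_room_for_backup() {
    work_dir="$1"
    if [ ! -d "$work_dir/journal" ]; then
        echo "No journal directory under $work_dir" >&2
        return 2
    fi
    # Journal size in KiB.
    used_kb=$(du -sk "$work_dir/journal" | awk '{print $1}')
    # Free KiB on the filesystem that holds the work directory
    # (POSIX output format: fourth column of the second line).
    free_kb=$(df -Pk "$work_dir" | awk 'NR==2 {print $4}')
    echo "journal: ${used_kb} KiB, free: ${free_kb} KiB"
    [ "$free_kb" -ge "$used_kb" ]
}
```

A typical call would be `has_room_for_backup "$EMBER_WORK" || echo "Not enough free space"`.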

Repair Procedure

There is a special utility called journal-sequence-fix that should be used for journal sequence number repair.

Do not forget to define the EMBER_WORK environment variable: the tool relies on it to locate the $EMBER_WORK/journal subdirectory. If you are using the Docker image, this environment variable is already defined.
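
A quick pre-flight check can catch a missing variable before the tool is started. This is a minimal sketch. `check_ember_work` is a hypothetical helper, not part of Ember, and the path in the commented usage lines is taken from the sample output in this document; substitute your own.

```shell
# Sketch: fail early if EMBER_WORK is unset or has no journal subdirectory.
# check_ember_work is a hypothetical helper, not part of Ember.
check_ember_work() {
    if [ -z "${EMBER_WORK:-}" ]; then
        echo "EMBER_WORK is not defined" >&2
        return 1
    fi
    if [ ! -d "$EMBER_WORK/journal" ]; then
        echo "No journal directory under $EMBER_WORK" >&2
        return 1
    fi
    echo "Journal directory: $EMBER_WORK/journal"
}

# Possible usage (path is an assumption, adjust to your installation):
#   export EMBER_WORK=/var/deltix/emberwork
#   check_ember_work && /opt/deltix/ember/bin/journal-fix
```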

Once started, the tool scans the journal and then asks for your confirmation before replacing it:

$ /opt/deltix/ember/bin/journal-fix
       Processed: 8.0 MB of 815.4 MB. Duration: 0 seconds. Remaining: 33 seconds. Rate: 23.8 MB/s
Resetting sequence: 294168 to: 294164.
Message: {"$type":"OrderStatusRequest…,"sequence":294165,"symbol":"LTCUSD",…}
       Processed: 815.4 MB of 815.4 MB. Duration: 3 seconds. Rate: 217.1 MB/s

Journal has been FIXED (problems fixes: 1). Fixed journal is stored in /var/deltix/emberwork/journal-fixed-1614107801160.
Would you like to backup and replace the existing journal with fixed copy? (y/n): y

Journal is OK. Fixes: 1. Backup is stored in /var/deltix/emberwork/journal-backup-1614107801160.

A backup of the previous journal is saved under $EMBER_WORK in a `journal-backup-NNN` directory.
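
To confirm that the backup was written, the work directory can be listed for backup directories. This is a minimal sketch, assuming the `journal-backup-<number>` naming seen in the sample output above. `list_journal_backups` is a hypothetical helper, not part of Ember.

```shell
# Sketch: print the names of journal-backup-* directories found directly
# under the given work directory, ordered by their numeric suffix.
# list_journal_backups is a hypothetical helper, not part of Ember.
list_journal_backups() {
    work_dir="$1"
    for d in "$work_dir"/journal-backup-*; do
        # Skip plain files and the unexpanded pattern when nothing matches.
        [ -d "$d" ] && basename "$d"
    done | sort -t- -k3 -n
}
```

For example, `list_journal_backups "$EMBER_WORK"` prints one directory name per line, or nothing if no backup exists yet.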

Running this utility under Docker:

docker run --entrypoint "/opt/deltix/ember/bin/journal-fix" -it \
-v "/home/staging/EmberHome/home:/var/lib/emberhome" \
-v "/home/staging/EmberHome/work:/var/lib/emberwork" \
"artifactory.epam.com:6193/deltix.docker/ember/ember-tssr-algo:0.6.6"