
FIX Gateway Administrator Guide

Overview

Deltix FIX Gateway is part of the Deltix QuantServer solution for brokers, exchanges, and dark pools.

Deltix FIX Gateway can route orders to, and receive market data from, the following endpoints:

  • Deltix Connectors to various execution venues. Deltix has experience connecting to more than a hundred brokers and exchanges, ranging from exchanges like CME and Eurex to brokers like Bloomberg and CTP.
  • Deltix Execution Algorithms (SOR, TWAP, VWAP, ICEBERG, etc.).
  • Deltix Matching Engines.
  • Deltix Exchange Simulators.

FIX Order Flow can be subject to customizable pre-trade risk checks. The Deltix FIX Gateway component implements Order Entry and Market Data Gateways using FIX Protocol 4.4.

Design

This product can be deployed as a group of Order Entry and/or Market Data FIX gateways running on a single server or across multiple servers. Each gateway uses two CPU cores and handles a specific group of FIX client sessions (usually 100-1000).

FIX Gateway diagram

The Market Data Gateway works with the Deltix TimeBase and Aggregator products to distribute market data. The Order Entry Gateway works with the Deltix Execution Server and TimeBase to route orders and re-broadcast order events.

Market Data Gateway

Deltix FIX Gateway may run multiple instances of a Market Data Gateway (MDG). Each gateway re-broadcasts data from a pre-configured set of streams in TimeBase (Deltix's internal messaging middleware and time-series database).

Each Market Data Gateway consists of Transport and Session components that run as separate threads and are typically assigned to specific CPU cores.

This section describes how the Session component works.

Main Logic

The Market Data Gateway performs two main tasks in a repeating, alternating pattern:

  • Ingest - Receive market data to be published.
  • Emit - Send market data to clients, one symbol at a time.

This pattern can be likened to a pump or heart, as it receives data in, then sends it out, then receives data in again, and so on.
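To illustrate, here is a minimal Java-style sketch of this duty cycle (a sketch only; names like ingest and emit are hypothetical, not the actual Deltix API):

// Illustrative sketch of the MDG "pump"; method names are hypothetical
final class MarketDataPump {
    private volatile boolean running = true;

    void run() {
        while (running) {
            boolean ingested = ingest(); // drain updates that are immediately available, without blocking
            boolean emitted = emit();    // flush the next round-robin symbol, if it has pending changes
            if (!ingested && !emitted) {
                Thread.onSpinWait();     // nothing to do this cycle; back off briefly
            }
        }
    }

    private boolean ingest() { /* group updates by symbol, mark symbols dirty */ return false; }
    private boolean emit()   { /* send one symbol's pending data to all subscribers */ return false; }
}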

Ingest Phase

The Ingest phase only executes if there is new data immediately available without blocking. All ingested data is grouped by symbol. The MDG tracks the market state (order book) for each symbol independently, accumulates all trades for that symbol, and tracks whether specific symbols received any updates.

When the MDG decides to send data for a specific symbol, it sends the same data to all clients subscribed to that symbol.

Note: The MDG utilizes the fact that most outbound FIX messages contain similar fragments of data that can be reused between clients. This optimizes network bandwidth usage by sending shared parts of the message only once, and the unique data for each client separately.

The MDG only sends data to subscribed clients if:

  • The market data has changed since the last flush OR
  • A new client was subscribed for this specific symbol

Emit Phase

During the Emit phase, the MDG picks a symbol to be sent on a round-robin basis, following a fixed order defined in the MDG configuration.

If there is too much trade data accumulated for a symbol during ingestion, that symbol's data is immediately sent (flushed) during the ingestion phase. See the messageBufferSize configuration option.

Client Message Order

To ensure fairness among all clients consuming market data, the FIX Gateway rotates consumers for a symbol on each flush. This means that the order of messages is different for each update, and all clients receive market data in a fair and balanced manner.

For example, suppose we have four FIX clients, A, B, C, & D:

  • Order of messages in 1st update: A, B, C, D
  • Order of messages in 2nd update: B, C, D, A
  • Order of messages in 3rd update: C, D, A, B
  • Order of messages in 4th update: D, A, B, C
  • Order of messages in 5th update: A, B, C, D
  • Etc.
Note: The exact method used to shift the round-robin order may vary between different versions of the FIX Gateway.
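For illustration, one possible way to implement such a rotation in Java (a sketch; not the actual gateway code):

import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch of rotating the client send order on each flush
final class FairClientRotation<T> {
    private int start; // index of the client that goes first on the next flush

    void flush(List<T> clients, Consumer<T> send) {
        int n = clients.size();
        if (n == 0) return;
        for (int i = 0; i < n; i++) {
            send.accept(clients.get((start + i) % n)); // A,B,C,D then B,C,D,A then ...
        }
        start = (start + 1) % n; // shift the starting client for the next update
    }
}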

Market Data Type

Deltix Market Data Gateway supports two market data feed modes:

  • Periodic Snapshots only – This mode broadcasts snapshots of the order book (top N levels) at a given interval (e.g. 10 milliseconds). It is optimized for simple client consumption and for supporting a large number of connected clients.
  • Incremental Updates and Snapshots – This mode is designed for low-latency market data dissemination and delivers every market data update to a smaller number of FIX clients.

Configuration Options

  • maxLevelsToPublish - Specifies the number of levels to be included in the published order book snapshot. Higher values result in larger snapshot message sizes and increased outbound traffic.

    Note: This option is applicable only for the snapshot-only market data gateway type. It cannot be used for the incremental market data gateway type.

  • expectedMaxBookLevels - Specifies the number of levels expected in the published order book. This value must match the order book depth recorded by the exchange (e.g. the Aggregator Data Connector setting outputBookSize).

    Note: This option can be configured for the incremental market data gateway type.

  • messageBufferSize - Specifies the size of the FIX message buffer (one per symbol). It determines the maximum size of an outbound FIX message and controls how many trade messages can be batched together during the Ingest phase. A smaller buffer results in a larger number of FIX messages. The default size is 10 kilobytes; values like “20K” (20 kilobytes) can be used. If the buffer size is insufficient, the “flush” happens during the Ingest phase and the regular update interval is disrupted.

  • minimumUpdateInterval - Specifies the minimum time interval between two snapshots (per symbol). This option prevents too frequent snapshots. The MDG does not send new data for a symbol to clients if:

    • The symbol's order book was updated less than the specified number of milliseconds ago
    • AND the outbound FIX message buffer is not full

    If the outbound message buffer for a symbol is full during the Ingest phase, this option is ignored. This option can be set as low as 0. The default value is 10 milliseconds.

    Note: In incremental mode, this parameter controls how often the MDG tries to send snapshots to any client that "needs" a snapshot. A client "needs" a snapshot if:

    • The client just connected and has not received any snapshots yet, OR
    • The MDG had a queue overflow, missed some incremental messages, and must re-send the full data set to the client.

    Normally, the incremental mode does NOT broadcast periodic snapshots.
    Caution: Setting minimumUpdateInterval to zero may produce an order book snapshot after each incoming order book update, resulting in a very high update rate. FIX clients may not be able to handle such a flow, and overall latency may degrade severely. The only reasonable case for a zero interval is when the inbound message frequency is known to be low and will not overwhelm FIX clients.

  • staleMarketDataTimeout – Can be used to detect stale market data feeds (when the connection seems up but no data is flowing for a data source that normally never goes quiet). This option is disabled by default.

    Caution: Do not set the staleMarketDataTimeout parameter without consulting Deltix Tech Support first. This option may interfere with the normal operation of the MDG. For example, it may cause issues when upstream data has regular maintenance hours or when an internal matching engine does not receive many orders (in these cases, a quiet period is expected).

  • sendQueueSize – Specifies the size of the send buffer of each transport session, in bytes. The default is 32 megabytes (32M).
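For reference, the options above might be combined in a gateway configuration block like the following (a sketch only; the marketdata section name and the MD1 gateway name are assumptions, by analogy with the trade gateway examples later in this guide):

gateways {
  marketdata {
    MD1 {
      settings {
        maxLevelsToPublish: 10       // snapshot-only mode: top 10 book levels per snapshot
        messageBufferSize: "20K"     // per-symbol outbound FIX message buffer
        minimumUpdateInterval: 10    // minimum milliseconds between snapshots per symbol
        sendQueueSize: "32M"         // transport send buffer per session
        ...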

Outbound Socket Buffer Size Example

Here is a pseudocode example for computing the required outbound socket buffer size:

bufferTimeCapacityMs = 1000

estimatedMaxMarketMessageSize = 256 + 2 * 64 * maxLevelsToPublish

messagesToBuffer = bufferTimeCapacityMs / minimumUpdateInterval

socketSendBufferSize = estimatedMaxMarketMessageSize * messagesToBuffer

Above is the recommended minimum value. The optimal value is about 4x greater, depending on the use case.
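As a worked example, assuming maxLevelsToPublish = 10 and the default minimumUpdateInterval of 10 milliseconds:

estimatedMaxMarketMessageSize = 256 + 2 * 64 * 10 = 1536 bytes
messagesToBuffer = 1000 / 10 = 100
socketSendBufferSize = 1536 * 100 = 153600 bytes (~150K minimum, ~600K with the 4x margin)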

For example, you may want to set it to 256K:

sysctl -w net.core.wmem_default=262144
sysctl -w net.core.wmem_max=262144
Note: If you are experiencing disconnects due to backpressure, the first thing to check is the socket send buffer size.

Here is a sample of how to adjust the socket send buffer size for docker-compose:

ember:
  app: …
  sysctls:
    - net.core.wmem_default=262144
    - net.core.wmem_max=262144

Note that net.core.wmem_default and net.core.wmem_max are the default and maximum send buffer size respectively, measured in bytes.

Flow Control

In case of high load, the main bottleneck for the MDG is the outbound message queue.

The MDG uses the following pressure relief approach:

  • If the MDG fails to send order book data (tags 269=0 or 269=1), the message is discarded.
  • If the MDG fails to send trade data (tag 269=2), the affected client gets disconnected.

In the Transport component of the MDG, a failed send attempt due to a full outbound buffer results in the immediate disconnect of a client.
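The session-level policy above can be summarized with a short Java-style sketch (illustrative only; not the actual implementation):

// Illustrative sketch of the MDG pressure-relief policy
// FIX tag 269 (MDEntryType): 0 = bid, 1 = offer, 2 = trade
final class PressureReliefPolicy {
    void onSendFailure(char mdEntryType, Runnable disconnectClient) {
        if (mdEntryType == '2') {
            disconnectClient.run(); // trades cannot be throttled: drop the lagging client
        }
        // bids/offers ('0'/'1') are simply discarded; a later snapshot restores the book state
    }
}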

Behavior on Different Load Levels

This section describes the behavior of the snapshot-only Market Data Gateway (MDG), which changes with the amount of data coming from the exchange.

The “load levels” described below are examples of MDG behavior in varying circumstances. Please keep in mind that there are no explicit “load levels” in the MDG itself.

In snapshot-only mode, the MDG publishes two types of messages: market data snapshots and trades. While market data snapshots can be throttled (the most recent snapshot effectively describes the current state of the market), trades cannot be.

No Load

If there is no data from the upstream feed, the MDG runs idle. No data is sent to clients.

Counter values:

Transp.ActiveCycles ~ 0, Transp.IdleCycles > 0, Transp.SendQueueSize = 0

Low Load

Under low load, any arriving update from an exchange triggers a “flush” for the affected contract (but no more often than minimumUpdateInterval).

Counter values:

Transp.ActiveCycles-rate > 0, Transp.IdleCycles-rate > 0, Transp.SendQueueSize < sendQueueSize * 1%

Medium Load

Under medium load, regular upstream data does not cause any problems for outbound network traffic. In this mode, snapshots for each (active) contract are published every minimumUpdateInterval.

Counter values:

Transp.ActiveCycles-rate > 0, Transp.SendQueueSize < sendQueueSize * 25%, Server.MDPublishDelayBackPressure-rate = 0

High Load

Under high load, the MDG receives too many market events to push both frequent snapshots and trades to clients. In this case, the MDG sacrifices the frequency of order book snapshots in favor of trades. The more market data that comes in, the higher the percentage of trades in the outbound traffic and the lower the frequency of published market data snapshots.

Counter values:

Server.MDPublishDelayBackPressure-rate > 0, Server.TradePublishFailure-rate = 0; also Transp.SendQueueSize > sendQueueSize * 50%

Overload

The MDG becomes overloaded when the outbound networking layer simply cannot publish all upstream trade messages. In this situation, the MDG relieves the pressure by reducing the number of connected clients. There is no strict disconnect policy in this scenario: the overload protection policy is per contract, and one or several clients subscribed to the overloaded contract can be disconnected.

Counter values:

Server.MDPublishDelayBackPressure-rate > 0, Server.TradePublishFailure-rate > 0

Example of MDG Working Cycle

Let us assume a gateway needs to publish prices for the following contracts: AAA, BBB, CCC, DDD.

The order of events could be:

“Ingest” phase 1:

  • Get update #1 on BBB
  • Get update #2 on AAA
  • Get update #3 on DDD
  • Get update #4 on AAA
  • Get update #5 on AAA

“Emit” phase 1:

  • Send snapshot with updates #2, #4, #5 for symbol AAA

“Ingest” phase 2:

  • No new data -> skip

“Emit” phase 2:

  • Send update #1 for symbol BBB

“Ingest” phase 3:

  • Get update #6 on AAA

“Emit” phase 3:

  • No accumulated updates for CCC -> skip
  • Send update #3 for symbol DDD

“Ingest” phase 4:

  • Get update #7 on DDD
  • Get update #8 on CCC

“Emit” phase 4:

  • We have data for AAA but we sent a snapshot for it recently (less than 10ms) -> skip
  • No accumulated updates for BBB -> skip
  • Send update #8 for symbol CCC

“Ingest” phase 5:

  • No new data -> skip

“Emit” phase 5:

  • We have data for DDD but we sent a snapshot for it recently (less than 10ms) -> skip
  • We have data for AAA but we sent a snapshot for it recently (less than 10ms) -> skip
  • No accumulated updates for BBB -> skip
  • No accumulated updates for CCC -> skip
  • Nothing to do for now

...After 10ms of market inactivity...

“Ingest” phase 1001:

  • No new data -> skip

“Emit” phase 1001:

  • Send update #6 for symbol AAA

“Ingest” phase 1002:

  • No new data -> skip

“Emit” phase 1002:

  • No accumulated updates for BBB -> skip
  • No accumulated updates for CCC -> skip
  • Send update #7 for symbol DDD

Order Entry

Configuration Options

  • sendCustomAttributes – When enabled, the FIX gateway populates ExecutionReport messages with custom attributes from order events.
  • sendEmberSequence - When enabled, the FIX gateway populates ExecutionReport FIX messages with an Ember message sequence (Journal message sequence number) using tag 9998.

Trader ID Resolution

By default, the gateway uses the FIX message tag SenderSubId(50) to convey trader identity. This follows the approach used by exchanges such as CME iLink. To override this, use the traderIdResolution setting of each order entry gateway.

  • SENDER_SUB_ID - Uses FIX tag SenderSubId(50) to convey trader identity (default mode).
  • SESSION_ID - Uses FIX tag SenderCompId(49) to convey trader identity.
  • CUSTOMER_ID – Provides a traderID with each FIX session through the Ember configuration file.
  • DTS_DATABASE – Associates one or more CryptoCortex User IDs (GUIDs) with each session through the CryptoCortex Configurator (new since Ember 1.8.14). In this case, the FIX tag SenderSubId(50) must match the CryptoCortex user ID (GUID). The FIX Gateway validates that the order's user is indeed associated with the specific FIX session.

Example:

gateways {
  trade {
    OE1 {
      settings {
        traderIdResolution: DTS_DATABASE
        ...

Custom FIX Tags Forwarding

The FIX Order Entry gateway can be configured to pass a custom set of FIX tags from inbound FIX messages as custom attributes in a normalized OrderRequest message.

Before version 1.14.34, the FIX Gateway passed the following custom tags:

  • Text(80)
  • ExecInst(18)
  • ClOrdLinkID(583)
  • ContingencyType(1385)
  • Any tags in the 6000-8999 range

Starting with version 1.14.34, the default set of custom attributes is "18,6000-8999" (tag 18 plus all tags in the 6000-8999 range).

The set of custom tags can be customized by setting the customAttributesSet option in the configuration file:

gateways {
  trade {
    OE1 {
      settings {
        customAttributesSet: "80,1024,6000-8999"
        ...

Please note:

  • If your FIX clients send you complex orders (e.g., bracket orders), make sure to include tags 583 and 1385 in the customAttributesSet option. These tags carry the ClOrdLinkID and ContingencyType parameters of complex orders.
  • If the Text(80) tag is used to pass any important information in the FIX order message, it should also be included in the customAttributesSet list.

Message Transformer

The Order Entry Gateway can transform inbound order requests before they are placed into the Execution Server's OMS queue. This customizable logic can be used, for example, to correct an order destination.

Here is an example of a built-in transformer that modifies each order request's Destination fields based on the specified exchange.

transformer: {
  factory = "deltix.ember.service.engine.transform.CaseTableMessageTransformerFactory"
  settings {
    rules: [
      // (Destination1)? | (Exchange1)? => (Destination2)? | (Exchange2)?
      "*|DELTIXMM => DELTIXMM|DELTIXMM",
      "*|HEHMSESS1 => HEHMSESS1|HEHMSESS1",
      "*|HEHMSESS2 => HEHMSESS2|HEHMSESS2"
    ]
  }
}

User Identification

The FIX Session tag SenderCompID(49) plays a key role in identifying client messages downstream. The value of this tag becomes the Source ID of requests and the Destination ID of response messages in downstream APIs.

To maximize efficiency, the current implementation uses an ALPHANUMERIC(10) codec to convert the text value of this tag to INT64 values circulating inside the Deltix system. The maximum length of this identifier is 10 characters.

The client session identifier SenderCompID(49) must be unique across the entire system.
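For illustration, a compact identifier like this can be packed into a single INT64 using a few bits per character. Below is a minimal sketch of one such scheme (10 characters at 6 bits each plus a length nibble); the actual Deltix codec may differ in details:

// Illustrative sketch of packing a short alphanumeric ID into an INT64 (not the actual Deltix codec)
final class AlphanumericSketch {
    static long encode(String id) {
        if (id.length() > 10)
            throw new IllegalArgumentException("max 10 characters");
        long packed = (long) id.length() << 60;     // length in the top 4 bits
        for (int i = 0; i < id.length(); i++) {
            int c = id.charAt(i) - 0x20;            // printable ASCII 0x20..0x5F -> 6 bits
            if (c < 0 || c > 0x3F)
                throw new IllegalArgumentException("unsupported character");
            packed |= (long) c << (54 - 6 * i);     // characters fill the remaining 60 bits
        }
        return packed;
    }
}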

Typically, each client gets a pair of FIX sessions: one for Market Data and another for Order Entry. These two sessions can be hosted on the same or different FIX gateway hosts.

Client Database

The database of FIX sessions can be stored in a static configuration file or database (most traditional SQL databases are supported). Other sources can be easily implemented on request.

Security

Each FIX Session uses a dedicated port. This allows for a firewall configuration that opens each individual port to a specific FIX client source IP (IP whitelisting). When firewall-based source IP checking is not available, valid source IPs can be specified in the settings of each FIX session.

Deltix relies on a third-party SSL termination mechanism to encrypt FIX traffic, such as stunnel or AWS NLB. An SSL layer is required for production deployments.
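For example, a minimal stunnel configuration terminating TLS in front of a single FIX session port might look like this (a sketch; the section name, ports, and paths are illustrative):

[fix-session-1]
accept  = 0.0.0.0:11001        ; TLS port exposed to the FIX client
connect = 127.0.0.1:10001      ; plain-text port of the FIX Gateway
cert    = /etc/stunnel/fix.pem
key     = /etc/stunnel/fix.key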

Client Authentication

Each FIX client must provide a password with a LOGON message to establish a new session.

FIX Gateways support a simple password-checking mechanism (e.g. hashed passwords in a text file or SQL database), as well as a custom asynchronous authentication mechanism, such as a REST or RabbitMQ microservice. See the Ember Configuration Reference for more information.

Logging

FIX Gateway itself does not provide a built-in ability to capture FIX logs. This was an intentional design decision. When FIX log capture is required, a specialized network packet capture solution can be used. In the simplest case, this could be software packages like tcpdump or tshark. Hardware capture (e.g. using router port mirror) can be used in high-end cases. Solutions like Amazon Traffic Mirroring can also be useful. For more information, see the appendices at the end of this document.

November 2021 Update: Deltix now provides a specialized Docker container called fix-logger that can capture FIX messages for given server port(s). FIX messages are dumped to the console output and can be redirected to the log aggregator of your choice. The default implementation supports Graylog. For more information, see Appendix A.

Performance

System performance characteristics depend on many factors, including:

  • The number of connected FIX clients.
  • User activity patterns - Aggressive traders put a lot of stress on the trading event backflow, while passive traders lead to a very deep order book and potentially increase the market data event backflow.
  • Network bandwidth - Under some loads, the FIX gateway can consume the entire network bandwidth, whether it's 1G or 10G.
  • CPU Speed - Formatting and parsing FIX messages is fairly CPU intensive. Matching Engine and Execution Algorithms can also be CPU intensive.
  • Disk Speed (Journal recording speed).

Current benchmark results are available as separate documents. Deltix is constantly working on performance optimization of FIX Gateway and core downstream modules.

Ballpark numbers: The Dell PowerEdge R630 Server should be able to serve about 500 actively trading FIX users (assuming each user generates a trading flow of 250 requests per second and receives back 2-5 events per request). All flows are optimized for low latency rather than high throughput. The FIX to FIX pass through latency of FIX Gateway is measured in single digit microseconds.
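In other words, 500 users at 250 requests per second amounts to roughly 125,000 inbound requests per second, and at 2-5 events per request, roughly 250,000-625,000 outbound events per second.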

Scalability

Vertical Scalability

FIX Gateway supports a set of connections by using two CPU cores: one for the transport level and another for the FIX session level and encoding/decoding. Multiple instances of FIX gateway can be launched on a single server, provided there are enough hardware resources.

Horizontal Scalability

FIX Gateway uses the Deltix TimeBase and Aeron UDP messaging framework to communicate via a high throughput / low latency UDP network protocol with downstream services like Deltix OMS, Matching Engine, or execution algorithms. Multiple FIX gateway servers can be employed to share client load.

Overload Protection (Flow Control)

Under severe load, FIX Gateway takes protective measures. The system attempts to reduce the amount of additional work entering it. For example, if the downstream system is overloaded, the FIX Session layer starts rejecting inbound order requests.

FIX Gateway flow control

High Availability

FIX Gateway is stateless. In the event of server failure, FIX clients can reconnect to a backup server.

Monitoring

FIX Gateway publishes performance counters that can be monitored using various tools.

As a reference, Deltix provides integration with Zabbix, which allows charting vital metrics and setting up alerts. Other monitoring systems can be supported on user request. More information about monitoring can be found in the Monitoring Ember Metrics document.

Some counters are described in Appendix C: Monitoring Counters.

Execution Server Monitor

The Execution Server Monitor web-app has several panels described in the sections below.

FIX Sessions

The FIX Sessions panel allows operators to:

  • Select a FIX gateway
  • See information about each session
  • View aggregate statistics for the whole gateway

Operators can disconnect or enable/disable the selected FIX Session.

FIX Gateway sessions screen

Orders & Trades

On the Orders and Trades panels, operators can inspect the details of specific orders, and cancel or discard orders if necessary.

FIX Gateway orders screen

Kill Switch

The Execution Server Monitor has a kill switch in the form of a Halt Trading button, located in the upper right corner. When activated, all new order requests are automatically rejected.

Appendix A: FIX Traffic Capture

Capture using FIX Logger utility

FIX-logger is a tool for capturing messages from FIX sessions and logging their contents in text form, either to standard output or to Graylog. It captures multiple sessions (unencrypted, TCP) connected to a single specified host and provides additional data that can be useful for debugging, such as src/dst IP addresses, timestamps, and total packet statistics. The application can also be used to extract FIX sessions from packet capture (.pcap) files.

FIX-logger currently supports mainstream x86-64 Linux systems, including MUSL-based ones (Alpine Linux). arm64 and macOS support is possible, but builds are not currently distributed.

Quick tool usage example

Here we capture FIX sessions that are bound to TCP port range 7001-7100 using network interface eth0:

docker run --network host --rm -it registry.deltixhub.com/deltix.docker/fix-tools/fix-logger-alpine:0.6.5 --device=eth0 --ports=7001-7100
Tool arguments
  • --help - print the version, some usage examples, and a brief parameter description, then exit.
  • --list - list capture devices.
  • --silent, -s - no logging. May be useful when processing stdout without filtering.
  • --verbose, -v - extra logging.
  • --device= - specify capture device. Mandatory, unless a pcap filename is used instead. Either a device name or an integer index can be used as the argument.
  • --host= - specify the FIX server address. An address or at least a port range is mandatory to obtain usable output.
    • Should contain an IPv4 address and/or port(s), separated by :
    • If the IPv4 part is omitted, the : is still mandatory
    • If a port/port list is not specified here, it should be provided via the --port argument
    • A few valid example values: 10.10.1.234, 192.168.1.10:8000-8099, :2000-2003,3000-3009,8888
  • --port= / --ports= - specify port(s) (see below) separately from the host address. --port=1234 has the same meaning as --host=:1234. Mandatory if the port list is not specified elsewhere.
  • --console-mode= - specify the stdout output header format. Possible choices: {disabled,brief,extended}. brief is the default mode. Examples below:
    • 2023-06-12 09:09:05.137213,OUT,8=FIX.4.4|9=... - brief
    • [2023-06-12 09:09:05.137213 SESSION: 1 DIR: OUT (10.0.1.181:10002->10.0.1.124:59686)] - extended; the FIX message starts on the next line.
  • --interactive, -i - enable raw keyboard processing, so keys like Esc or Enter can be used. Otherwise, the program can only be terminated with Ctrl-C or SIGINT/SIGTERM signals. Older versions used interactive mode by default.
  • --pcap-thread-affinity=<CPU list> - pin the main worker thread to the specified CPUs
  • --worker-thread-affinity=<CPU list> - pin the UI thread to the specified CPU
  • --gelf-tcp= - address and port for Graylog server. TCP/GELF input is expected. Example graylog.my-company.com:4321. See Graylog output above.
  • --gelf-no-timestamp - do not set actual capture timestamp in the messages output to Graylog.
  • --mt-mode= - Multithreading mode. Integer value [0..2]. The application can use 1 or 2 threads to process the data. When operating in a docker container, it often makes sense to use --mt-mode=1 to switch into single-threaded mode. Single-threaded mode is also always used when processing pcap files offline. Otherwise, this parameter exists for troubleshooting/debugging.

Where a port list is a list of comma-separated port numbers or ranges. Example: 1000-1003,1005,1009,1100-1110

CPU list is a list of comma-separated CPU indices or ranges, starting from 0. Example: 0-3,8

Tool limitations
  • The tool expects a TCP/IP packet stream without losses and retransmissions; otherwise, the received FIX messages can be malformed. Warnings or errors will be displayed in such situations.
  • VLAN tags probably won't work
  • WiFi capture likely won't work
  • IPv4 only, no support for IPv6 yet
  • Use IP address, not hostname, when specifying FIX server network address. This is to avoid ambiguity that may be caused by DNS resolution.

FIX logger running as Kubernetes Pod

To capture traffic on port 9001 and send the captured messages to Graylog, use the following example Kubernetes pod:

- name: fix-logger
  image: registry.deltixhub.com/deltix.docker/fix-tools/fix-logger-alpine:0.6.5
  command:
    - /bin/sh
    - '-c'
    - >-
      /fix-logger --device=eth0 --port=9001 --host=$(POD_IP) -v
      --mt-mode=1 --gelf-tcp=monitoring-graylog.monitoring:12201
  env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: CONTAINER_NAME
      value: fix-logger
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 500m
      memory: 512Mi
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  imagePullPolicy: IfNotPresent

Note: -v provides verbose output to the console, but it is likely not needed.

Capture using tcpdump

Given that each client connection utilizes a unique port, it's easy to capture messages between the FIX Gateway and a specific client connected on a known port.

To capture traffic for port 10011 and network interface ens3, use the following example:

tcpdump -A -i ens3 port 10011 -w capture.pcap
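The resulting capture.pcap file can later be inspected offline, for example:

tcpdump -A -r capture.pcap

Alternatively, the FIX Logger utility described earlier in this appendix can process .pcap files offline.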

Capture using tshark

Tools like tshark and wireshark have support for FIX Protocol. Find a good tshark tutorial to customize recordings to your specific needs.

A basic example is:

tshark -l -n -i ens3 -t ad -R 'fix.SenderCompID == "DELTIX" or fix.TargetCompID == "DELTIX"' -f 'port 10011' -V

Sample output:

Financial Information eXchange Protocol
BeginString (8): FIX.4.4
BodyLength (9): 0327
MsgType (35): 8 (EXECUTION REPORT)
SenderCompID (49): DELTIX
TargetCompID (56): MAKER1
SendingTime (52): 20190311-17:34:41.075
MsgSeqNum (34): 2
SenderSubID (50): 9C8217B5-4E36-4D2D-822C-A71E22154044
Account (1): SingleOrderTest
AvgPx (6): 0
ClOrdID (11): 7379694062678251088
CumQty (14): 0
ExecID (17): 1552312318433
ExecInst (18): M (MID PRICE PEG)
HandlInst (21): 1 (AUTOMATED EXECUTION ORDER PRIVATE NO BROKER INTERVENTION)
OrderQty (38): 1
OrdStatus (39): 8 (REJECTED)
OrdType (40): P (PEGGED)
Side (54): 1 (BUY)
Symbol (55): BTCUSD1
Text (58): Order symbol is not defined in Security Metadata database
TimeInForce (59): 1 (GOOD TILL CANCEL)
TransactTime (60): 20190311-17:34:41.074
ExecBroker (76): OMEGADARK
ExecType (150): 8 (REJECTED)
LeavesQty (151): 0
SecurityType (167): FOR (FOREIGN EXCHANGE CONTRACT)
CheckSum (10): 019 [correct]
[Good Checksum: True]
[Bad Checksum: False]

This example captures traffic on port 10011 and network interface ens3. It filters messages where either the SenderCompID or TargetCompID is "DELTIX". The -V flag prints verbose output to the console.

The output includes various fields like MsgType, SenderCompID, TargetCompID, SendingTime, MsgSeqNum, and so on. The values of these fields can be used to analyze FIX traffic for specific purposes.

For example, the sample output above shows an execution report that was rejected due to an unknown order symbol.

Appendix B: FIX Log Capture using AWS Traffic Mirroring

When FIX Gateway is hosted on AWS, you can use Traffic Mirroring to capture FIX logs. With Traffic Mirroring, you can mirror FIX traffic using user-controlled filters and copy it to the network interface of another host.

VPC Flow capture

Setup instructions

To set up FIX log capture using AWS Traffic Mirroring, follow steps 1-5 below.

Step 1: Create a FIX Log Monitoring Instance

To create an instance that is used as a traffic target, follow these steps:

  1. Configure two network interfaces for the instance:

    1. eth0 for administrative traffic
    2. eth1 for traffic capture

      Note: We recommend assigning the name "FIX Gateway Mirror Target" to network interface eth1 for clarity.

  2. Use m5.xlarge running Amazon Linux 2 for the instance.

  3. For security purposes:

    1. Do not assign a public IP to this instance.

    2. The Security Group for this instance must allow VXLAN Traffic (UDP Port 4789):

      VPC Flow 1

Once the instance is created, connect to it via SSH and enter the following command:

$ ip a

You should see output similar to the following:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 06:3c:5f:2c:ad:74 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.227/24 brd 10.0.0.255 scope global dynamic eth0
valid_lft 2366sec preferred_lft 2366sec
inet6 fe80::43c:5fff:fe2c:ad74/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 06:74:e5:1b:b3:fc brd ff:ff:ff:ff:ff:ff
inet 10.0.0.223/24 brd 10.0.0.255 scope global dynamic eth1
valid_lft 2545sec preferred_lft 2545sec
inet6 fe80::474:e5ff:fe1b:b3fc/64 scope link

This output confirms that both network interfaces are up and running.

Step 2: Define a Mirror Target

To define a new Mirror Target, go to the VPC Console and select the eth1 interface of the newly created instance.

Give clear names to the interfaces to avoid confusion.

VPC Flow 2

Step 3: Define a Traffic Mirror Filter

Next, define a traffic mirror filter to capture the traffic that we want. Typically, a FIX gateway allocates a single port for each client, and each gateway reserves port ranges for existing and future clients. In this step, we define separate rules for market data and order entry port ranges.

Use the filter wisely and avoid sending too much traffic to the target machine.

VPC Flow 3

As shown in the diagram above, inbound traffic is limited to destination ports 10001-10200 and 12001-12200, which correspond to the port ranges of the Market Data and Order Entry gateways in our system. Note that the capture filter is not a firewall ACL and captures traffic coming from anywhere (0.0.0.0/0).

Step 4: Traffic Mirror Session

For the final step, we need to associate our traffic source with a traffic target and use the filter we defined in the previous step.

Make sure to use a network interface used by the FIX Gateway instance as the traffic source.

Step 5: Quick test

Run tcpdump on the traffic capture host to make sure that everything works:

$ sudo tcpdump -A -i eth1

.]=..\]C8=FIX.4.4.9=0790.35=W.49=DELTIX.56=DUSER173.52=20190930-21:11:51.955.34=6460833.262=16.55=XRPUSD.167=FOR.268=22.269=2.270=1.26886.271=6.5797=1.269=2.270=1.28393.271=2.5797=1. 269=2.270=1.29548.271=17.5797=1.269=2.270=1.29943.271=13.5797=1.269=2.270=1.30947.271=4.5797=1.269=2.270=1.31833.271=17.5797=1.269=2.270=1.29351.271=8.5797=1.269=2.270=1.31833.271=8. 5797=1.269=2.270=1.31945.271=44.5797=1.269=2.270=1.31946.271=31.5797=1.269=2.270=1.31946.271=30.5797=1.269=2.270=1.31953.271=54.5797=1.269=2.270=1.2306.271=3.5797=2.269=2.270=1.25386 .271=56.5797=2.269=2.270=1.25874.271=32.5797=1.269=2.270=1.29855.271=4.5797=1.269=2.270=1.29391.271=34.5797=2.269=2.270=1.29391.271=42.5797=2.269=2.270=1.28673.271=16.5797=2.269=2.27 0=1.28119.271=24.5797=1.269=2.270=1.28119.271=1.5797=1.269=2.270=1.29855.271=32.5797=1.10=049.8=FIX.4.4.9=1011.35=W.49=DELTIX.56=DUSER173.52=20190930-21:11:51.956.34=6460834.262=9.55 =BTCUSD.167=FOR.268=29.269=2.270=1.29851.271=7.5797=1.269=2.270=1.28556.271=37.5797=2.269=2.270=1.28556.271=5.5797=2.269=2.270=1.27559.271=17.5797=2.269=2.270=1.29851.271=61.5797=1.2

The most common errors include:

  • Using the wrong network interfaces
  • Using an incorrect capture filter
  • Not allowing VXLAN traffic in the security group of the capturing instance

Further steps

Note that captured traffic will have some additional encapsulation headers (VXLAN).

Advanced topics:

  • Maximum packet size is 8946 bytes.
  • AWS prioritizes production traffic over mirrored traffic. In the case of network congestion, mirrored traffic can be delayed or dropped.
  • There are some Amazon Marketplace solutions available for reporting and analyzing captured traffic.
  • Amazon allows routing mirrored traffic to a network load balancer and using auto-scaling for monitoring instances.

A good AWS traffic mirroring video can be found here.

Appendix C: Monitoring Counters

This section provides a brief list of metrics that are useful when monitoring overloads. For a complete list of monitoring metrics, see Monitoring Ember Metrics.

Common metrics (for MarketDataGateway and TradeGateway) published to Ember Monitor include:

  • Transp.IdleCycles - The number of times when the transport thread finishes its execution cycle with zero work done.
  • Transp.ActiveCycles - The number of times when the transport thread finishes its execution cycle with non-zero work done.
  • Transp.ActiveTime - The total time spent on work at the transport layer. This metric must be enabled by the gateway configuration option measureActiveTime = "true" (see the example after this list). It is turned off by default because it calls System.nanoTime() each time the thread switches between active and idle states. It is not recommended to turn this counter on if you run more than one MDG in the same Ember instance.
  • Transp.SendQueueSize - The maximum observed size of the outbound queue over 1-second interval on the side of the consumer (transport layer).
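For example, Transp.ActiveTime might be enabled inside the gateway settings block (a sketch; the enclosing structure follows the configuration examples shown earlier in this guide):

settings {
  measureActiveTime = "true"   // enables the Transp.ActiveTime counter (off by default)
  ...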

MarketDataGateway-specific metrics:

  • Server.MDPublishDelayBackPressure – Reports the number of times the MDG delayed sending market data to clients due to back pressure (when the outbound message queue is at least half full). If you observe growth in this metric, the Market Data Gateway is unable to send data as fast as it is configured to (see the minimumUpdateInterval setting).
  • SubsWaitingForSnapshotAdded - The number of times incremental subscriptions (across all symbols) started waiting for a snapshot (because they just subscribed or were affected by backpressure).
  • SubsWaitingForSnapshotRemoved - The number of times incremental subscriptions (across all symbols) stopped waiting for a snapshot (because the client unsubscribed or the snapshot was sent).
  • SubsForcedToWait - The total number of events when a subscription was switched into the "waiting for snapshot" state because of backpressure.