Kafka

RBHQ uses Kafka to stream data

Apache Kafka is an open-source distributed event streaming platform used for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications

The platform is well documented, so we recommend going through the official "Getting Started" and "Quickstart" pages if you are not already familiar with Kafka:

https://kafka.apache.org/documentation/#gettingStarted

https://kafka.apache.org/quickstart

Key Concepts

The key concepts relevant to consuming streamed data from Kafka are

Events

An event records the fact that "something happened" in the world or in your business. It is also called a record or a message in the documentation. When you read data from or write data to Kafka, you do this in the form of events. RBHQ publishes these events as JSON documents.
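As a minimal sketch of handling such an event in Python, the helper below decodes a raw message value into a dictionary. It assumes the value is a UTF-8 encoded JSON object. The function name decode_event and the field names in the sample payload are hypothetical and do not reflect the actual RBHQ schema.

```python
import json


def decode_event(raw_value: bytes) -> dict:
    """Decode a Kafka message value into a dict, assuming UTF-8 encoded JSON."""
    event = json.loads(raw_value.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object at the top level")
    return event


# Hypothetical payload for illustration only; not the actual RBHQ schema.
sample = b'{"raceId": "example-1", "status": "OPEN"}'
print(decode_event(sample)["status"])
```

Rejecting anything other than a JSON object is a design choice in this sketch, so that malformed events fail early rather than further down the pipeline.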

Topics

Events are organized and stored durably in topics. In simplified terms, a topic is similar to a folder in a filesystem, and the events are the files in that folder.

There will be three topics, containing streamed data for:

  1. Prices, scratchings/unscratchings and deductions
  2. Race status
  3. Race results

Brokers

Brokers are the servers that host the topics. Multiple brokers can be combined to form a fault-tolerant Kafka cluster.

Consumer API

The Consumer API allows applications to read streams of data from topics in the Kafka cluster
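A minimal consume loop might look like the sketch below. It assumes the confluent-kafka Python client, whose Consumer exposes poll(timeout) returning a message or None, with error() and value() available on each message. The helper consume_events, the handle callback and the max_messages and max_idle_polls parameters are hypothetical names used for illustration only.

```python
import json
from typing import Callable


def consume_events(consumer, handle: Callable[[dict], None], max_messages: int,
                   poll_timeout: float = 1.0, max_idle_polls: int = 30) -> int:
    """Poll `consumer` until max_messages events are handled or it stays idle.

    `consumer` is expected to behave like confluent_kafka.Consumer and to be
    subscribed to a topic already. Returns the number of events handled.
    """
    handled = 0
    idle = 0
    while handled < max_messages and idle < max_idle_polls:
        msg = consumer.poll(poll_timeout)
        if msg is None:  # nothing arrived within the timeout
            idle += 1
            continue
        idle = 0
        if msg.error():  # an error is attached to the message instead of data
            raise RuntimeError(f"Kafka error: {msg.error()}")
        handle(json.loads(msg.value()))  # values are assumed to be JSON documents
        handled += 1
    return handled
```

With confluent-kafka installed, this would typically be used by creating a Consumer from a configuration dictionary, calling subscribe() with a list containing the topic name, passing the consumer to consume_events, and calling close() on it afterwards.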

A list of client libraries can be found here:

https://cwiki.apache.org/confluence/display/KAFKA/Clients

Integration notes

Broker Endpoints

These are the endpoints that the producer and consumer APIs connect to.

Authentication and Security

Data transferred between brokers and clients is encrypted using SSL (TLS), and access to the brokers is restricted by IP whitelists.
Clients authenticate with a username and password using SASL/SCRAM-SHA-512.

The broker endpoints, authentication protocol, credentials and target topic names are standard connection parameters that should be configurable in the respective client libraries.
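As an illustration, these parameters could be collected as follows for a librdkafka-based client such as confluent-kafka. This is a sketch under assumptions: SASL_SSL is inferred from the combination of SSL encryption and SASL authentication described above, the consumer group id and the offset reset policy are choices made for the example, and the helper name, environment variable names and default values are hypothetical placeholders.

```python
import os


def build_consumer_config(bootstrap_servers: str, username: str, password: str,
                          group_id: str) -> dict:
    """Return librdkafka-style settings for SASL/SCRAM-SHA-512 over SSL."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",    # SASL authentication over an encrypted link
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.username": username,
        "sasl.password": password,
        "group.id": group_id,               # assumption: chosen by the consuming side
        "auto.offset.reset": "earliest",    # assumption: start from the oldest retained event
    }


# Placeholder values; the real connection details are provided separately.
config = build_consumer_config(
    bootstrap_servers=os.environ.get("KAFKA_BROKERS", "broker.example.invalid:9196"),
    username=os.environ.get("KAFKA_USERNAME", "example-user"),
    password=os.environ.get("KAFKA_PASSWORD", "example-password"),
    group_id=os.environ.get("KAFKA_GROUP_ID", "example-consumer-group"),
)
```

Reading the credentials from the environment keeps them out of source control. Other client libraries use different property names for the same settings.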

Networking

The broker endpoints listen on port 9196, so make sure your network is configured to allow outbound connections to this port.
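One way to check the network path before configuring a Kafka client is a plain TCP connection test, sketched below. The helper name can_reach_broker is hypothetical, and the host passed to it would be one of the broker endpoints provided separately. A successful result only shows that the port is reachable; it does not exercise SSL or authentication.

```python
import socket


def can_reach_broker(host: str, port: int = 9196, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False
```

If a call such as can_reach_broker with the broker host name returns False, the likely causes are a firewall blocking outbound traffic to port 9196 or a client IP address that has not yet been whitelisted.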

The required connection details, as well as the names of the three topics, will be provided separately once we receive the list of IP addresses of the connecting clients.