A High-Level Overview
Developers who want to write market-data-driven applications face quite a few barriers to entry. A developer would probably need to:
- Either write their own uploader client, or re-purpose an existing uploader.
- Write a service/API to accept the data in whatever format(s) the uploader(s) they’d like to support use.
- Actually get players to point their uploader at said service/API (This is the hardest part!)
- Probably pull data from other market sites to flesh out their data set.
- Then, and only then, start writing the fun part of their application as the amount of data coming in slowly grows.
None of these tasks are fun, and they all involve re-inventing the wheel. By the time the developer gets through all of this (if they do), burnout is a distinct possibility. All before getting to the fun stuff!
EVE Market Data Relay (EMDR) allows you to forgo all of this drudgery and instead connect to a firehose of data in the standardized Unified Uploader Data Interchange Format. EMDR’s ZeroMQ underpinnings also make consuming the feed easier, and far more efficient, than accepting HTTP uploads directly.
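As a rough illustration of what connecting to the firehose might look like, here is a minimal consumer sketch using pyzmq. Two things in it are assumptions rather than facts taken from this overview: the relay address is a placeholder, and each message is assumed to be a zlib-compressed JSON document. Check both against the relay you actually connect to.

```python
import json
import zlib


def decode_message(raw: bytes) -> dict:
    """Decode one feed message (assumed here: zlib-compressed JSON)."""
    return json.loads(zlib.decompress(raw).decode("utf-8"))


def consume(relay_url: str) -> None:
    """Subscribe to a relay and print the result type of each message."""
    # pyzmq is imported here so decode_message() stays usable without it.
    import zmq

    context = zmq.Context()
    subscriber = context.socket(zmq.SUB)
    subscriber.connect(relay_url)
    # An empty prefix subscribes to every message the relay publishes.
    subscriber.setsockopt(zmq.SUBSCRIBE, b"")
    while True:
        message = decode_message(subscriber.recv())
        print(message.get("resultType"))
```

Calling `consume("tcp://relay.example.com:8050")` with a real relay address in place of the placeholder would block and print one line per received message.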
During the early design and development of EMDR, these were the main pillars we built on:
- There should be no single point of failure. Every component of the architecture should be simple to make redundant using trusted volunteered machines.
- The application must be able to accept an extremely large number of incoming market orders without performance issues.
- The cost for people hosting parts of EMDR’s network should be kept to an absolute minimum. This means being stingy with CPU, RAM, and bandwidth. Likewise, consuming the feed shouldn’t break the bank, either.
- It must be very easy to scale the system without restarts/reconfigs on the primary setup.
- The broadcasting of the market data needs to happen in a “fan out” manner. In this way, we can keep adding additional subscribers without running into scalability issues.
How it all fits together
For any given submitted market order, here is the flow said order goes through:
(Gateway) -> (Announcer) -> (Relays) -> (Applications)
First, the order hits the Gateway, which is a simple HTTP application that parses the message. Incoming messages are in Unified Uploader Data Interchange Format.
The Gateway interprets the message, validates it, normalizes anything weird, then pipes it to all of the root-level Announcers in the network.
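The validation step could be sketched roughly as follows. This is not the Gateway's actual code: the function name is hypothetical, and the set of required keys is an assumed subset of the Unified Uploader Data Interchange Format's top-level fields, so consult the format specification before relying on it.

```python
import json

# Assumed subset of top-level keys; the format spec is the authority.
REQUIRED_KEYS = ("resultType", "version", "columns", "rowsets")
KNOWN_RESULT_TYPES = ("orders", "history")


def parse_and_validate(body: str) -> dict:
    """Parse an upload body and reject anything obviously malformed."""
    try:
        message = json.loads(body)
    except ValueError as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(message, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in message]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    if message["resultType"] not in KNOWN_RESULT_TYPES:
        raise ValueError("unknown resultType")
    columns, rowsets = message["columns"], message["rowsets"]
    if not isinstance(columns, list) or not isinstance(rowsets, list):
        raise ValueError("columns and rowsets must be lists")
    for rowset in rowsets:
        if not isinstance(rowset, dict):
            raise ValueError("each rowset must be a JSON object")
        # Every row must supply exactly one value per declared column.
        for row in rowset.get("rows", []):
            if not isinstance(row, list) or len(row) != len(columns):
                raise ValueError("row length does not match columns")
    return message
```

A message that passes these checks would then be handed on to the Announcers; anything that raises `ValueError` would be rejected at the HTTP layer.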
The Announcer is the first tier of our market data distribution. Announcers relay any data they receive to Relays that are connected to the Announcer. There are only a few Announcers, and these only accept connections from approved Relays. Most Relays connect to multiple Announcers for added redundancy.
The Relay, like the Announcer, is a dumb repeater of everything it receives. Relays receive data from their Announcers, then pipe it out to any subscribers that are connected to them. Subscribers can be other Relays, or actual user sites/applications.
By using our system of Relays, we keep bandwidth usage and costs lower on the top-level Announcers. We are also able to keep “fanning out” to improve redundancy and serve greater numbers of consumers without large increases in bandwidth utilization.
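To make the bandwidth argument concrete, here is a toy in-memory model of the fan-out. It involves no networking and is not how EMDR is implemented; the class and function names are made up for illustration. It only counts how many copies of a message each tier has to send.

```python
class Node:
    """A dumb repeater that forwards every message to its subscribers."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []
        self.copies_sent = 0

    def subscribe(self, target):
        self.subscribers.append(target)

    def receive(self, message):
        for target in self.subscribers:
            self.copies_sent += 1
            target.receive(message)


class Application:
    """An end consumer that simply stores what it receives."""

    def __init__(self):
        self.received = []

    def receive(self, message):
        self.received.append(message)


def build_network(relay_count, apps_per_relay):
    """Wire one announcer to relay_count relays, each serving some apps."""
    announcer = Node("announcer")
    relays, apps = [], []
    for index in range(relay_count):
        relay = Node("relay-%d" % index)
        announcer.subscribe(relay)
        relays.append(relay)
        for _ in range(apps_per_relay):
            app = Application()
            relay.subscribe(app)
            apps.append(app)
    return announcer, relays, apps
```

With `build_network(2, 3)`, a single call to `announcer.receive("order")` delivers the message to all six applications while the announcer itself sends only two copies. Adding more applications behind the relays leaves the announcer's outbound count unchanged.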
We are left with a very efficient, very sturdy data relay network. The next section goes into detail about fault-tolerance.
Security is something we take seriously, but let’s consider the current reality of market data with EVE sites: Players upload market data directly to market sites. We are no less secure than that. Uploads can be faked, and malicious payloads can be sent, though EMDR will do its best to catch anything harmful.
As a consumer, you may wish to cross-reference incoming data. In many cases, you will get the same data point multiple times, as several players upload the same thing. This can be used to your advantage.
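One way a consumer might use those repeats is to treat a data point as corroborated only after it has been seen more than once. The sketch below is an assumption about how that could be done, not part of EMDR; the `Corroborator` class, the threshold of two, and the example field names are all hypothetical. It counts sightings rather than distinct uploaders, so a single uploader repeating itself would still pass.

```python
import hashlib
import json
from collections import Counter


def fingerprint(data_point: dict) -> str:
    """Stable hash of a data point; key order does not affect the result."""
    canonical = json.dumps(data_point, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Corroborator:
    """Counts identical data points and reports when enough have arrived."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.seen = Counter()

    def observe(self, data_point: dict) -> bool:
        """Record one sighting; return True once the threshold is reached."""
        key = fingerprint(data_point)
        self.seen[key] += 1
        return self.seen[key] >= self.threshold
```

A stricter variant could key the count on a per-uploader identifier as well, if the incoming messages provide one.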
This is the least interesting part of the overview, so it goes toward the end.
- EMDR is written in Python.
- All network-related stuff is handled by ZeroMQ, which is an incredibly simple and performant networking library.
- gevent is used for its excellent greenlet-based Queue, Workers, and async network I/O.
- The gateway HTTP servers run bottle.
The entire stack is super low overhead, and very fast.