Atlas is a dimensional time-series database management system for near real-time operational analytics.[02]
- Source Code: https://github.com/Netflix/atlas[01]
- Developer: Netflix
- Country of Origin: US
- Start Year: 2014 [03]
- Project Type: Open Source
- Written in: Scala
History[02]
Prior to Atlas, Netflix used an in-house tool called Epic together with a commercial system. Between 2011 and 2014 the number of metrics Netflix monitored surged from 2 million to 1.2 billion. Because of this growth, Netflix decided to build a system from scratch to suit its requirements.
Compression
Atlas stores two types of data. 1) Tags: no compression is applied to tags in memory, but strings and sets are deduplicated to avoid redundancy. Older data is stored in S3 in compressed form. 2) Time series values: these are stored in a specific block format, which may be compressed depending on the data.
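The string and set deduplication described for tags can be illustrated with a small interning table. This is only a sketch of the general technique: the `TagInterner` class and its method names are hypothetical and are not taken from the Atlas code base.

```python
class TagInterner:
    """Toy string and tag-set deduplication table (hypothetical, not an Atlas API)."""

    def __init__(self):
        self._strings = {}   # string value -> canonical shared string
        self._tag_sets = {}  # frozenset of (key, value) pairs -> canonical shared set

    def intern_string(self, s):
        # Return the first instance seen with this value so equal strings share storage.
        return self._strings.setdefault(s, s)

    def intern_tags(self, tags):
        # Deduplicate every key and value, then deduplicate the whole tag set.
        canonical = frozenset(
            (self.intern_string(k), self.intern_string(v)) for k, v in tags.items()
        )
        return self._tag_sets.setdefault(canonical, canonical)


interner = TagInterner()
a = interner.intern_tags({"name": "cpu", "node": "i-01"})
b = interner.intern_tags({"node": "i-01", "name": "cpu"})
```

Here `a` and `b` end up referring to the same object, so a tag set that is repeated across many time series is held in memory only once.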
Foreign Keys
Atlas behaves more like a search engine than a relational database, so it has no need for foreign keys.
Indexes
Like Lucene, Atlas treats tags as documents and builds an inverted index over them to find the set of matching time series. The key building block of the inverted index is an integer set implementation, for which roaring bitmaps are used. Once the set of matches is found, fetching the time series data is effectively a lookup into a large map from id to time series.
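A minimal sketch of this lookup path follows. It assumes a simplified model in which Python's built-in `set` stands in for the roaring bitmap and a plain list serves as the id-to-time-series map. The `TagIndex` class and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index over tagged time series (hypothetical, not Atlas code)."""

    def __init__(self):
        self._postings = defaultdict(set)  # (key, value) -> set of integer series ids
        self._series = []                  # series id -> (tags, values)

    def add(self, tags, values):
        # Assign the next integer id and register it under every tag pair.
        series_id = len(self._series)
        self._series.append((tags, values))
        for pair in tags.items():
            self._postings[pair].add(series_id)
        return series_id

    def find(self, query):
        # Intersect the posting sets of every requested key/value pair.
        sets = [self._postings.get(pair, set()) for pair in query.items()]
        ids = set.intersection(*sets) if sets else set(range(len(self._series)))
        # Final step: plain lookup from id to time series values.
        return {i: self._series[i][1] for i in sorted(ids)}


index = TagIndex()
index.add({"name": "cpu", "node": "i-01"}, [0.5, 0.7])
index.add({"name": "cpu", "node": "i-02"}, [0.2, 0.1])
index.add({"name": "disk", "node": "i-01"}, [9.0, 9.5])
matches = index.find({"name": "cpu", "node": "i-01"})
```

A compressed bitmap plays the same role as the `set` here, but keeps the posting lists compact when there are very many series ids.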
Joins
Atlas does not support full relational-database-style joins, but it does support simple joins in the form of binary math operations. Simple aggregates (like sum) produce a single time series. If a time series expression is grouped, there are two cases: 1) If both sides are grouped (they must be grouped on the same keys), the corresponding entries on each side are matched and the operation is applied to each pair. 2) If only one side is grouped, the single result from the ungrouped side is applied to each result from the grouped side.
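The two grouping cases can be sketched as follows. The representation is an assumption made for illustration: a grouped result is a dict from group key to a list of values, and an ungrouped result is a plain list. The `binary_op` helper is hypothetical, and dropping keys that appear on only one side is a choice of this sketch rather than documented Atlas behavior.

```python
import operator


def binary_op(op, lhs, rhs):
    """Apply op element-wise between two expression results.

    A grouped result is a dict mapping a group key to a list of values;
    an ungrouped (aggregated) result is a plain list of values.
    """
    def apply(a, b):
        return [op(x, y) for x, y in zip(a, b)]

    lhs_grouped, rhs_grouped = isinstance(lhs, dict), isinstance(rhs, dict)
    if lhs_grouped and rhs_grouped:
        # Case 1: pair up entries that share the same group key.
        return {k: apply(lhs[k], rhs[k]) for k in lhs if k in rhs}
    if lhs_grouped:
        # Case 2: apply the single right-hand series to every group on the left.
        return {k: apply(v, rhs) for k, v in lhs.items()}
    if rhs_grouped:
        # Case 2, mirrored: the single left-hand series meets every group on the right.
        return {k: apply(lhs, v) for k, v in rhs.items()}
    return apply(lhs, rhs)


errors_by_node = {"i-01": [1.0, 2.0], "i-02": [3.0, 4.0]}
requests_by_node = {"i-01": [10.0, 20.0], "i-02": [30.0, 40.0]}
total_requests = [4.0, 8.0]

per_node_ratio = binary_op(operator.truediv, errors_by_node, requests_by_node)
share_of_total = binary_op(operator.truediv, errors_by_node, total_requests)
```

`per_node_ratio` matches entries by node on both sides, while `share_of_total` divides every node's series by the same aggregate series.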
Logging
Since Atlas is not an ACID database, it does not use a transactional log. In terms of the CAP theorem its focus is on AP (availability and partition tolerance). Consistency is provided on a best-effort basis.
Query Interface
Atlas has a custom query language, called the stack language, which is URL-friendly and easy to parse. It allows concatenative query rewrites, which tools use to build generative templates around arbitrary queries.
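To illustrate why a comma-separated, stack-based syntax is easy to parse and to rewrite by concatenation, here is a toy evaluator. It handles only a handful of words (`:eq`, `:and`, `:sum`, `:max`) with simplified semantics assumed for this sketch. The `evaluate` and `rewrite` functions are hypothetical and are not the Atlas parser.

```python
def evaluate(expr):
    """Evaluate a comma-separated stack expression into nested tuples.

    Toy interpreter for illustration only; it supports just a few words.
    """
    stack = []
    for token in expr.split(","):
        if token == ":eq":
            # Pop value then key and push an equality query.
            value, key = stack.pop(), stack.pop()
            stack.append(("eq", key, value))
        elif token == ":and":
            right, left = stack.pop(), stack.pop()
            stack.append(("and", left, right))
        elif token in (":sum", ":max"):
            # Wrap the query on top of the stack in an aggregate.
            stack.append((token[1:], stack.pop()))
        elif token.startswith(":"):
            raise ValueError(f"unsupported word: {token}")
        else:
            stack.append(token)  # plain tokens are pushed as string literals
    return stack


def rewrite(expr, suffix):
    # Concatenative rewrite: extend an existing query by appending more tokens.
    return f"{expr},{suffix}"


base = "name,cpu,:eq"
restricted = rewrite(base, "node,i-01,:eq,:and")
summed = rewrite(restricted, ":sum")
result = evaluate(summed)
```

Because each rewrite only appends tokens, a tool can wrap an arbitrary user query without having to parse it first.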
Stored Procedures
Some systems built on top of Atlas implement functionality similar to stored procedures. Keeping this logic outside the database allows for greater flexibility.