DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

Atlas


Atlas is a dimensional time series database management system for near real-time operational analytics.[02]

Source Code
https://github.com/Netflix/atlas[01]
Developer
Netflix
Country of Origin
US
Start Year
2014 [03]
Project Type
Open Source
Written in
Scala

History[02]


Prior to Atlas, Netflix used an in-house tool called Epic together with a commercial system. Between 2011 and 2014, the number of metrics Netflix monitored surged from 2 million to 1.2 billion. In response to this growth, Netflix decided to build a system from scratch to suit its requirements.

Compression


Atlas has two types of data:
1) Tags - Tags are not compressed in memory, but string and set deduplication is done to avoid redundancy. Older data is stored in S3 in compressed form.
2) Time series values - These are stored in a specific block format that may be compressed depending on the data.
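The deduplication of tags can be illustrated with a minimal sketch. It assumes that "set deduplication" means sharing one copy of each distinct tag set. The `Interner` class and `intern_tags` function are hypothetical names used only for this example and are not part of Atlas.

```python
class Interner:
    """Hypothetical helper: keeps one shared copy of each distinct value."""

    def __init__(self):
        self._pool = {}

    def intern(self, value):
        # setdefault returns the already-stored object when an equal one exists.
        return self._pool.setdefault(value, value)


strings = Interner()   # shared pool of tag key and value strings
tag_sets = Interner()  # shared pool of whole tag sets


def intern_tags(tags):
    """Deduplicate each key/value string, then the tag set as a whole."""
    pairs = frozenset(
        (strings.intern(k), strings.intern(v)) for k, v in tags.items()
    )
    return tag_sets.intern(pairs)


a = intern_tags({"name": "cpu", "app": "www"})
b = intern_tags({"app": "www", "name": "cpu"})
print(a is b)  # both series share a single tag-set object
```

With many time series that repeat the same keys and values, each distinct string and each distinct tag set is held in memory only once.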

Foreign Keys


Because Atlas is more like a search engine than a relational database, it has no need for foreign keys.

Indexes


Like Lucene, Atlas treats tags as documents and builds an inverted index over them to find the set of matching time series. The key building block of the inverted index is an integer set implementation, for which a roaring bitmap is used. Once the set of matches is found, fetching the time series data is effectively a lookup into a huge id-to-time-series map.
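The lookup path described above can be sketched as follows. This is a simplified illustration, not Atlas code: the built-in Python `set` stands in for a roaring bitmap, and `TagIndex` and its methods are hypothetical names.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index: (key, value) tag pair -> set of integer series ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # a plain set stands in for a roaring bitmap
        self.series = {}                  # id -> time series values

    def add(self, series_id, tags, values):
        self.series[series_id] = values
        for pair in tags.items():
            self.postings[pair].add(series_id)

    def match(self, **wanted):
        """Ids of the series that carry every requested key=value tag."""
        sets = [self.postings.get(pair, set()) for pair in wanted.items()]
        return set.intersection(*sets) if sets else set(self.series)

    def lookup(self, **wanted):
        """Resolve matching ids through the id-to-time-series map."""
        return {sid: self.series[sid] for sid in sorted(self.match(**wanted))}


index = TagIndex()
index.add(0, {"name": "cpu", "app": "www"}, [1.0, 2.0])
index.add(1, {"name": "cpu", "app": "api"}, [3.0, 4.0])
index.add(2, {"name": "mem", "app": "www"}, [5.0, 6.0])

print(index.match(name="cpu"))
print(index.lookup(name="cpu", app="www"))
```

Matching several tags reduces to intersecting integer sets, which is why the choice of integer set implementation matters for this design.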

Joins


Atlas does not support full relational-database-style joins, but it does support simple joins in the form of binary math operations. Simple aggregates (like sum) generate a single result. If a time series expression is grouped, there are two cases: 1) If both sides are grouped (they must be grouped on the same keys), the corresponding entry on each side is found and the operation is applied to the pair. 2) If only one side is grouped, the single result from the ungrouped side is applied to each result from the grouped side.
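The two grouping cases can be sketched as below. The representation is an assumption made for illustration: a grouped result is a dict from group key to a list of values, and an ungrouped result is a single list. The function `binary_op` is hypothetical, and dropping group keys that appear on only one side is a choice of this sketch rather than documented Atlas behavior.

```python
import operator


def binary_op(op, lhs, rhs):
    """Apply op pointwise. A dict is a grouped result, a list is a single series."""

    def pointwise(a, b):
        return [op(x, y) for x, y in zip(a, b)]

    lhs_grouped, rhs_grouped = isinstance(lhs, dict), isinstance(rhs, dict)
    if lhs_grouped and rhs_grouped:
        # Case 1: both sides grouped, pair up entries that share a group key.
        # Assumption of this sketch: keys present on only one side are dropped.
        return {k: pointwise(lhs[k], rhs[k]) for k in lhs.keys() & rhs.keys()}
    if lhs_grouped:
        # Case 2: the single right-hand result is applied to every group.
        return {k: pointwise(v, rhs) for k, v in lhs.items()}
    if rhs_grouped:
        # Case 2, mirrored: the single left-hand result is applied to every group.
        return {k: pointwise(lhs, v) for k, v in rhs.items()}
    return pointwise(lhs, rhs)


errors = {"www": [1.0, 2.0], "api": [0.0, 4.0]}     # grouped by app
requests = {"www": [10.0, 20.0], "api": [5.0, 8.0]}  # grouped by app
total = [10.0, 20.0]                                 # ungrouped aggregate

per_app_ratio = binary_op(operator.truediv, errors, requests)
share_of_total = binary_op(operator.truediv, errors, total)
```

Here `per_app_ratio` divides each app's errors by the same app's requests, while `share_of_total` divides every app's errors by the one ungrouped series.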

Logging


Since Atlas is not an ACID database, it does not use a transactional log. In terms of the CAP theorem, the focus is on availability and partition tolerance (AP); consistency is provided on a best-effort basis.

Query Interface


Atlas has a custom query language, called the stack language, that is URL-friendly and easy to parse. It also allows concatenative query rewrites, which tools use to build generative templates around arbitrary queries.
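To show why a concatenative language makes rewrites easy, here is a toy evaluator for a very small subset of a comma-separated stack language. It is a sketch, not the Atlas interpreter: only the words `:eq`, `:and`, and `:sum` are handled, and both the `evaluate` function and the nested-tuple output format are assumptions made for this example.

```python
def evaluate(expr):
    """Evaluate a comma-separated stack expression into a stack of nested tuples."""
    stack = []
    for token in expr.split(","):
        if token == ":eq":
            value, key = stack.pop(), stack.pop()
            stack.append(("eq", key, value))
        elif token == ":and":
            right, left = stack.pop(), stack.pop()
            stack.append(("and", left, right))
        elif token == ":sum":
            stack.append(("sum", stack.pop()))
        elif token.startswith(":"):
            raise ValueError(f"unsupported word: {token}")
        else:
            stack.append(token)  # plain tokens are pushed as string literals
    return stack


base = "name,cpu,:eq"
# A tool can refine the query by appending tokens, without parsing `base`.
rewritten = base + ",app,www,:eq,:and,:sum"

print(evaluate(base))
print(evaluate(rewritten))
```

Because every expression simply leaves its result on the stack, a template can wrap or restrict an arbitrary user query by string concatenation alone, and the comma-separated form needs no special escaping beyond normal URL encoding.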

Storage Architecture


Stored Procedures


Some systems built on top of Atlas implement functionality similar to stored procedures. Keeping this functionality outside the database allows for greater flexibility.

Citations

4 sources
  1. Netflix/atlas: In-memory dimensional time series database. GitHub, github.com
  2. Home, Netflix/atlas Wiki. GitHub, github.com
  3. Initial commit. github.com
  4. https://github.com/Netflix/atlas/commit/f269c137c2bb56e70639bb576ab1027556c1db41
Revision #4 Last Updated: