DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

AresDB


AresDB is a GPU-based real-time analytics storage and query engine with low memory overhead, real-time upserts with primary key deduplication, and time series aggregations on both streaming and finite dimensional data.[01]

Source Code
https://github.com/uber/aresdb[02]
Country of Origin
US
Start Year
2018 [07]
Project Type
Open Source
Written in
C, C++, Go
Operating System
Linux
License
Apache v2

History[01]


Uber developed AresDB to meet its specific need "to make similar queries over relatively small, yet highly valuable, subsets of data (with maximum data freshness) at high QPS and low latency." Typical queries include time series aggregations over geofences.

Compression


Data Model


Hardware Acceleration


GPU

Joins[01]


Logging[04]


Log files contain descriptions of database upserts. After a crash, these logged upserts must be replayed to rebuild the database.
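
As a rough illustration of replay-based recovery, the sketch below rebuilds an in-memory table from a list of logged upserts. The `Upsert` record, the `replay` function, and the merge-on-primary-key semantics are assumptions made for this example; they do not reflect AresDB's actual log format or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upsert:
    """Hypothetical redo log entry: a primary key plus column values."""
    key: str
    values: dict


def replay(log):
    """Rebuild table state by applying logged upserts in log order."""
    table = {}
    for entry in log:
        # An upsert updates the existing row for this key or creates one,
        # so a key that appears several times still yields a single row.
        row = table.setdefault(entry.key, {})
        row.update(entry.values)
    return table


# Hypothetical log contents, oldest entry first.
redo_log = [
    Upsert("trip-1", {"status": "requested", "fare": 0.0}),
    Upsert("trip-2", {"status": "requested", "fare": 0.0}),
    Upsert("trip-1", {"status": "completed", "fare": 12.5}),
]
state = replay(redo_log)
```

Because entries are applied in order, the last upsert for a key determines the recovered row, which is what makes replay sufficient to reconstruct the pre-crash state in this simplified model.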

Parallel Execution[05]


Executes queries with the one operation per kernel (OOPK) model.
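
The idea behind OOPK can be sketched on the CPU: each operator in an expression runs as its own pass over whole column vectors, and the intermediate result is materialized before the next operator runs. The sketch below is plain Python standing in for GPU kernels; `launch_kernel` and the column names are hypothetical and are not part of AresDB.

```python
import operator


def launch_kernel(op, *columns):
    """Stand-in for one GPU kernel launch: apply a single operation
    element-wise across whole input vectors and materialize the result."""
    return [op(*args) for args in zip(*columns)]


# Hypothetical batch with two columns.
fare = [10.0, 4.0, 25.0, 8.0]
tip = [2.0, 0.0, 5.0, 1.0]

# Evaluate the filter (fare + tip) > 10 as two separate "kernels",
# one per operator, with an intermediate vector between them.
total = launch_kernel(operator.add, fare, tip)       # kernel 1: addition
threshold = [10.0] * len(total)
mask = launch_kernel(operator.gt, total, threshold)  # kernel 2: comparison
```

The trade-off in this style is extra memory traffic for intermediates in exchange for simple, reusable per-operator kernels.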

Query Execution


AresDB works with vector batches that are efficiently processed in parallel using the Thrust library.

Query Interface[06]


AresDB uses its own query language, Ares Query Language (AQL), in which queries are expressed in JSON.
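
To give a feel for a JSON-based query, the sketch below builds a query document as a Python dictionary and serializes it. The table name and the field names (`table`, `measures`, `dimensions`, `timeFilter`, `sqlExpression`, `timeBucketizer`) are assumptions modeled on publicly shown AQL examples and should be checked against the Ares Query Language wiki page [06] before use.

```python
import json

# Hypothetical AQL-style query: hourly trip counts over the last day.
# All field names here are assumptions, not a verified AQL schema.
query = {
    "table": "trips",
    "measures": [{"sqlExpression": "count(*)"}],
    "dimensions": [
        {"sqlExpression": "request_at", "timeBucketizer": "hour"}
    ],
    "timeFilter": {"column": "request_at", "from": "-24h"},
}

# Serialize to the JSON text a client would send to the server.
payload = json.dumps(query, indent=2)
```

Expressing queries as JSON lets clients build them programmatically, as above, instead of concatenating query strings.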

Storage Architecture


Data within the archival delay of a table is kept uncompressed in live batches, while everything else is stored in compressed archival batches. If newly ingested data falls outside the archival delay, it is added to an archival backfill queue and merged into the archived batches asynchronously.
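
The routing decision described above can be sketched as a comparison between a record's event time and the table's archival delay. The function name, the timestamps, and the one-day delay are hypothetical values chosen for illustration; the real ingestion path is more involved.

```python
from collections import deque


def route(event_time, now, archival_delay):
    """Decide where a newly ingested record goes, based on its age."""
    if now - event_time <= archival_delay:
        return "live"      # recent enough: uncompressed live batch
    return "backfill"      # too old: queue for asynchronous merge


live_batch = []
backfill_queue = deque()

NOW = 1_000_000          # hypothetical current time, in seconds
ARCHIVAL_DELAY = 86_400  # hypothetical archival delay: one day

for ts in (NOW - 60, NOW - 3_600, NOW - 200_000):
    if route(ts, NOW, ARCHIVAL_DELAY) == "live":
        live_batch.append(ts)
    else:
        backfill_queue.append(ts)
```

Late records never touch the compressed archive on the ingestion path; they only wait in the queue until a background step merges them.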

Storage Model


System Architecture[01]


The CPU is used only to load data from storage into CPU memory and to distribute it to GPU memory. The database system delegates each operator in a query to a GPU, so it can use multiple GPUs by assigning different operations to different GPUs, each of which has completely separate memory. There are plans to implement a proper distributed design, but AresDB is currently limited to a single system with multiple GPUs.

Citations

7 sources
  1. https://eng.uber.com/aresdb (uber.com; dead link, check archive)
  2. GitHub - uber/aresdb: A GPU-powered real-time analytics storage and query engine. (github.com)
  3. Home · uber/aresdb Wiki (github.com)
  4. Redo Logs · uber/aresdb Wiki (github.com)
  5. Query Execution · uber/aresdb Wiki (github.com)
  6. Ares Query Language · uber/aresdb Wiki (github.com)
  7. travis yml setup (github.com)
Revision #7 Last Updated: