DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

AresDB


AresDB is a GPU-based real-time analytics database with low memory overhead, real-time upserts with primary key deduplication, and time series aggregations on both streaming and finite dimensional data, including geofences.[01]

Source Code
https://github.com/uber/aresdb[02]
Country of Origin
US
Start Year
2018 [11]
End Year
2020
Project Type
Open Source
Written in
C, C++, Go
Operating System
Linux
License
Apache v2

History


Uber developed AresDB to replace Elasticsearch as its analytical database. Elasticsearch used inverted indexes that were not optimized for Uber's "time range-based storage and filtering," carried unnecessary overhead from using JSON for storage, and was JVM-based, meaning it "[did] not support joins and its query execution runs at a higher memory cost." Uber chose to accelerate AresDB with GPUs because it expected their higher core count, "greater computational throughput," and "greater compute-to-storage (ALU to GPU global memory) data access throughput (not latency) compared to [CPUs]" to further speed up its analytical queries.

The project was last updated in 2020 and appears to be abandoned.

Checkpoints[04]


A snapshot is triggered for a table once either a table-specific number of mutations has accumulated or a table-specific time interval has elapsed.
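
As a rough illustration of this trigger condition, the sketch below tracks a mutation counter and the time of the last snapshot for a single table. The class and field names (`SnapshotPolicy`, `max_mutations`, `max_interval_s`) are hypothetical and are not taken from the AresDB codebase.

```python
from dataclasses import dataclass


@dataclass
class SnapshotPolicy:
    """Hypothetical per-table thresholds (names are illustrative)."""
    max_mutations: int
    max_interval_s: float


@dataclass
class TableState:
    policy: SnapshotPolicy
    mutations_since_snapshot: int = 0
    last_snapshot_at: float = 0.0

    def record_mutation(self) -> None:
        self.mutations_since_snapshot += 1

    def should_snapshot(self, now: float) -> bool:
        # Either condition on its own is enough to trigger a snapshot.
        too_many = self.mutations_since_snapshot >= self.policy.max_mutations
        too_old = (now - self.last_snapshot_at) >= self.policy.max_interval_s
        return too_many or too_old

    def mark_snapshot(self, now: float) -> None:
        # Reset both counters once the snapshot has been taken.
        self.mutations_since_snapshot = 0
        self.last_snapshot_at = now
```

A caller would check `should_snapshot(now)` after each batch of upserts and call `mark_snapshot(now)` once the snapshot has been written.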

Compression[01]


AresDB compresses only columns that are part of the user-defined sort order and have low cardinality.
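
The sketch below shows why sorted, low-cardinality data compresses well: equal values end up adjacent and collapse into a few runs. The entry does not state which encoding AresDB uses, so run-length encoding is an assumption made here only for illustration.

```python
from itertools import groupby


def rle_encode(sorted_column):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(sorted_column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out
```

For a sorted column such as `[1, 1, 1, 2, 2, 7]`, the encoder stores three pairs instead of six values. An unsorted or high-cardinality column would produce close to one pair per value and gain nothing.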

Data Model[01]


Foreign Keys[05]


Hardware Acceleration[01]


GPU

AresDB uses GPUs for its query execution.

Indexes[01]


AresDB uses hash tables, primarily for primary key deduplication.
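
A minimal sketch of primary key deduplication with a hash table follows. It uses a Python `dict` as the hash table and a list of dicts as row storage. Both choices, along with the `DedupTable` name, are assumptions for illustration and do not reflect AresDB's internal layout.

```python
class DedupTable:
    """Toy table that keeps at most one row per primary key."""

    def __init__(self):
        self.pk_index = {}  # primary key -> row position (the hash table)
        self.rows = []      # row storage, one dict per record

    def upsert(self, key, values):
        pos = self.pk_index.get(key)
        if pos is None:
            # Unseen key: append a new row and remember where it lives.
            self.pk_index[key] = len(self.rows)
            self.rows.append({"pk": key, **values})
            return "inserted"
        # Known key: overwrite the supplied fields in place, no duplicate row.
        self.rows[pos].update(values)
        return "updated"
```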

Isolation Levels[06]


AresDB only provides transaction atomicity and isolation at the record level.

Joins[05][01]


AresDB supports hash joins from fact tables (infinite streaming data such as rides) to dimension tables (finite data sets such as cities). The database also supports geospatial joins (i.e. geographically bounded area overlap) and normal foreign key joins. Note that AresDB uses late materialization for its joins, meaning the join may not be executed until a foreign key is accessed.
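
The general shape of such a hash join can be sketched as follows: build a hash table over the small, finite table (cities) and probe it with each record from the streaming table (rides). The function names, the column names, and the choice of inner-join semantics are assumptions for illustration.

```python
def build_hash_table(small_table, key):
    """Index the finite table by its key column."""
    return {row[key]: row for row in small_table}


def probe(streaming_rows, hash_table, foreign_key):
    """Look up each streaming record's foreign key in the hash table."""
    for row in streaming_rows:
        match = hash_table.get(row[foreign_key])
        if match is not None:  # inner-join semantics are an assumption
            yield {**row, "joined": match}
```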

Logging[07]


Log files contain descriptions of database upserts, which must be replayed to rebuild the database after a crash.
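
The replay idea can be sketched as below: each upsert is appended to a log, and recovery re-applies the entries in order. The JSON-lines record format and the function names are assumptions. The entry does not describe the actual on-disk log format.

```python
import json


def append_upsert(log_lines, key, values):
    """Record one upsert as a serialized log entry (hypothetical format)."""
    log_lines.append(json.dumps({"key": key, "values": values}))


def replay(log_lines):
    """Rebuild table state by re-applying every logged upsert in order."""
    state = {}
    for line in log_lines:
        entry = json.loads(line)
        # Later entries for the same key overwrite earlier field values.
        state.setdefault(entry["key"], {}).update(entry["values"])
    return state
```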

Parallel Execution[08]


AresDB executes queries with the one-operator-per-kernel (OOPK) model.
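
A CPU-only analogy of the OOPK model is sketched below: each operator in an expression runs as its own pass over whole column vectors and writes an intermediate vector that the next operator consumes. On a GPU each pass would be a separate kernel launch. The plain Python lists and the function names here are illustrative assumptions.

```python
import operator


def run_kernel(op, left, right):
    """One 'kernel launch': apply a single operator element-wise."""
    return [op(l, r) for l, r in zip(left, right)]


def broadcast(value, n):
    """Turn a scalar constant into a vector of length n."""
    return [value] * n


def total_above(fare, tip, threshold):
    """Evaluate (fare + tip) > threshold, one operator per pass."""
    total = run_kernel(operator.add, fare, tip)             # pass 1: addition
    limit = broadcast(threshold, len(total))
    return run_kernel(operator.gt, total, limit)            # pass 2: comparison
```

The trade-off in this style is that every operator materializes an intermediate vector, in exchange for simple, reusable per-operator code.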

Query Compilation[01]


Query Execution[08]


AresDB works with vector batches that are efficiently processed in parallel using the Thrust library.

Query Interface[05]


AresDB uses a custom query language called Ares Query Language (AQL), which is based on the JSON format, so queries can be issued from any programming language that can produce JSON.
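
Because a query is just JSON, a client can assemble it with ordinary data structures. The sketch below builds one from Python. The field names (`queries`, `table`, `dimensions`, `measures`, `sqlExpression`, `timeFilter`) and the `trips` schema are assumptions for illustration and should be checked against the Ares Query Language wiki page [05]. No request is sent anywhere.

```python
import json


def build_aql_query(table, dimension_expr, measure_expr, time_column, time_from):
    """Assemble a JSON-serializable query; every field name is an assumption."""
    return {
        "queries": [
            {
                "table": table,
                "dimensions": [{"sqlExpression": dimension_expr}],
                "measures": [{"sqlExpression": measure_expr}],
                "timeFilter": {"column": time_column, "from": time_from},
            }
        ]
    }


# Hypothetical table and column names, serialized as a client would send them.
query = build_aql_query("trips", "city_id", "count(*)", "request_at", "-1d")
payload = json.dumps(query, indent=2)
```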

Storage Architecture[01]


Both in memory and on disk, data within a table's archival delay (a time duration specified per table) is kept uncompressed in live batches, while everything else is stored in compressed archival batches. If newly ingested data falls outside the archival delay, it is added to an archival backfill queue, and its records are merged into the archived batches asynchronously.
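
The routing decision at ingestion time can be sketched as follows. The `TableStore` class, the `ts` field (Unix seconds), and the per-day grouping of archive batches are assumptions for illustration. The explicit `drain_backfill` call stands in for the asynchronous merge.

```python
from collections import deque

SECONDS_PER_DAY = 86400


class TableStore:
    def __init__(self, archival_delay_s):
        self.archival_delay_s = archival_delay_s
        self.live_batch = []            # recent, uncompressed records
        self.backfill_queue = deque()   # late records awaiting archival merge
        self.archive_batches = {}       # UTC day number -> list of records

    def ingest(self, record, now):
        cutoff = now - self.archival_delay_s
        if record["ts"] >= cutoff:
            # Within the archival delay: goes to the live batch.
            self.live_batch.append(record)
        else:
            # Older than the cutoff: queued for backfill into the archive.
            self.backfill_queue.append(record)

    def drain_backfill(self):
        """Stand-in for the asynchronous merge into archive batches."""
        while self.backfill_queue:
            record = self.backfill_queue.popleft()
            day = record["ts"] // SECONDS_PER_DAY
            self.archive_batches.setdefault(day, []).append(record)
```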

Storage Model[01]


AresDB stores data in columnar vectors with an associated null vector and allows for partial tuple updates.
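
A minimal sketch of this layout follows: each column holds a value vector plus a parallel null vector, and an update touches only the columns it names. The class names and the use of Python lists are assumptions for illustration, not AresDB's actual structures.

```python
class Column:
    def __init__(self):
        self.values = []
        self.is_null = []  # parallel null vector: True marks a missing value

    def append(self, value=None):
        self.values.append(value)
        self.is_null.append(value is None)

    def set(self, row, value):
        self.values[row] = value
        self.is_null[row] = value is None

    def get(self, row):
        return None if self.is_null[row] else self.values[row]


class ColumnarTable:
    def __init__(self, column_names):
        self.columns = {name: Column() for name in column_names}
        self.num_rows = 0

    def append_row(self, **fields):
        # Columns not supplied are recorded as NULL for this row.
        for name, col in self.columns.items():
            col.append(fields.get(name))
        self.num_rows += 1
        return self.num_rows - 1

    def update(self, row, **fields):
        # Partial update: only the named columns are touched.
        for name, value in fields.items():
            self.columns[name].set(row, value)
```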

Storage Organization[09][10]


Archived data is sorted in a user-specified column order, and files are organized by UTC day and Unix time cutoffs.

Stored Procedures


System Architecture[01]


The CPU is used only to load data from storage into CPU memory and to distribute that data to GPU memory. The database system delegates each operator in a query to a GPU, so it can use multiple GPUs by assigning different operators to different GPUs, each of which has completely separate memory.

Views[05]


Citations

11 sources
  1. https://eng.uber.com/aresdb (uber.com; dead link, check archive)
  2. GitHub - uber/aresdb: A GPU-powered real-time analytics storage and query engine (github.com)
  3. Home · uber/aresdb Wiki (github.com)
  4. Data Snapshot · uber/aresdb Wiki (github.com)
  5. Ares Query Language · uber/aresdb Wiki (github.com)
  6. Data Ingestion · uber/aresdb Wiki (github.com)
  7. Redo Logs · uber/aresdb Wiki (github.com)
  8. Query Execution · uber/aresdb Wiki (github.com)
  9. Data Archiving · uber/aresdb Wiki (github.com)
  10. Data Layout On Disk · uber/aresdb Wiki (github.com)
  11. travis yml setup (github.com)
Revision #20 Last Updated: