Cubrick

Cubrick is a distributed, multidimensional, in-memory DBMS developed for internal use at Facebook. It is designed for low-latency, real-time OLAP over large datasets.

Compression

Dictionary Encoding

String fields in Cubrick are dictionary encoded, for both dimensions (i.e., indices) and metrics (i.e., values). Internally, Cubrick processes string fields as their encoded integers and converts them back to strings only when returning results to users.
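To illustrate the idea, here is a minimal sketch of dictionary encoding in Python. It is not Cubrick's implementation: the class name `DictionaryEncoder`, its methods, and the choice to assign codes in first-seen order are all assumptions made for the example.

```python
class DictionaryEncoder:
    """Hypothetical string dictionary: maps each distinct string to a small integer."""

    def __init__(self):
        self._codes = {}    # string -> integer code
        self._strings = []  # integer code -> string

    def encode(self, value):
        # Reuse the existing code if the string was seen before,
        # otherwise assign the next unused integer.
        code = self._codes.get(value)
        if code is None:
            code = len(self._strings)
            self._codes[value] = code
            self._strings.append(value)
        return code

    def decode(self, code):
        # Only needed when results are handed back to the user.
        return self._strings[code]


encoder = DictionaryEncoder()
encoded_column = [encoder.encode(s) for s in ["US", "BR", "US", "IN"]]
decoded_column = [encoder.decode(c) for c in encoded_column]
```

In this sketch, filters and aggregations would operate on `encoded_column`, and `decode` would be called only on the final result set.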

Cubrick also uses BESS (Bit-Encoded Sparse Structure) encoding to compress the multidimensional index of each cell (i.e., the group of metrics that share the same combination of dimension values).
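The general idea behind a bit-encoded index can be sketched as follows: each dimension coordinate gets just enough bits to cover its range, and the fields are concatenated into a single integer. The exact bit layout Cubrick uses is not described here, so the field order, the helper names (`bits_needed`, `bess_pack`, `bess_unpack`), and the per-dimension cardinalities are assumptions for illustration only.

```python
def bits_needed(cardinality):
    # Smallest number of bits that can represent values 0 .. cardinality - 1.
    return max(1, (cardinality - 1).bit_length())


def bess_pack(coords, cardinalities):
    """Pack one coordinate per dimension into a single integer."""
    packed = 0
    for coord, cardinality in zip(coords, cardinalities):
        if not 0 <= coord < cardinality:
            raise ValueError("coordinate out of range for its dimension")
        width = bits_needed(cardinality)
        packed = (packed << width) | coord
    return packed


def bess_unpack(packed, cardinalities):
    """Recover the per-dimension coordinates from a packed integer."""
    coords = []
    # Fields were appended left to right, so peel them off right to left.
    for cardinality in reversed(cardinalities):
        width = bits_needed(cardinality)
        coords.append(packed & ((1 << width) - 1))
        packed >>= width
    coords.reverse()
    return coords


cardinalities = [4, 8, 2]          # hypothetical dimension sizes: 2 + 3 + 1 bits
key = bess_pack([3, 5, 1], cardinalities)
restored = bess_unpack(key, cardinalities)
```

With these assumed cardinalities, three coordinates fit into 6 bits instead of three separate integers.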

Indexes

Hash Table

Data Model

Column Family / Wide-Column

Cubrick stores data in bricks (i.e., partitions) in a column-wise fashion. Within each brick, each column is a dynamic vector that holds either a metric or the BESS-encoded indices. Cells in a brick are unordered; during data ingestion, new cells are simply appended to the end of the brick.
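The layout described above can be sketched as an append-only, column-wise container. This is a simplified model, not Cubrick's code: the `Brick` class, its method names, and the use of plain Python lists as the dynamic vectors are assumptions for the example, and the BESS key is treated as an opaque integer.

```python
class Brick:
    """Hypothetical brick: one growable vector per column, rows kept in arrival order."""

    def __init__(self, metric_names):
        self.bess = []                                     # encoded index column
        self.metrics = {name: [] for name in metric_names} # one vector per metric

    def append(self, bess_key, values):
        # Ingestion only ever appends; existing cells are never reordered.
        if set(values) != set(self.metrics):
            raise ValueError("values must provide exactly the brick's metrics")
        self.bess.append(bess_key)
        for name, column in self.metrics.items():
            column.append(values[name])

    def scan_sum(self, metric, predicate):
        # Cells are unordered, so a query scans the whole brick
        # and filters on the encoded index.
        column = self.metrics[metric]
        return sum(v for key, v in zip(self.bess, column) if predicate(key))


brick = Brick(["likes", "comments"])
brick.append(59, {"likes": 10, "comments": 2})
brick.append(12, {"likes": 7, "comments": 0})
brick.append(59, {"likes": 5, "comments": 1})
total_likes = brick.scan_sum("likes", lambda key: key == 59)
```

Because cells are unordered, the same index value can appear more than once in this sketch, and the scan aggregates across all matching rows.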

Website

https://research.fb.com/cubrick-a-new-multidimensional-in-memory-dbms/

Developer

Facebook

Country of Origin

US

Start Year

2016

Project Type

Industrial Research

Licenses

Proprietary