Cubrick is a distributed multidimensional in-memory DBMS developed for internal use at Facebook. It is designed for low-latency realtime OLAP analysis over large datasets.
- Developer
- Country of Origin: US
- Start Year: 2016
- Project Type: Industrial Research
- License: Proprietary
Compression[02]
String fields in Cubrick are dictionary encoded, for both dimensions (i.e., indices) and metrics (i.e., values). Internally, Cubrick processes string fields as their encoded integers and converts them back to strings only when returning results to users.
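The dictionary-encoding idea can be illustrated with a minimal sketch. This is not Cubrick's implementation; the `DictionaryEncoder` class, its method names, and the sample region strings are all hypothetical, chosen only to show strings being replaced by integers during processing and restored at the end.

```python
class DictionaryEncoder:
    """Maps each distinct string to a small integer id and back (illustrative only)."""

    def __init__(self):
        self._ids = {}     # string -> integer id
        self._values = []  # integer id -> string

    def encode(self, value):
        # Assign the next free id the first time a string is seen.
        if value not in self._ids:
            self._ids[value] = len(self._values)
            self._values.append(value)
        return self._ids[value]

    def decode(self, code):
        return self._values[code]


regions = DictionaryEncoder()
encoded = [regions.encode(s) for s in ["US", "BR", "US", "CA", "BR"]]

# Processing touches only the integer codes, e.g. counting rows per region id.
counts = {}
for code in encoded:
    counts[code] = counts.get(code, 0) + 1

# Strings are restored only when the result is handed back to the caller.
result = {regions.decode(code): n for code, n in counts.items()}
```

In this sketch the grouping step never compares strings, which is the main point of working on encoded integers.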
Cubrick also uses BESS (Bit-Encoded Sparse Structure) encoding to compress the multidimensional index of each cell (i.e., the group of metrics that share the same combination of dimension values).
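As a rough illustration of bit-encoded indexing, the sketch below packs one small integer coordinate per dimension into a single integer, giving each dimension a fixed number of bits. The function names `bess_pack` and `bess_unpack`, the bit widths, and the coordinate values are assumptions made for this example; the exact layout Cubrick uses is not described here.

```python
def bess_pack(coords, bits_per_dim):
    """Concatenate per-dimension coordinates into one integer, first dimension highest."""
    packed = 0
    for coord, bits in zip(coords, bits_per_dim):
        if not 0 <= coord < (1 << bits):
            raise ValueError(f"coordinate {coord} does not fit in {bits} bits")
        packed = (packed << bits) | coord
    return packed


def bess_unpack(packed, bits_per_dim):
    """Recover the per-dimension coordinates from a packed index."""
    coords = []
    # The last dimension sits in the lowest bits, so peel dimensions off in reverse.
    for bits in reversed(bits_per_dim):
        coords.append(packed & ((1 << bits) - 1))
        packed >>= bits
    return coords[::-1]


# Hypothetical cube with three dimensions needing 3, 2 and 4 bits respectively.
BITS = [3, 2, 4]
index = bess_pack([5, 2, 9], BITS)   # 9 bits in total instead of three separate integers
coords = bess_unpack(index, BITS)
```

With dictionary-encoded dimensions, the coordinates fed to such a packer would be the integer codes rather than the original strings.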
Data Model[02]
Cubrick stores data in bricks (i.e., partitions) in a column-wise fashion. In each brick, each column has a dynamic vector that stores either the metrics or the BESS-encoded indices. Cells in a brick are unordered; during data ingestion, new cells are only appended to the end of the brick.
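The brick layout described above can be sketched as a small append-only columnar container. The `Brick` class, the metric names, and the `sum_where` helper are hypothetical and exist only to illustrate the idea of one growable vector per column with cells appended in arrival order; Python lists stand in for the dynamic vectors.

```python
class Brick:
    """Append-only, column-wise partition: one vector per column (illustrative only)."""

    def __init__(self, metric_names):
        self.bess = []                                   # packed index of each cell
        self.metrics = {name: [] for name in metric_names}

    def append(self, bess_index, **metric_values):
        if set(metric_values) != set(self.metrics):
            raise ValueError("every metric column needs exactly one value per cell")
        # New cells always go at the end; existing cells are never reordered.
        self.bess.append(bess_index)
        for name, value in metric_values.items():
            self.metrics[name].append(value)

    def __len__(self):
        return len(self.bess)

    def sum_where(self, metric, predicate):
        # Cells are unordered, so this sketch scans every cell in the brick.
        return sum(
            value
            for index, value in zip(self.bess, self.metrics[metric])
            if predicate(index)
        )


brick = Brick(["likes", "comments"])
brick.append(361, likes=10, comments=2)
brick.append(5, likes=3, comments=1)
brick.append(361, likes=4, comments=0)

total_likes = brick.sum_where("likes", lambda index: index == 361)
```

Position `i` in every column vector refers to the same cell, so a cell is reassembled by reading index `i` from each column.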