LinDB combines bitmap encoding with data compression. Because it is designed for time-series data, every data entry contains a timestamp and several multidimensional metrics, such as an IP string or a shard ID. Timestamps are stored as a base timestamp plus a bitmap, in which each bit encodes a point in time within some time period following the base timestamp. Data blocks are compressed with XOR-based encoding, modeled on Gorilla.
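To make the two ideas in this paragraph concrete, here is a minimal sketch in Python. It is not LinDB's actual on-disk format: the fixed `interval` between slots, the function names, and the use of 64-bit floats are all assumptions made for illustration. The first pair of functions packs timestamps into a base plus a bitmap; the second pair shows the XOR step that Gorilla-style compression builds on, where repeated values XOR to zero and are therefore cheap to store.

```python
import struct


def encode_timestamps(timestamps, base, interval):
    """Pack timestamps into (base, bitmap).

    Bit i of the bitmap is set when base + i * interval is present.
    `interval` is a hypothetical fixed slot width, not a LinDB setting.
    """
    bitmap = 0
    for ts in timestamps:
        offset = ts - base
        if offset < 0 or offset % interval != 0:
            raise ValueError(f"timestamp {ts} does not fall on a slot")
        bitmap |= 1 << (offset // interval)
    return base, bitmap


def decode_timestamps(base, bitmap, interval):
    """Recover the sorted list of timestamps from (base, bitmap)."""
    result = []
    slot = 0
    while bitmap:
        if bitmap & 1:
            result.append(base + slot * interval)
        bitmap >>= 1
        slot += 1
    return result


def float_bits(value):
    """Return the IEEE 754 bit pattern of a 64-bit float as an int."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def bits_float(bits):
    """Inverse of float_bits."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def xor_encode(values):
    """XOR each value's bits with the previous value's bits.

    The first value is XORed with 0, so it is stored unchanged.
    """
    previous = 0
    encoded = []
    for value in values:
        bits = float_bits(value)
        encoded.append(bits ^ previous)
        previous = bits
    return encoded


def xor_decode(encoded):
    """Undo xor_encode by accumulating the XORs."""
    previous = 0
    values = []
    for delta in encoded:
        previous ^= delta
        values.append(bits_float(previous))
    return values
```

A real encoder would go on to store only the meaningful bits of each XOR result; this sketch stops at the XOR itself to keep the idea visible.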
As in other time-series DBMSs, the data model has four main components: timestamp, metric, tags, and fields. The metric is the name of the measured attribute. Tags are combinations of measurement dimensions used at query time, and fields store the metric data. Users must declare whether each column is a tag or a field.
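The four components can be pictured as a single record. The sketch below is illustrative only: the `Point` class, its attribute names, and the sample metric are hypothetical and do not reflect LinDB's write API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Point:
    """One time-series data point (hypothetical shape, for illustration)."""
    metric: str                 # name of the measured attribute
    timestamp: int              # e.g. milliseconds since the Unix epoch
    tags: Dict[str, str] = field(default_factory=dict)      # dimensions used in queries
    fields: Dict[str, float] = field(default_factory=dict)  # the measured values


point = Point(
    metric="cpu.usage",
    timestamp=1_546_300_800_000,
    tags={"host": "10.0.0.1", "shard": "3"},
    fields={"user": 12.5, "system": 3.0},
)
```

Here `host` and `shard` are tags because a query would filter or group by them, while `user` and `system` are fields because they hold the numbers being aggregated.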
LinDB is designed to support both fast inserts and high-throughput reads, mainly aggregation operations such as GROUP BY. It also requires distributed storage because of the volume of data stored.
The storage layer stores data with write replication, and the number of shards is user-configurable. Each cluster contains several servers, and data is stored redundantly across them.
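One way to picture shards and redundant storage is the sketch below. The placement policy is an assumption chosen for simplicity (hash a series key to a shard, then place replicas on consecutive servers); the source does not describe how LinDB actually assigns shards or replicas, and every name here is hypothetical.

```python
import zlib


def shard_for(series_key, num_shards):
    """Map a series key to a shard id with a stable hash (assumed policy)."""
    return zlib.crc32(series_key.encode("utf-8")) % num_shards


def replicas_for(shard_id, servers, replication_factor):
    """Pick `replication_factor` distinct servers for a shard, round-robin."""
    if replication_factor > len(servers):
        raise ValueError("replication factor exceeds number of servers")
    return [
        servers[(shard_id + i) % len(servers)]
        for i in range(replication_factor)
    ]


servers = ["storage-1", "storage-2", "storage-3"]
shard = shard_for("cpu.usage,host=10.0.0.1", num_shards=4)
owners = replicas_for(shard, servers, replication_factor=2)
```

With this policy, losing any single server still leaves one copy of every shard available, which is the point of storing data redundantly.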
Users interact with the DBMS through brokers. A broker is responsible for receiving query requests, parsing queries, dispatching child execution plans to the storage layer, and aggregating the results if needed.
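The final aggregation step can be sketched as follows. This assumes each storage node returns a partial result per group as a `(sum, count)` pair, which is a common way to make averages mergeable; the function names and the result shape are hypothetical, not LinDB's internal protocol.

```python
from collections import defaultdict


def merge_partials(partials):
    """Combine per-node partial aggregates into one result per group.

    Each element of `partials` maps a group key to a (sum, count) pair.
    """
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for group, (total, count) in partial.items():
            merged[group][0] += total
            merged[group][1] += count
    return {group: (total, count) for group, (total, count) in merged.items()}


def finalize_avg(merged):
    """Turn merged (sum, count) pairs into averages, skipping empty groups."""
    return {group: total / count for group, (total, count) in merged.items() if count}


# Partial results as two storage nodes might return them for a GROUP BY host query.
node_results = [
    {"host-a": (10.0, 2), "host-b": (4.0, 1)},
    {"host-a": (5.0, 1)},
]
merged = merge_partials(node_results)
averages = finalize_avg(merged)
```

Carrying sums and counts separately matters because averaging the per-node averages would give the wrong answer whenever nodes hold different numbers of points.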
In addition, cluster metadata for the broker and storage layers is stored in etcd, an external dependency that serves as a distributed configuration service for the Go version. The internal Java version of LinDB uses ZooKeeper for the same purpose.
https://github.com/eleme/lindb
https://github.com/eleme/lindb/wiki
ELEME Inc.
2019