LinDB is a distributed time-series DBMS written in Go. It is optimized for real-time writes and queries and is used internally by ELEME Inc. The internal version is Java-based, whereas the open-source version is a redesign in Go. The open-source version stores cluster metadata in etcd, while the internal Java version uses ZooKeeper.[01]
- Source Code: https://github.com/lindb/lindb[01]
- Developer
- Country of Origin: CN
- Start Year: 2019[13]
- Project Type: Open Source
- Written in: Go
- Supported Languages: SQL
- Embeds / Uses: etcd
- Inspired By: InfluxDB
- License: Apache v2
History[03]
Companies like ELEME Inc. need to monitor all of their systems, which calls for a DBMS capable of storing multi-dimensional monitoring metrics at a scale of several PB per day. The company previously used Graphite, but its performance degrades as the number of metric dimensions increases.
LinDB was developed to address this demand. The project was launched internally in 2016 and has gone through three versions. The open-source version is a redesign and rewrite in Go that began in 2019; it is not yet at production stage.
Compression[04][03][05]
LinDB combines bitmap encoding with data compression. Because it is designed for time-series data, every data entry contains a timestamp and several multi-dimensional attributes, such as an IP string or a shard ID. Timestamps are stored as a base timestamp plus a bitmap in which each bit encodes one time slot within a period following the base timestamp. Data blocks are compressed with XOR encoding, following Gorilla.
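To make the two ideas concrete, here is a minimal Go sketch. It is not LinDB's actual storage code: the function names, the 64-slot block size, and the fixed slot interval are all assumptions chosen for illustration. It also shows only the XOR step of the Gorilla approach; the real scheme additionally packs the leading and trailing zero counts into a variable-width bit stream.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// encodeTimestamps packs timestamps into a bitmap relative to base.
// Bit i is set when a point falls into the slot starting at base + i*interval.
// Assumption: one block holds at most 64 slots; other timestamps are skipped.
func encodeTimestamps(base, interval int64, timestamps []int64) uint64 {
	var bitmap uint64
	for _, ts := range timestamps {
		slot := (ts - base) / interval
		if ts >= base && slot < 64 {
			bitmap |= uint64(1) << uint(slot)
		}
	}
	return bitmap
}

// decodeTimestamps expands the bitmap back into slot start timestamps.
func decodeTimestamps(base, interval int64, bitmap uint64) []int64 {
	var out []int64
	for bitmap != 0 {
		slot := bits.TrailingZeros64(bitmap)
		out = append(out, base+int64(slot)*interval)
		bitmap &= bitmap - 1 // clear the lowest set bit
	}
	return out
}

// xorWithPrevious XORs the IEEE 754 bits of each value with those of the
// previous value. Repeated or slowly changing values yield words that are
// zero or mostly zero, which a bit-level encoder can store compactly.
func xorWithPrevious(values []float64) []uint64 {
	out := make([]uint64, len(values))
	var prev uint64
	for i, v := range values {
		cur := math.Float64bits(v)
		out[i] = cur ^ prev
		prev = cur
	}
	return out
}

func main() {
	bitmap := encodeTimestamps(1000, 10, []int64{1000, 1020, 1050})
	fmt.Printf("bitmap: %b\n", bitmap)
	fmt.Println("decoded:", decodeTimestamps(1000, 10, bitmap))
	fmt.Println("xor words:", xorWithPrevious([]float64{1.5, 1.5, 2.0}))
}
```

With a base and a bitmap, a block of up to 64 timestamps costs two machine words regardless of how many points it holds, which is the appeal of this layout for regularly sampled metrics.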
Concurrency Control[03]
Concurrency control is not supported. Each write is first routed to a specific shard based on its tagKey and tagValue. Within each shard, all writes are handled by a single thread.
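The routing and single-writer pattern described here can be sketched in Go as follows. This is an illustration under assumptions, not LinDB's implementation: the `point` type, the `shardFor` and `writeAll` helpers, and the choice of an FNV-1a hash are all hypothetical. One goroutine per shard stands in for the single writer thread.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// point is a hypothetical write request carrying one tag pair and a value.
type point struct {
	tagKey, tagValue string
	value            float64
}

// shardFor maps a tag pair to a shard index. FNV-1a is an assumption here;
// any stable hash would serve the same purpose.
func shardFor(tagKey, tagValue string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(tagKey))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") differ
	h.Write([]byte(tagValue))
	return int(h.Sum32() % uint32(numShards))
}

// writeAll routes every point to its shard. Each shard is owned by exactly
// one goroutine, so appends within a shard need no locking.
func writeAll(points []point, numShards int) [][]point {
	stored := make([][]point, numShards)
	inboxes := make([]chan point, numShards)
	var wg sync.WaitGroup
	for i := range inboxes {
		inboxes[i] = make(chan point, 16)
		wg.Add(1)
		go func(id int, in <-chan point) {
			defer wg.Done()
			for p := range in {
				stored[id] = append(stored[id], p) // only this goroutine touches stored[id]
			}
		}(i, inboxes[i])
	}
	for _, p := range points {
		inboxes[shardFor(p.tagKey, p.tagValue, numShards)] <- p
	}
	for _, ch := range inboxes {
		close(ch)
	}
	wg.Wait()
	return stored
}

func main() {
	points := []point{
		{"host", "10.0.0.1", 0.7},
		{"host", "10.0.0.2", 0.4},
		{"host", "10.0.0.1", 0.9},
	}
	for id, pts := range writeAll(points, 4) {
		fmt.Printf("shard %d: %d points\n", id, len(pts))
	}
}
```

Because writes for the same tag pair always reach the same single writer, they are applied in arrival order without locks, which is how a design like this can do without a separate concurrency control scheme.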
Data Model[06][07][08]
Like other time-series DBMSs, the data model has four main components: timestamp, metric, tags, and fields. The metric is the name of the measured attribute. Tags are the combinations of measurement dimensions used in queries, and fields store the metric data. Users must define whether each column is a tag or a field.
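As a rough illustration of this model, the Go sketch below represents one data point with its four components. The `Point` type, the `SeriesKey` helper, the key format, and the use of `float64` for field values are assumptions made for the example; they are not taken from LinDB's source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Point is a hypothetical representation of one time-series entry.
type Point struct {
	Metric    string             // name of the measurement
	Timestamp int64              // Unix time in milliseconds (assumed unit)
	Tags      map[string]string  // dimensions used in queries
	Fields    map[string]float64 // the measured values
}

// SeriesKey builds a canonical identifier from the metric name and the
// sorted tag pairs, so points with the same tags map to the same series.
func (p Point) SeriesKey() string {
	keys := make([]string, 0, len(p.Tags))
	for k := range p.Tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(p.Metric)
	for _, k := range keys {
		b.WriteString(",")
		b.WriteString(k + "=" + p.Tags[k])
	}
	return b.String()
}

func main() {
	p := Point{
		Metric:    "cpu.load",
		Timestamp: 1546300800000,
		Tags:      map[string]string{"host": "10.0.0.1", "dc": "sh"},
		Fields:    map[string]float64{"value": 0.73},
	}
	fmt.Println(p.SeriesKey(), p.Fields)
}
```

Sorting the tag keys makes the series identifier independent of map iteration order, so the tag/field split declared by the user fully determines which series a point belongs to.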
System Architecture[10][11][03][12]
LinDB is designed to support both fast inserts and high-throughput reads, mainly aggregation operations such as GROUP BY. It also needs distributed storage because of the volume of data it stores.
The storage layer stores data with write replication, and the number of shards is user-configurable. Each cluster contains several servers, and data is stored redundantly across them.
Users interact with the DBMS through brokers. A broker receives query requests, parses the queries, dispatches child execution plans to the storage layer, and aggregates the results if needed.
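The broker's dispatch-then-aggregate role can be sketched as a scatter-gather step in Go. Everything named here (`storageNode`, `row`, `partial`, `brokerSum`) is hypothetical, the storage nodes are in-process values rather than remote servers, and only a grouped sum is shown; the sketch illustrates the shape of the flow, not LinDB's query engine.

```go
package main

import (
	"fmt"
	"sync"
)

// row is one stored value together with the group it belongs to.
type row struct {
	group string
	value float64
}

// partial holds per-group partial sums computed by one storage node.
type partial map[string]float64

// storageNode stands in for a storage server holding part of the data.
type storageNode struct {
	rows []row
}

// execute runs the child plan locally: a grouped sum over the node's rows.
func (n storageNode) execute() partial {
	out := partial{}
	for _, r := range n.rows {
		out[r.group] += r.value
	}
	return out
}

// brokerSum sends the child plan to every node in parallel and merges the
// partial results into the final grouped sums.
func brokerSum(nodes []storageNode) partial {
	results := make([]partial, len(nodes))
	var wg sync.WaitGroup
	for i, n := range nodes {
		wg.Add(1)
		go func(i int, n storageNode) {
			defer wg.Done()
			results[i] = n.execute() // each goroutine writes its own slot
		}(i, n)
	}
	wg.Wait()

	merged := partial{}
	for _, res := range results {
		for g, v := range res {
			merged[g] += v
		}
	}
	return merged
}

func main() {
	nodes := []storageNode{
		{rows: []row{{"host-a", 1}, {"host-b", 2}}},
		{rows: []row{{"host-a", 3}}},
	}
	fmt.Println(brokerSum(nodes))
}
```

Sums merge cleanly because they are decomposable; an aggregate such as an average would need each node to return both a sum and a count so the broker can combine them correctly.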
In addition, the cluster metadata for the broker layer and the storage layer is stored in etcd, an external dependency. etcd is a distributed key-value store written in Go, and it serves here as a distributed configuration service. The internal Java version of LinDB uses ZooKeeper for the same purpose.
Citations
- [01] GitHub - lindb/lindb: LinDB is a scalable, high performance, high availability distributed time series database. (github.com)
- [02] Home · lindb/lindb Wiki (github.com)
- [03] https://zhuanlan.zhihu.com/p/35998778
- [04] http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
- [05] lindb/tsdb/memdb/metric_store_index.go at 438001fde0c8bca73f121721e6347e123250731a · lindb/lindb (github.com)
- [06] https://zhuanlan.zhihu.com/p/36804890
- [07] lindb/series/field at b245eee881603615a33309a3bd1f7733b383ab81 · lindb/lindb (github.com)
- [08] lindb/tsdb/memdb at 438001fde0c8bca73f121721e6347e123250731a · lindb/lindb (github.com)
- [09] https://github.com/lindb/lindb/tree/develop/sql/grammar
- [10] https://lindb.io/docs/design/architecture.html#overview
- [11] GitHub - etcd-io/etcd: Distributed reliable key-value store for the most critical data of a distributed system (github.com)
- [12] lindb/pkg/state at a3572de1b43d0136aa4be2e089f9c86fb3b76742 · lindb/lindb (github.com)
- [13] Initial commit (github.com)