DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

TiKV


TiKV is an open source distributed Key-Value database which is based on the design of Google Spanner and HBase, but it is much simpler without dependency on any distributed file system. It's has primary features including Geo-Replication, Horizontal scalability, Consistent distributed transactions, Coprocessor support.

Source Code
https://github.com/tikv/tikv[01]
Developer
Country of Origin
CN
Start Year
2016
Project Type
Open Source
Written in
Rust
Supported Languages
Go, Java
Derived From
RocksDB
Operating System
Linux
License
Apache v2

Database Entry

TiKV


TiKV is an open source distributed Key-Value database which is based on the design of Google Spanner and HBase, but it is much simpler without dependency on any distributed file system. It's has primary features including Geo-Replication, Horizontal scalability, Consistent distributed transactions, Coprocessor support.

History


Inspired by Google Spanner, PingCAP started developing TiKV in 2015, and released the first version of TiKV along with TiDB in 2016. Up to April 27, 2018, PingCAP has released TiDB/TiKV 2.0, which has a lot new features and great performance gains compared to 1.0.

Checkpoints


Didn't find much information about this.

Concurrency Control[03]


TiKV has a Timestamp Oracle(TSO) to provide globally unique timestamp. The core transaction model of TiKV is called 2-Phase Commit powered by MVCC. There are two stages within each transaction:

  • PreWrite:
  • Create a startTS timestamp. Select one row as the primary row and others as secondary rows.
  • Check whether there are locks on this row or whether there are commits after the startTS. If conflicts exists, the transaction will be rollback. If not, lock the row.
  • Repeat the second step on other rows.
  • Commit:
  • Write to the CF_WRITE with current timestamp commitTS.
  • Release all the locks.

Data Model[04]


TiKV uses Key-Value model and the Key-Value pairs are ordered according to the Key's binary sequence.

Foreign Keys


Simple K-V store does not support foreign keys.

Isolation Levels[05]


TiDB/TiKV uses the Percolator transaction model. The default isolation level in TiKV is Repeatable Read. When a transaction starts, there will be a global read timestamp; when a transaction commits, there will be a global commit timestamp. The execution order of transactions is confirmed based on the timestamps. The underlying details can be found in the Concurrency Control section.

Joins


Join can be implemented in application level.

Logging


TiKV uses Raft to replicate data and each data change will be recorded as a Raft log. Through the log replication function of Raft, data is safely and reliably synchronized to multiple nodes of the Raft group

Query Compilation


Query Execution


Didn't find much information on it.

Query Interface[06]


TiKV support queries such as simple Key-Value, transactional Key-Value and push-down. But no matter it’s transactional Key-Value or push-down, it will be transformed to simple Key-Value operations in TiKV.

Storage Architecture[07]


TiKV does not write data to disk directly, instead it uses RocksDB as it underlying storage engine, where RocksDB can be regard as a standalone Key Value Map.

Storage Model[08]


TiKV uses Key-Value model and the Key-Value pairs are ordered according to the Key's binary sequence.

Stored Procedures[09]


System Architecture[10][11]


TiKV is built on top of RocksDB, where all data in a TiKV node shares two RocksDB instances. One is for data, and the other is for Raft log. There are some major components in TiKV:

  • Placement Driver (PD): Manages the metadata about Nodes, Stores, Regions mapping, and makes decisions for data placement and load balancing.
  • Node: A physical node in the cluster. Each node contains one or more Stores.
  • Store: Stores data in local disks using RocksDB. Each store contains one or more regions.
  • Region: The basic unit of Key-Value data movement and corresponds to a data range in a Store. Each Region is replicated to multiple Nodes and form a Raft group. A replica of a Region is called a Peer.

Views[09]


Compatible Systems
SurrealDB SurrealDB
Derivative Systems
HoraeDB HoraeDB

Citations

11 sources
  1. https://github.com/tikv/tikv github.com
  2. https://www.pingcap.com/docs pingcap.com Dead — Check Archive
  3. Rust in TiKV | TiDB pingcap.com
  4. TiDB Internal (I) - Data Storage | TiDB pingcap.com
  5. https://www.pingcap.com/docs/sql/transaction-isolation#tidb-transaction-isolation-levels pingcap.com Dead — Check Archive
  6. Rust in TiKV | TiDB pingcap.com
  7. TiDB Internal (I) - Data Storage | TiDB pingcap.com
  8. Rust in TiKV | TiDB pingcap.com
  9. https://www.pingcap.com/docs/sql/mysql-compatibility#unsupported-features pingcap.com Dead — Check Archive
  10. Rust in TiKV | TiDB pingcap.com
  11. Rust in TiKV | TiDB pingcap.com
Revision #7 Last Updated: