TiDB

Viewing Revision #20 from 2022-04-02 18:22

TiDB is an open-source distributed database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL-compatible and features horizontal scalability, strong consistency, and high availability. The goal of TiDB is to provide users with a one-stop database solution that covers OLTP (Online Transactional Processing) and real-time analytics. TiDB is suitable for use cases that require high availability, strong consistency, and real-time analytics over large-scale data.[01]

Website: https://www.pingcap.com[01]
Source Code: https://github.com/pingcap/tidb[02] Accessed: Jul 22, 2026 Last Commit: Jul 22, 2026
Tech Docs: https://docs.pingcap.com[03]
Developer: PingCAP
Country of Origin: CN
Start Year: 2015 [02]
Project Types: Commercial, Open Source
Written in: Go
Supported Languages: C, C++, Cocoa, D, Eiffel, Erlang, Go, Haskell, Java, Lua, Ocaml, Perl, PHP, Python, Ruby, Scheme, SQL, Tcl
Embeds / Uses: TiKV
Inspired By: Cloud Spanner
Compatible With: MySQL
Operating Systems: Hosted, Linux
License: Apache v2
Twitter: @PingCAP[05]
Wikipedia: https://en.wikipedia.org/wiki/TiDB[04]

Derivative Systems

PranaDB

History[06]

TiDB is inspired by the design of Google F1 and Google Spanner, and it supports features like horizontal scalability, strong consistency, and high availability.

Checkpoints[07]

Non-Blocking Consistent

TiDB provides consistent checkpoints without blocking. Users can start a transaction and dump all the data from any table. TiDB also provides a way to read consistent data from historical versions: the tidb_snapshot system variable was introduced to support reading historical data.
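
As a rough illustration of a historical read, the sketch below builds the statement sequence a client session might send: set tidb_snapshot to a point in time, run the query, then reset the variable to an empty string to return to the latest data. The helper name snapshot_read_statements and the table name t are hypothetical, and the statements are only printed here, not sent to a server.

```python
from datetime import datetime


def snapshot_read_statements(as_of: datetime, query: str) -> list:
    """Return the SQL statements for one historical read (hypothetical helper)."""
    ts = as_of.strftime("%Y-%m-%d %H:%M:%S")
    return [
        f"SET @@tidb_snapshot = '{ts}'",  # pin this session's reads to a point in time
        query,                            # runs against data as of that time
        "SET @@tidb_snapshot = ''",       # go back to reading the latest data
    ]


# Assumed example: table `t` and the timestamp are placeholders.
stmts = snapshot_read_statements(datetime(2022, 4, 2, 18, 22, 0), "SELECT * FROM t")
for s in stmts:
    print(s)
```

A historical read only works while the requested version still exists, so the chosen time must be newer than the point up to which old versions have already been garbage collected.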

Concurrency Control[08]

Multi-version Concurrency Control (MVCC)

Historical versions of data are kept because each update or removal creates a new version of the data object instead of modifying or removing the object in place. Not all versions are kept, however. Versions older than a specific time are removed completely to reduce storage occupancy and the performance overhead caused by too many historical versions. In TiDB, Garbage Collection (GC) runs periodically to remove obsolete data versions. GC is triggered as follows: a gc_worker goroutine runs in the background of each TiDB server. In a cluster with multiple TiDB servers, one of the gc_worker goroutines is automatically elected leader. The leader is responsible for maintaining the GC state and sending GC commands to each TiKV Region leader.
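
The versioning and GC behavior can be sketched with a toy in-memory store. This is a minimal single-process illustration, not TiDB's implementation: the class MVCCStore, the TOMBSTONE marker, and the safe_point parameter are names chosen for this example, and timestamps are plain integers.

```python
from collections import defaultdict

TOMBSTONE = object()  # a deletion is stored as one more version


class MVCCStore:
    def __init__(self):
        # key -> list of (commit_ts, value), appended in increasing ts order
        self.versions = defaultdict(list)

    def put(self, key, value, ts):
        self.versions[key].append((ts, value))

    def delete(self, key, ts):
        self.versions[key].append((ts, TOMBSTONE))

    def get(self, key, ts):
        """Return the newest value with commit_ts <= ts, or None."""
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return None if value is TOMBSTONE else value
        return None

    def gc(self, safe_point):
        """Drop versions that no reader at or after safe_point can see."""
        for key in list(self.versions):
            chain = self.versions[key]
            old = [i for i, (ts, _) in enumerate(chain) if ts <= safe_point]
            if not old:
                continue  # nothing at or before the safe point
            chain = chain[old[-1]:]  # newest old version plus everything newer
            if len(chain) == 1 and chain[0][1] is TOMBSTONE:
                del self.versions[key]  # only a stale deletion marker is left
            else:
                self.versions[key] = chain


store = MVCCStore()
store.put("a", 1, ts=10)
store.put("a", 2, ts=20)
store.delete("a", ts=30)
before_gc = (store.get("a", 15), store.get("a", 25), store.get("a", 35))
store.gc(safe_point=20)
after_gc = (store.get("a", 15), store.get("a", 25), store.get("a", 35))
```

Before GC, the reads at timestamps 15, 25, and 35 return 1, 2, and None. After GC with a safe point of 20, the version written at timestamp 10 is gone, so the read at 15 returns None while the other two reads are unchanged.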

Data Model[09]

Key-Value

TiDB uses TiKV as the underlying data storage engine. TiKV uses the Key-Value model and can be seen as a huge distributed ordered map with high performance and reliability.
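
To make the ordered-map view concrete, here is a small single-process stand-in that supports point lookups and range scans over sorted byte-string keys. The class OrderedKV and the key layout in the usage lines are assumptions made for illustration; they are not TiKV's API or its actual key encoding.

```python
import bisect


class OrderedKV:
    """Single-process stand-in for an ordered key-value map (illustration only)."""

    def __init__(self):
        self._keys = []    # sorted list of byte-string keys
        self._values = {}  # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep keys sorted on insert
        self._values[key] = value

    def get(self, key: bytes):
        return self._values.get(key)

    def scan(self, start: bytes, end: bytes):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


kv = OrderedKV()
kv.put(b"t2_r001", b"carol")   # hypothetical "table 2, row 1"
kv.put(b"t1_r002", b"bob")     # hypothetical "table 1, row 2"
kv.put(b"t1_r001", b"alice")   # hypothetical "table 1, row 1"
table1 = list(kv.scan(b"t1_", b"t2_"))
```

Because keys are kept in sorted order, all entries that share a prefix sit next to each other, so reading one logical table becomes a single range scan.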

Indexes[10]

TiDB uses TiKV, an open-source distributed transactional key-value store, to implement indexes. Logically, TiKV can be treated as a giant ordered map. Within a single TiKV instance, RocksDB serves as the embedded Key-Value engine. RocksDB uses an LSM-tree as its storage data structure.
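
The LSM-tree idea can be sketched in a few lines: writes go to an in-memory table, full tables are flushed as immutable sorted runs, reads check the newest data first, and compaction merges runs. This is a heavily simplified illustration under assumed names (TinyLSM, memtable_limit); it omits the write-ahead log, levels, and filters that a real engine such as RocksDB has.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}  # newest writes, mutable
        self.runs = []      # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Turn the memtable into an immutable sorted run."""
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping the newest value per key."""
        merged = {}
        for run in reversed(self.runs):  # oldest first, newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


lsm = TinyLSM(memtable_limit=2)
lsm.put("a", 1)
lsm.put("b", 2)   # memtable is full, flushed as the first run
lsm.put("a", 3)
lsm.put("c", 4)   # flushed as a second, newer run
runs_before = len(lsm.runs)
lsm.compact()
```

In this sketch an update never rewrites an existing run. The newer value for key "a" simply lives in a newer run and shadows the older one until compaction discards the stale entry.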

Isolation Levels[11]

Read Committed, Repeatable Read

TiDB uses the Percolator transaction model. A global read timestamp is obtained when the transaction starts, and a global commit timestamp is obtained when the transaction commits. The execution order of transactions is determined by these timestamps. Repeatable Read is the default transaction isolation level in TiDB.
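
The role of the two timestamps can be illustrated with a toy optimistic transaction layer. This sketch is an assumption-laden simplification: it keeps only the start and commit timestamps and a write-write conflict check, and leaves out the locks and two-phase commit of the real Percolator protocol. All class names (Oracle, Database, Txn, ConflictError) are invented for the example.

```python
import itertools


class Oracle:
    """Hands out strictly increasing timestamps (stand-in for a global timestamp service)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)


class ConflictError(Exception):
    pass


class Database:
    def __init__(self):
        self.oracle = Oracle()
        self.committed = {}  # key -> list of (commit_ts, value), ascending

    def begin(self):
        return Txn(self, self.oracle.next())  # start_ts fixes the read snapshot


class Txn:
    def __init__(self, db, start_ts):
        self.db, self.start_ts, self.writes = db, start_ts, {}

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        for commit_ts, value in reversed(self.db.committed.get(key, [])):
            if commit_ts <= self.start_ts:  # only versions in the snapshot
                return value
        return None

    def put(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        for key in self.writes:
            chain = self.db.committed.get(key, [])
            if chain and chain[-1][0] > self.start_ts:
                raise ConflictError(key)  # someone committed after our snapshot
        commit_ts = self.db.oracle.next()
        for key, value in self.writes.items():
            self.db.committed.setdefault(key, []).append((commit_ts, value))
        return commit_ts


db = Database()
setup = db.begin()
setup.put("x", 1)
setup.commit()

t1 = db.begin()
t2 = db.begin()
t2.put("x", 2)
t2.commit()
seen_by_t1 = t1.get("x")  # t1 still reads from its own snapshot
t1.put("x", 3)
try:
    t1.commit()
    conflicted = False
except ConflictError:
    conflicted = True
```

Here t1 keeps reading the value 1 even after t2 commits, which is the repeatable-read behavior, and its later write to the same key is rejected because a newer committed version exists.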

Joins[12][13]

Hash Join

TiDB’s SQL layer currently supports three types of distributed join: hash join, sort merge join, and index lookup join. The optimizer chooses sort merge join when it estimates that even the smallest table is too large to fit in memory and the predicates contain indexed columns. With the columnar storage engine TiFlash, TiDB supports two more join algorithms: Broadcast Hash Join and Shuffled Hash Join.
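
For readers unfamiliar with the hash join technique, a minimal single-node sketch follows: build a hash table on one input, then probe it with each row of the other. The function hash_join and the sample tables are hypothetical, and the sketch says nothing about how TiDB distributes the work across nodes.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner equi-join: build a hash table on one side, probe with the other."""
    table = defaultdict(list)
    for row in build_rows:                 # build phase (ideally the smaller input)
        table[row[build_key]].append(row)
    for row in probe_rows:                 # probe phase
        for match in table.get(row[probe_key], []):
            yield {**match, **row}


users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"order_id": 10, "user_id": 1},
    {"order_id": 11, "user_id": 1},
    {"order_id": 12, "user_id": 3},  # no matching user, dropped by the inner join
]
joined = list(hash_join(users, orders, "user_id", "user_id"))
```

The build side must fit in memory for this simple form to work, which is consistent with the optimizer preferring a different algorithm when even the smallest table is too large.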

Logging[14]

Physical Logging

TiDB uses the Raft consensus algorithm for replication, so it maintains a Raft log. TiDB also provides binlog to export data from the TiDB cluster.
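
One piece of Raft log handling that is easy to show in isolation is how a leader decides which log entries are committed: an entry counts as committed once a majority of replicas store it. The function below is a simplified sketch under that rule only. It assumes match_index lists the last stored log index for every replica including the leader, and it omits the additional check in Raft that the entry belongs to the leader's current term.

```python
def commit_index(match_index):
    """Highest log index that is stored on a majority of replicas.

    match_index: last log index known to be stored on each replica,
    leader included (an assumption of this sketch).
    """
    ordered = sorted(match_index, reverse=True)
    majority = len(ordered) // 2 + 1
    # The majority-th largest value is held by at least `majority` replicas.
    return ordered[majority - 1]


three_replicas = commit_index([5, 3, 4])
five_replicas = commit_index([7, 7, 5, 4, 2])
```

With three replicas at indexes 5, 3, and 4, entries up to index 4 are on two replicas and therefore committed. With five replicas at 7, 7, 5, 4, and 2, the commit index is 5.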

Query Compilation

Not Supported

Query Execution

Tuple-at-a-Time Model, Vectorized Model

In most cases, TiDB processes data tuple by tuple. But in some cases, TiDB uses vectorized execution.
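
The difference between the two execution models can be sketched with a filter operator written both ways. This is an illustrative sketch only: the function names and the price column are hypothetical, and pure Python shows the shape of the two interfaces rather than the performance benefit that motivates vectorized execution.

```python
def filter_tuple_at_a_time(rows, threshold):
    """Consume and emit one row per step."""
    for row in rows:
        if row["price"] > threshold:
            yield row


def filter_vectorized(batches, threshold):
    """Consume and emit whole column batches (dict of column name -> list)."""
    for batch in batches:
        selected = [i for i, p in enumerate(batch["price"]) if p > threshold]
        if selected:
            yield {col: [vals[i] for i in selected] for col, vals in batch.items()}


rows = [{"id": 1, "price": 5}, {"id": 2, "price": 15}, {"id": 3, "price": 25}]
batch = {"id": [1, 2, 3], "price": [5, 15, 25]}

row_result = list(filter_tuple_at_a_time(rows, threshold=10))
batch_result = list(filter_vectorized([batch], threshold=10))
```

Both versions select the same two records. The vectorized one evaluates the predicate over a whole column at once and passes a batch to the next operator, which reduces per-row call overhead in a real engine.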

Query Interface[15]

SQL

TiDB supports SQL and the MySQL dialect.

Storage Architecture[10]

Disk-oriented

Any durable storage engine stores data on disk, and TiKV is no exception. However, TiKV does not write data to disk directly. Instead, it stores data in RocksDB, which is responsible for persisting it. The reason is that developing a standalone storage engine, especially a high-performance one, is costly.

Storage Model[16][10]

Custom

TiDB stores its data in the distributed key-value storage engine TiKV. TiFlash is an extension of TiKV that stores data in a columnar format to accelerate analytical workloads.

Storage Organization

Log-structured

Stored Procedures[17]

Not Supported

System Architecture[18]

Shared-Nothing

The TiDB cluster has four components: the TiDB server, the PD server, the TiKV server, and the TiFlash server.
- The TiDB server is stateless. It does not store data and is used for computing only. It is horizontally scalable and exposes a unified interface to the outside through load-balancing components such as Linux Virtual Server (LVS), HAProxy, or F5.
- The Placement Driver (PD) server is the managing component of the entire cluster.
- The TiKV server is responsible for storing data. From an external view, TiKV is a distributed transactional Key-Value storage engine. The Region is the basic unit of data storage. Each Region stores the data for a particular key range, which is a left-closed, right-open interval from StartKey to EndKey. Each TiKV node holds multiple Regions. TiKV uses the Raft protocol for replication to ensure data consistency and disaster recovery. The replicas of the same Region on different nodes form a Raft Group. Load balancing of data among TiKV nodes is scheduled by PD, and the Region is also the basic unit of that scheduling.
- The TiFlash server is a special type of storage server. Unlike ordinary TiKV nodes, TiFlash stores data by column and is mainly designed to accelerate analytical processing.
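
The left-closed, right-open key ranges of Regions lend themselves to a short routing sketch: given a key, find the Region whose range contains it. The Region dataclass, the RegionRouter class, and the sample ranges are hypothetical, and treating an empty end key as "no upper bound" is a convention of this sketch.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Region:
    region_id: int
    start_key: bytes  # inclusive
    end_key: bytes    # exclusive; b"" means no upper bound in this sketch


class RegionRouter:
    def __init__(self, regions):
        self.regions = sorted(regions, key=lambda r: r.start_key)
        self.starts = [r.start_key for r in self.regions]

    def locate(self, key: bytes) -> Region:
        # Last region whose start_key <= key.
        i = bisect.bisect_right(self.starts, key) - 1
        if i < 0:
            raise KeyError(key)
        region = self.regions[i]
        if region.end_key != b"" and key >= region.end_key:
            raise KeyError(key)  # key falls in a gap between regions
        return region


router = RegionRouter([
    Region(1, b"", b"g"),
    Region(2, b"g", b"p"),
    Region(3, b"p", b""),
])
located = [router.locate(k).region_id for k in (b"apple", b"g", b"ozone", b"p", b"zebra")]
```

The key b"g" lands in Region 2 rather than Region 1 because the start of a range is inclusive and the end is exclusive, so adjacent Regions never overlap.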

Views[19]

Virtual Views

TiDB supports Views. Views in TiDB are non-materialized. This means that as a view is queried, TiDB will internally rewrite the query to combine the view definition with the SQL query.
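
The behavior of a non-materialized view can be demonstrated with any SQL engine that expands views at query time. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, not TiDB, and the table and view names are hypothetical. It shows that a view reflects new base-table rows immediately and returns the same result as a query with the view definition inlined by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO orders VALUES (1, 50), (2, 150);
    CREATE VIEW big_orders AS SELECT id, amount FROM orders WHERE amount > 100;
""")

view_query = "SELECT id FROM big_orders WHERE id > 0 ORDER BY id"
# The same query with the view definition substituted as a subquery.
inlined_query = (
    "SELECT id FROM (SELECT id, amount FROM orders WHERE amount > 100) AS v "
    "WHERE id > 0 ORDER BY id"
)

before = conn.execute(view_query).fetchall()
conn.execute("INSERT INTO orders VALUES (3, 300)")  # change only the base table
after = conn.execute(view_query).fetchall()
inlined = conn.execute(inlined_query).fetchall()
conn.close()
```

Nothing is stored for the view itself, so there is no refresh step: the new row appears in the view's result as soon as it is inserted into the base table.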

Citations

19 sources

[01] Database for AI Agents | TiDB Distributed SQL | TiDB. pingcap.com. Accessed: 2026-07-18
[02] GitHub - pingcap/tidb: TiDB is built for agentic workloads that grow unpredictably, with ACID guarantees and native support for transactions, analytics, and vector search. No data silos. No noisy neighbors. No infrastructure ceiling. · GitHub. github.com. Accessed: 2026-05-27
[03] Home | TiDB Docs. pingcap.com. Modified: 2026-06-05. Accessed: 2026-06-05
[04] TiDB - Wikipedia. wikipedia.org. Modified: 2026-02-03. Accessed: 2026-06-04
[05] https://twitter.com/PingCAP. twitter.com
[06] https://www.pingcap.com/docs/overview#tidb-introduction. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[07] https://www.pingcap.com/docs/op-guide/history-read#reading-data-from-history-versions. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[08] Rust in TiKV | TiDB. pingcap.com. Accessed: 2026-06-07
[09] TiDB Internal (I) - Data Storage | TiDB. pingcap.com. Accessed: 2026-06-07
[10] TiDB Internal (I) - Data Storage | TiDB. pingcap.com. Accessed: 2026-06-07
[11] https://www.pingcap.com/docs/sql/transaction-isolation#tidb-transaction-isolation-levels. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[12] TiFlash Overview - v6.1 | TiDB Docs. pingcap.com. Modified: 2026-02-14. Accessed: 2026-06-01
[13] Explain Statements That Use Joins | TiDB Docs. pingcap.com. Modified: 2026-05-27. Accessed: 2026-06-01
[14] https://www.pingcap.com/docs/tools/tidb-binlog-kafka#tidb-binlog-user-guide. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[15] https://www.pingcap.com/docs/sql/mysql-compatibility#compatibility-with-mysql. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[16] TiFlash Overview | TiDB Docs. pingcap.com. Modified: 2026-05-27. Accessed: 2026-06-01
[17] https://www.pingcap.com/docs/sql/mysql-compatibility#unsupported-features. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[18] https://www.pingcap.com/docs/overview#tidb-architecture. pingcap.com (dead link, check archive). Modified: 2026-05-14. Accessed: 2026-05-21
[19] https://pingcap.com/docs/v3.0/reference/sql/statements/create-view/. pingcap.com (dead link, check archive). Modified: 2026-05-22. Accessed: 2026-05-27

Revision #20 Last Updated: 2022-04-02 14:22