DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

WiredTiger


WiredTiger is an open-source key/value storage engine for NoSQL databases and is currently the default storage engine for MongoDB.

Source Code
https://github.com/wiredtiger/wiredtiger[02]
Developer
Country of Origin
AU, US
Start Year
2008 [21]
Acquired By
Coding Agents
Project Types
Commercial, Open Source
Written in
C
Supported Languages
C, Java, Python
Inspired By
LevelDB
Compatible With
LevelDB, RocksDB
Operating Systems
Linux, Windows
Licenses
GPL v2, GPL v3

WiredTiger supports both B-tree and log-structured merge tree for data storage. It also supports both row store and column store. WiredTiger employs multi-version concurrency control for better scalability on multi-core architectures and supports several standard isolation levels. Though WiredTiger is often considered as a storage engine, it also has some basic schema support where schemas are provided by users at runtime.

WiredTiger is developed in C and provides Java/Python language bindings. It has a set of APIs that is compatible with LevelDB.

Database Entry

WiredTiger


WiredTiger is an open-source key/value storage engine for NoSQL databases and is currently the default storage engine for MongoDB.

WiredTiger supports both B-tree and log-structured merge tree for data storage. It also supports both row store and column store. WiredTiger employs multi-version concurrency control for better scalability on multi-core architectures and supports several standard isolation levels. Though WiredTiger is often considered as a storage engine, it also has some basic schema support where schemas are provided by users at runtime.

WiredTiger is developed in C and provides Java/Python language bindings. It has a set of APIs that is compatible with LevelDB.

History[05]


The development of the WiredTiger storage engine started in November 2008. The corresponding company WiredTiger Software was founded in 2012 by Keith Bostic (also founder of BerkeleyDB) and Michael Cahill, and was later acquired by MongoDB to replace its original mmap-based storage engine as the default storage engine.

Checkpoints[06][07]


Checkpoints in WiredTiger are different from typical checkpoints in traditional relational database systems. Essentially, a WiredTiger table consists of multiple ready-only checkpoints and a writable live tree. Checkpoints are done periodically or be manually triggered. During a checkpoint process, dirty pages in a table are flushed to disk and a new checkpoint is created. Checkpoints are ordered by timestamp and therefore can support time-travel queries. Checkpoints are deleted only when there are no running transactions referencing it. Copy-on-write techniques are used to reduce overhead of creating new checkpoints.

A successful checkpoint in WiredTiger involves four phases. We will take B-trees as an example in the following discussion of these phases. First, all dirty leaf pages are flushed to disk. Then all dirty internal pages are written to disk. Then fsync is called on the checkpoint file so that the checkpoint is persistent to disk. Finally, the root of the B-tree is atomically changed to the new root. In this way, the old root becomes a checkpoint and the new root becomes the new "live tree". This is somewhat similar to shadow paging. When a checkpoint is deleted, WiredTiger will check whether there is any dependency between the two adjacent checkpoints and merge them if there are common pages in the two B-trees.

Compression[08][09]


Multiple compression techniques are used in WiredTiger.

Block-level: Existing compression libraries can be directly plugged into WiredTiger. It supports LZ4, snappy, zlib, and Zstd.

Record-level: WiredTiger supports dictionary encoding, run-length encoding, prefix encoding, and Huffman encoding.

Concurrency Control[10]


WiredTiger employs multi-version concurrency control (MVCC), in which readers never block writers. In case of write-write conflict, one transaction will be rolled back and retried. In addition, transactions' working set is required to fit in main memory in WiredTiger.

In WiredTiger, transactions can pin particular versions of the database, which are called "named snapshots". Queries within such transactions are executed as if they are operating on the named snapshots.

Data Model


WiredTiger typically servers as a NoSQL storage engine similar to LevelDB and RocksDB. But it also supports a thins schema layer above the storage layer.

Foreign Keys


Indexes[11]


B-tree: WiredTiger uses B-tree for in-memory indexes. Hazard pointers are used in their B-tree implementation, which points to file offsets on disk files. The checksum of a child page, as well as the child page's pointer are stored in the parent page to avoid potential memory flip issues.

Log-structured merge tree: LSM trees are used in WireTiger to improve write throughput. Each LSM tree has a corresponding Bloom filter so that unnecessary disk access can be avoided.

Isolation Levels[12]


Three isolation levels are provided: READ_UNCOMMITTED, READ_COMMITTED, and SNAPSHOT. READ_COMMITTED is the default isolation level.

Joins[13]


Nested loop joins in WiredTiger are supported by join cursors. When a join cursor is created, it creates corresponding iterators on tables to be joined. Indexes can be used to narrow down the range of iterators for WHERE clauses in joins.

Logging[14]


Durability in WiredTiger is often done by checkpointing. However, additional logging (journalling) is also supported. In WiredTiger, log records for writes are flushed to disk when a transaction commits. Group commits are supported for faster logging.

WireTiger uses physical logging, where each log record represents a single write operation. Each log record contains: transaction id, operation type, file id, operation key, and operation value.

Query Compilation


Query Execution[15]


Query Interface[16][17]


WiredTiger has a command line tool that supports basic read/write functionalities. WiredTiger provides language bindings for C/Python/Java and databases are accessed via cursors. It also provides optional asynchronous query APIs with callback mechanism if high throughput is desired.

Storage Architecture[18]


WiredTiger's storage architecture is optimized for multi-core CPUs and large memory. Its hybrid architecture contains two key components: in-memory cache and disk block manager. WiredTiger's in-memory cache is implemented in B-tree with hazard pointers, where B-tree nodes are organized in page granularity and cache eviction policy is LRU. The block manager is responsible for raw block I/O operations such as read, write, sync, compression and checkpoints. Blocks are managed in a skip list for efficient indexing and allocation.

Storage Model[19]


WiredTiger supports both row store and column store. Row store has better write performance whereas column store provides better read performance and low memory usage. WiredTiger supports mixed storage model for a particular table, e.g. a column-store table with a row-store index.

Storage Organization


WiredTiger supports both heap files and log-structured storage. The storage organization for a table is specified when a table is created. For heap file storage, B-tree is used as the index data structure and on-disk blocks are maintained by WiredTiger's block manager. For log-structured storage, a specialized LSM tree manager is used to manage data in the LSM tree.

Stored Procedures


System Architecture[20]


WiredTiger typically serves as a pluggable key-value storage engine for NoSQL databases, but also supports schemas and multi-version concurrency control. There is a thin schema layer in WiredTiger, in which the structure of a table is provided at runtime. Right below the schema layer, there is an in-memory caching layer, which heavily uses lock-free data structures (such as lock-free skip list and B-tree) and is optimized for modern multi-core architectures. Below the caching layer is a block storage manager, where log-structured merge trees are used to improve write throughput and reduce write amplification.

Views


Citations

23 sources
  1. https://www.mongodb.com mongodb.com
  2. GitHub - wiredtiger/wiredtiger: WiredTiger's source tree · GitHub github.com
  3. WiredTiger: Reference Guide wiredtiger.com
  4. WiredTiger - Wikipedia wikipedia.org
  5. https://www.mongodb.com/press/wired-tiger mongodb.com Dead — Check Archive
  6. Checkpoints · wiredtiger/wiredtiger Wiki · GitHub github.com
  7. A Technical Introduction to WiredTiger | PPTX slideshare.net
  8. WiredTiger: File formats and compression wiredtiger.com
  9. WiredTiger: Compressors wiredtiger.com
  10. WiredTiger: Transactions wiredtiger.com
  11. WiredTiger: Log-Structured Merge Trees wiredtiger.com
  12. WiredTiger: Transactions wiredtiger.com
  13. WiredTiger: Join cursors wiredtiger.com
  14. WiredTiger: Log cursors wiredtiger.com
  15. WiredTiger: Cursors wiredtiger.com
  16. WiredTiger: WT_CONNECTION Struct Reference wiredtiger.com
  17. WiredTiger: WiredTiger command line utility wiredtiger.com
  18. Block Manager Overview · wiredtiger/wiredtiger Wiki · GitHub github.com
  19. WiredTiger: WiredTiger Architecture wiredtiger.com
  20. A Technical Introduction to WiredTiger | PPTX slideshare.net
  21. First real build infra-structure -- I can now configure and build a static library with a single file. --HG-- branch : keith github.com
  22. https://github.com/wiredtiger/wiredtiger/commit/cd100c2adc89ea0952a923f00da9be0a831dda28 github.com
  23. https://github.com/wiredtiger/wiredtiger/commit/85f8c47e64c057a98f9bd0995485786c9036f81a github.com
Revision #25 Last Updated: