ForestDB is a single-node key-value storage engine Couchbase server. It addresses the performance issue on current B+ tree and LSM-tree indexing on variable length strings. ForestDB uses an in-memory log structured write buffer index and an on-disk HB+-trie as indexing. HB+-trie splits keys into chunks and stored in normal B+ tree nodes to guide the traffic. The write buffer index stores the disk locations of records not in HB+-trie and the log-structure of write buffer index reduces disk I/O.
The ForestDB project came from ACM SIGMOD 2011 programming contest. It is now under the development of Couchbase's Caching and Storage team to replace the previous Couchstore storage engine and plans to conduct optimizations for SSD, support reduce feature and allow users to pause/resume compaction.
Multi-version Concurrency Control (MVCC)
ForestDB supports Multi-version Concurrency Control and uses append-only manner to store records. Besides, its indices are also designed to be easily maintained in MVCC. The in-memory WB index is log-structured. All B+ tree nodes in the HB+-trie are also updated in append-only manner.
As a common key/value storage engine, ForestDB supports create
, read
, update
and delete
operations with a key/value pair and the metadata of keys. The maximum size of key and its metadata are both 64KB while that of value is 4GB. ForestDB supports retrieving metadata and the value by their corresponding key and the sequence number. It also supports retrieving metadata and the value of a record directly by its byte offset on disk.
All records (documents) are stored in log-structured on disk. All updates are appended at the end of files on the disk. Similarly, all nodes of the HB+-trie index are also stored on disk. When there are updates on the HB+-trie index, those updates are also appended at the end of the file as new nodes in the HB+-trie. After all updates on the index are appended, a new block storing pointers to the new nodes is appended at the end of the file.
N-ary Storage Model (Row/Record)
In ForestDB, a file on disk contains multiple blocks. Within each block, ForestDB uses NSM to organize records. Each record stores the value and the metadata of the key. Users can enable block alignment in ForestDB, which does not allow records to be stored across blocks. If the record size is larger than the size of a single block, ForestDB will compare the number of blocks required if the record were stored at the end of last record against if the record were stored at the beginning of the next block. ForestDB will choose the way requiring fewer blocks.
Read Uncommitted Read Committed
Currently, the forestDB only supports two isolation levels: READ COMMITTED
and READ UNCOMMITTED
.
ForestDB stores all mutations in append-only manner. For each insert and update, ForestDB directly appends the new key/value pair at the end of a file. For each delete, ForestDB marks the key/value pair as deleted. It uses periodical compaction to conduct limit the log size. For each compaction, all records (key/value pairs) that are committed will be written to a new file. Once the compaction completes, ForestDB replaces the old file with the new file.
https://github.com/couchbaselabs/forestdb
https://github.com/couchbase/forestdb
https://github.com/couchbase/forestdb/wiki
Couchbase, Inc.
2013
C++, Go, Java, Objective-C, Python
Android, iOS, Linux, OS X, Windows