NutsDB

View Current Viewing Revision #4 from 11/24/2019 12:42 p.m.

NutsDB is a key/value store database written in Go. Like other database systems, NubsDB uses transaction to track all the operations. All transactions happened in NutsDB is fully serializable and to better store and manipulate data, it supports many different data structures such as list, set and sorted set. Transactions are categorized into types based on their level of access. In other words, there are read-only and read-write transactions. Multiple read-only transactions are allowed at the same time, and they could, of course, read values from a bucket using a key or traverse the buckets to find a range of key-value pairs. However, only one read-write transaction is allowed at the same time and it could read, update and delete from the data storage.

History

NutsDB is developed mainly by Jiajun Xu, currently a CTO for a Chinese tech company based in Hangzhou. The motivation is that the author wants a simpler, faster, and more persistent key/value store database system. The first commit is on Dec 7, 2018, and the development goes from that point till Aug 29, 2019, which is the latest commit. There are in total of 303 commits, and most of which are coming from the author himself. It currently has 948 stars and growing. It is not derived or forked from any other database systems. However, the author adopted Bitcask as the key/value store and retrieval model. The author further optimized it using B+ tree as index to compensate Bitcask's limitations in range and prefix scanning.

Storage Model

Custom

NutsDB used Bitcask to support key/value storage. There's no row or column in the database. There are, however, log records that are being appended to the log files.

Data Model

Key/Value

NutsDB supports key/value storage and all key/value pairs are stored in buckets, in which all keys must be unique. However, the same keys are allowed in different buckets.

NutsDB adopted Bitcask, which is a log-structure key/value storage system. All the updates and write operations are appended to the end of the log. It does not support random read and write. Rather, to delete a value, it simply append a new record marking the old value as deleted. To update, it will directly append the new value, and the system will use the newest value always, according to the timestamp. Therefore, the old values, even though still exist in the file, are simply discarded. An obvious drawback is that the log file will become huge after running for a while or the amount of data processed is huge, therefore causing the launch of NutsDB to be slow since it needs to process the log file first.

It also supports other data structures: list, set and sorted set. The APIs follows Redis commands.

Query Interface

Custom API

For ordinary key/value queries, NutsDB used its own APIs to insert, delete and update key/value pairs in the buckets. Since all the operations are monitored and organized by transactions, the API calls are made by transactions, or Tx. Tx.Put Tx.Get Tx.Delete are three operations that could be used to query from the database using key/value pairs.

For range and prefix scan, NutsDB supports Tx.PrefixScan and Tx.RangeScan, and further user has the option to limit the number of entries being returned.

For manipulating data structures such as List, Set and OrderedSet, NutsDB follows the traditions in the Redis commands.

Indexes

B+Tree

NutsDB uses Bitcask as a key/value store. This model is fast and reliable, but it also introduces some shortcomings in terms of the performance. Traditional key/value store does not support range and prefix searching. To perform a scan, it needs to query the keys one by one, which is not desirable. Therefore, the author optimized it by incorporating B+ tree indexing to support easy scan and searching. With B+ tree, range query and prefix matching could be performed with ease.

Upon launch, there are three modes of indexing to read the log files to reconstruct the database, varying in the amount of memory required. HintKeyValAndRAMIdxMode is the default mode, which is the most memory consuming but the most efficient one. This mode is most suitable for situations when the memory is not an issue. HintKeyAndRAMIdxMode is less memory consuming by storing values on disk and use the index to find the offsets of the values. This is most suitable when the number of values is much more than that of the keys. HintKeyValAndRAMIdxMode is the most memory efficient but the slowest one. It used multilevel indexing, which means it does not requires too much memory.

Storage Organization

Log-structured

The actual storage is following the Bitcask model with log-structured file tracking the changes of the database. It will keep a B+ tree indexing to search for the offset of the values stored on disk.

System Architecture

Embedded

Isolation Levels

Serializable

NutsDB is claimed to be serializable. Based on Read-Write locks and the exclusiveness between Read-Only and Read-Write operations, each transaction would be isolated and serialized. At a certain time, only one Read-Write operation is allowed, which guarantees no write conflicts.

Views

Virtual Views

NutsDB support all kinds of Views using db.View. In fact, there are only two kinds of operations supported, db.View and db.Update. The first one is responsible for all Read-Only tasks and the latter Read-Write tasks. To get a View, one could simply use tx.Get to do a key/value query, tx.RangeScan or tx.PrefixScan to do a range and prefix scan query, and tx.GetAll to get all the key/value pairs from all the buckets.

Storage Architecture

Hybrid

NutsDB has different modes of storage. In all modes, the log files will be used, which is on disk, to reconstruct database.

For the first mode, it is the pure in-memory, and it is the most efficient one. However, memory usage would be the bottleneck here, especially when you don't have enough.

The second mode is partially in-memory and partially disk-oriented. Under this mode, NutsDB will store the values on the disk and use the key to get the offset for the corresponding values. It is more memory efficient, but it is slower. This is the most suitable when the values significantly outnumber the keys.

The third mode adopted multi-level indexing on disk, so it is understandable that while it is the most memory efficient, it is the slowest.

Overall, the second and the third mode supports larger-than-memory databases, since they will be querying from the disk.

Logging

Logical Logging

Since NutsDB adopted Bitcask, it uses logging to record every single operation that made changes to the database. Insert Insertion of a data key/value pair is simply appending to the end of the logging file. If the log file results in a huge size (after running for a long time or a transaction that involves too many writes), it will start a new file but keep the old files. Delete Deletion is also simply appending to the end of the logging file by marking that value to be deleted. Update Update is also simply appending, and the database will keep track of the time stamp, since only the latest value is the valid ones. In other words, the values with old timestamps, even though still present in the logging file, will be discarded. Merge Logs Eventually, the log file will be too huge and NutsDB will initiate a merge operation to merge all the log files. It will remove all the duplicates, delete the values that are marked as deleted, and keep only the newest value with the latest timestamp during updates.

Revision #4 | Updated 11/24/2019 12:42 p.m.

View Current Viewing Revision #4 from 11/24/2019 12:42 p.m.

Developer

Jiajun Xu

Country of Origin

Start Year

2018

Project Type

Open Source

Written in

Supported languages

Operating Systems

Linux, OS X, Windows

Licenses

Apache v2