Tarantool

Tarantool integrates a Lua application server with a database management system. The DBMS was originally developed as an in-memory NoSQL DBMS and was later extended with an optional disk storage engine. Tarantool's in-memory engine is lock-free. It uses cooperative multitasking to handle thousands of connections simultaneously. There is a fixed number of independent execution threads, and they do not share state. The disk-based storage engine also exploits single-threaded request execution and hence avoids unnecessary synchronization overhead. Tarantool also supports secondary indexes, asynchronous replication, and some SQL operations.

History

Tarantool’s creator and biggest user is Mail.Ru, the largest internet company in Russia. Although Mail.Ru sponsors product development, Tarantool is open source and incorporates patches from dozens of community contributors. At the time of writing, the most recent stable release is 1.10, and the 2.x version is in beta.

Indexes

B+Tree, Hash Table, BitMap, R-Tree

Tarantool has two storage engines: (1) memtx, the in-memory storage engine, and (2) vinyl, the on-disk storage engine. Memtx is the default engine and was the first to be developed.

The memtx engine supports TREE, HASH, RTREE, and BITSET indexes.

Vinyl supports only TREE indexes.

Data Model

Key/Value

The basic data unit is a tuple, which is composed of fields. A tuple corresponds to a 'row' or 'record'. Tuples must be covered by a primary index and can also be covered by secondary indexes, which may be non-unique. Fields are similar to regular 'record fields', except that (1) they can be composite structures and (2) they do not need to have names. A tuple may have an arbitrary number of fields, and the fields may be of different types. Tuples are stored as MsgPack arrays.

A space is a container for tuples. Each space has a unique identifier and a designated storage engine.
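The tuple and space model described above can be sketched in a few lines of Python. This is an illustrative toy, not Tarantool's API: the `Space` class, its method names, the choice of field 0 as the primary key, and the space identifier 512 are all assumptions made for the example. Python lists stand in for MsgPack arrays.

```python
class Space:
    """Toy stand-in for a space: a container of tuples (hypothetical class)."""

    def __init__(self, space_id, engine="memtx", secondary_field=1):
        self.space_id = space_id            # unique identifier of the space
        self.engine = engine                # designated storage engine
        self.secondary_field = secondary_field
        self.primary = {}                   # unique index: key -> tuple
        self.secondary = {}                 # non-unique index: key -> list of tuples

    def insert(self, tup):
        key = tup[0]                        # assumption: field 0 is the primary key
        if key in self.primary:
            raise KeyError(f"duplicate primary key: {key!r}")
        self.primary[key] = tup
        if len(tup) > self.secondary_field:
            self.secondary.setdefault(tup[self.secondary_field], []).append(tup)

    def get(self, key):
        return self.primary.get(key)

    def select_secondary(self, key):
        return list(self.secondary.get(key, []))


space = Space(space_id=512)
space.insert([1, "admin", {"tags": ["a", "b"]}])   # a field can be a composite structure
space.insert([2, "guest"])                          # tuples may differ in length
space.insert([3, "admin", 42, 3.14])                # fields may differ in type
admins = space.select_secondary("admin")            # non-unique key: two matches
```

Both indexes hold references to the same list objects, so a tuple found through the secondary index is the very object stored under the primary key.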

Foreign Keys

Not Supported

Tarantool is a NoSQL DBMS that does not support foreign keys.

Storage Model

Custom

Tuples in Tarantool are stored as MsgPack arrays. The storage engine maintains indexes that map keys to tuple pointers.

Query Interface

Custom API, Stored Procedures, Command-line / Shell

Tarantool incorporates an application server and provides a command-line console. Lua is the native language for writing applications, but languages such as C, C++, and Python are also supported. Tarantool supports triggers in Lua and stored procedures in Lua and C. Starting from version 2.0 (currently in beta), Tarantool supports some basic SQL operations.

Concurrency Control

Optimistic Concurrency Control (OCC)

Tarantool uses a single thread, called the 'transaction processor thread', to process all transactions of a database instance. Thus the design is lock-free. Transactions run in fibers on that single thread. A fiber is a set of instructions that may contain “yield” signals (a yield can be explicit or implicit, e.g., a system call). The transaction processor thread executes instructions until a yield and then switches to another fiber that is ready to run. This scheduling scheme is called cooperative scheduling. It means that unless a running fiber deliberately yields control, it cannot be preempted by other fibers. Thus, a transaction's author is responsible for not writing long-running computations without a yield. There is also a 'network thread' that parses and ships messages, and a write-ahead logging thread. While this design limits the number of cores that the DBMS can use, it removes competition for the memory bus and ensures high scalability of memory access and network throughput.
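Cooperative scheduling as described above can be illustrated with Python generators standing in for fibers. This is a minimal sketch, not Tarantool's fiber implementation: the names `fiber` and `run`, and the round-robin ordering of ready fibers, are assumptions for the example.

```python
from collections import deque


def fiber(name, steps, log):
    """A toy fiber: does `steps` units of work and yields after each one."""
    for i in range(steps):
        log.append(f"{name}:{i}")   # work between yields is never interrupted
        yield                       # explicit yield: hand control to the scheduler


def run(fibers):
    """Single-threaded scheduler: a fiber runs until it yields or finishes."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)           # resume the fiber until its next yield
            ready.append(current)   # still alive: requeue behind the others
        except StopIteration:
            pass                    # fiber finished, drop it


log = []
run([fiber("a", 2, log), fiber("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

A fiber that never reached a `yield` would keep the scheduler loop from ever switching, which is why long computations without a yield stall every other fiber.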

When a transaction commits, a yield happens and the changes are written to the WAL. A simple optimistic scheduler is used: the first transaction to commit wins. Any active transaction that has read a value modified by a committed transaction will abort. Moreover, Tarantool's cooperative scheduler implementation ensures that, in the absence of yields, a multi-statement transaction is not preempted and thus will never be aborted.
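The first-committer-wins rule can be sketched as follows. This is a simplified model under stated assumptions: the `Store`, `Txn`, and `Conflict` names are hypothetical, conflicts are detected with per-key version counters, and the losing transaction is aborted when it tries to commit rather than at the moment the winner commits.

```python
class Conflict(Exception):
    """Raised when a transaction read a value that was changed by a committed one."""


class Store:
    def __init__(self):
        self.data = {}
        self.version = {}                   # key -> number of committed writes


class Txn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}             # key -> version seen at first read
        self.writes = {}                    # buffered writes, applied at commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validate: abort if anything this transaction read has since changed.
        for key, seen in self.read_versions.items():
            if self.store.version.get(key, 0) != seen:
                raise Conflict(key)
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1


store = Store()
store.data["balance"] = 100

t1, t2 = Txn(store), Txn(store)
t1.put("balance", t1.get("balance") + 50)
t2.put("balance", t2.get("balance") - 50)

t1.commit()                                 # first to commit wins
try:
    t2.commit()
    aborted = False
except Conflict:
    aborted = True                          # t2 read a value that t1 modified
```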

System Architecture

Shared-Nothing

Tarantool supports asynchronous replication, either locally or on remote hosts. Tarantool supports both master-replica and master-master configurations.

In a master-replica configuration, replicas can only serve reads. A replica stays synchronized with the master by continuously fetching and applying the master's write-ahead log (WAL).

In master-master configuration, any node can handle both read and write requests. Tarantool only guarantees that each change on a master is propagated to all nodes and is applied only once. However, changes from different masters can be mixed and applied in a different order on different nodes.
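The master-master guarantee above (each change is applied exactly once, but the order may differ between nodes) can be illustrated with a toy model. The `apply_changes` function and the change identifiers are hypothetical, and last-applied-wins is an assumption of this sketch, not a statement about how Tarantool resolves conflicts.

```python
def apply_changes(changes):
    """Apply (change_id, key, value) records, skipping any already applied."""
    state = {}
    applied = set()
    for change_id, key, value in changes:
        if change_id in applied:
            continue                # a redelivered change is applied only once
        applied.add(change_id)
        state[key] = value          # assumption: the last applied write wins
    return state


from_master_a = ("a-1", "x", 1)
from_master_b = ("b-1", "x", 2)

# Both nodes receive both changes, but in a different order.
node1 = apply_changes([from_master_a, from_master_b, from_master_a])
node2 = apply_changes([from_master_b, from_master_a])
```

In this model the two nodes end with different values for `x`, which shows why applications using a master-master configuration typically need to avoid conflicting writes to the same key from different masters.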

Isolation Levels

Serializable

Since Tarantool uses only a single thread for transactions and the first transaction to commit always wins, only the serializable isolation level is supported. The release notes describe Tarantool as providing 'serializable Snapshot Isolation (SSI)'.

Logging

Logical Logging

Tarantool uses write-ahead logging (WAL). It writes each data change request (insert, update, delete, replace, upsert) to a WAL file, so the logging scheme is logical.
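Logical logging means that the log records the requests themselves, so recovery replays them in order. The sketch below is a simplified model: the `LoggedStore` class, the JSON line format, and the in-memory list standing in for a WAL file are assumptions of the example and do not reflect Tarantool's on-disk format. Only two of the request types are modeled.

```python
import json


def apply(state, record):
    """Apply one logged request to a key -> tuple mapping."""
    op, key = record["op"], record["key"]
    if op == "replace":
        state[key] = record["value"]
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported operation in this sketch: {op!r}")


class LoggedStore:
    def __init__(self):
        self.state = {}
        self.wal = []                           # stand-in for an append-only WAL file

    def execute(self, record):
        self.wal.append(json.dumps(record))     # log the request first
        apply(self.state, record)               # then apply it in memory


def recover(wal_lines):
    """Rebuild the state from scratch by replaying the logged requests."""
    state = {}
    for line in wal_lines:
        apply(state, json.loads(line))
    return state


store = LoggedStore()
store.execute({"op": "replace", "key": 1, "value": [1, "a"]})
store.execute({"op": "replace", "key": 2, "value": [2, "b"]})
store.execute({"op": "delete", "key": 1})
recovered = recover(store.wal)
```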

Storage Architecture

Disk-oriented, In-Memory

The in-memory storage engine memtx is the default engine. The disk-based storage engine Vinyl can be used when data cannot fit in memory, but it lacks some functions and options that are available with memtx.

Vinyl's underlying data structure is the log-structured merge-tree (LSM tree). Vinyl differs from common libraries like RocksDB in that it exploits the DBMS property that transactions execute in a dedicated thread. This allows it to remove unnecessary locks, interprocess communication, and other overhead.
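For readers unfamiliar with LSM trees, the following is a generic, single-threaded sketch of the idea: writes go to an in-memory table, full tables are flushed as sorted runs, reads check the newest data first, and compaction merges runs. It is not Vinyl's implementation; the `TinyLSM` class, the `memtable_limit` parameter, and the in-memory lists standing in for on-disk runs are assumptions of the example.

```python
import bisect

TOMBSTONE = object()    # marker recording that a key was deleted


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}                  # newest writes, held in memory
        self.runs = []                      # sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        self.put(key, TOMBSTONE)            # deletes are writes of a tombstone

    def flush(self):
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in self.runs:               # newest run first
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                value = run[i][1]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        merged = {}
        for run in reversed(self.runs):     # oldest first, so newer values overwrite
            merged.update(run)
        live = [(k, v) for k, v in merged.items() if v is not TOMBSTONE]
        self.runs = [sorted(live)]


lsm = TinyLSM(memtable_limit=2)
lsm.put("a", 1)
lsm.put("b", 2)         # memtable is full: flushed as the first run
lsm.put("a", 10)
lsm.delete("b")         # memtable is full again: flushed as a newer run
value_a = lsm.get("a")
value_b = lsm.get("b")
lsm.compact()
```

Because no locks appear anywhere in the sketch, it also hints at the simplification that a dedicated transaction thread makes possible.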

Views

Virtual Views

Tarantool has supported SQL views since the alpha release that introduced SQL support. Currently, only CREATE VIEW and DROP VIEW are supported, so materialized views are not yet available.

Checkpoints

Non-Blocking

Tarantool uses write-ahead logging (WAL), so checkpoints are necessary to limit the log file size. The documentation refers to checkpoints as snapshots. Users can either force the DBMS to take a snapshot or enable automatic creation of snapshot files. Users can control the number of snapshots stored and the snapshot interval.

During a snapshot, copy-on-write and multi-version concurrency control techniques are used. When the master process changes part of a primary key, the corresponding page splits and the snapshot process obtains an old copy of the page. Hence taking a snapshot does not need to block.
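The copy-on-write idea behind non-blocking snapshots can be sketched at page granularity. This is a toy model, not Tarantool's implementation: the `CowStore` class, the page layout, and the use of immutable Python tuples as pages are assumptions of the example.

```python
class CowStore:
    def __init__(self, pages):
        self.pages = list(pages)            # live pages, each an immutable tuple

    def snapshot(self):
        # Copies only the page references, so the cost does not depend on page size.
        return list(self.pages)

    def write(self, page_no, slot, value):
        page = list(self.pages[page_no])    # copy the page being modified
        page[slot] = value
        self.pages[page_no] = tuple(page)   # install the new copy in the live store


store = CowStore([("a", "b"), ("c", "d")])
snap = store.snapshot()                     # snapshot starts
store.write(0, 1, "B")                      # a write arrives while it is in progress
```

The snapshot still sees the old copy of page 0, while the untouched page 1 is shared between the snapshot and the live store, so the writer never has to wait for the snapshot to finish.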

Website

http://tarantool.org/

Source Code

https://github.com/tarantool/tarantool

Tech Docs

https://tarantool.io/en/doc/1.10/

Developer

Tarantool

Country of Origin

RU

Start Year

2008

Project Type

Commercial, Open Source

Written in

C

Supported languages

C, C++, Lua, Python

Derived From

SQLite

Operating Systems

Linux

Licenses

BSD