Tarantool

Tarantool combines a Lua application server with a database management system. The DBMS was originally developed as an in-memory NoSQL store and was later extended with an optional disk storage engine. Tarantool's in-memory engine is lock-free and uses cooperative multitasking to handle thousands of connections simultaneously. It runs a fixed number of independent execution threads that do not share state. The disk-based storage engine also takes advantage of single-threaded request processing and therefore avoids unnecessary locks. Tarantool supports asynchronous replication.

History

Tarantool's creator and biggest user is Mail.Ru, the largest internet company in Russia. Mail.Ru sponsors product development, but the project is open source and incorporates patches from dozens of community contributors. Most of its components are written from scratch, and the DBMS is still under active development.

Storage Architecture

Disk-oriented, In-Memory

Tarantool has two storage engines: (1) memtx, the in-memory storage engine, and (2) vinyl, the on-disk storage engine. Memtx is the default engine and was the first to be developed. Vinyl can be used when the data cannot fit in memory, but it lacks some functions and options that are available with memtx. Vinyl's underlying data structure is the LSM tree.

Logging

Logical Logging

Tarantool uses write-ahead logging (WAL). It writes each data change request (insert, update, delete, replace, upsert) to a WAL file, so the logging is logical: the log records the requests themselves.
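The idea of logging requests rather than their physical effects can be illustrated with a small sketch. This is a toy model, not Tarantool's WAL format or API: the names LogicalWAL and apply_request are hypothetical, the log lives in a Python list instead of a file, and only the replace and delete requests are modeled.

```python
# Toy logical write-ahead log. Each entry is the request itself,
# not the bytes it changed. All names here are hypothetical.

def apply_request(space, request):
    """Apply one logged request to a dict keyed by primary key."""
    op, key, value = request
    if op == "replace":
        space[key] = value
    elif op == "delete":
        space.pop(key, None)
    else:
        raise ValueError(f"unknown operation {op!r}")


class LogicalWAL:
    def __init__(self):
        self.entries = []  # stands in for the WAL file

    def execute(self, space, request):
        self.entries.append(request)   # record the request first
        apply_request(space, request)  # then apply it to the live data

    def replay(self):
        """Rebuild the data set from scratch by re-running every request."""
        space = {}
        for request in self.entries:
            apply_request(space, request)
        return space


wal = LogicalWAL()
live = {}
wal.execute(live, ("replace", 1, ("alice", 30)))
wal.execute(live, ("replace", 2, ("bob", 25)))
wal.execute(live, ("replace", 1, ("alice", 31)))
wal.execute(live, ("delete", 2, None))

recovered = wal.replay()
```

Replaying the four logged requests in order reproduces the live state, which is what recovery from a logical log relies on.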

Data Model

Key/Value

The basic unit of data is a tuple, which is composed of fields and corresponds to a 'row' or 'record'. Every tuple must be covered by a primary index and can also be covered by secondary indexes, which may be non-unique. Fields are similar to regular record fields, except that (1) they can be composite structures and (2) they do not need to have names. A tuple may have an arbitrary number of fields, and the fields may be of different types. Tuples are stored as MsgPack arrays.

A space is a container for tuples. Each space has a unique identifier and a designated storage engine.
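The relationship between spaces, tuples, and indexes can be sketched as follows. This is an illustrative model only and not the Tarantool API: the Space class, its methods, the space id 512, and the choice of field 0 as primary key and field 1 as secondary key are all assumptions made for the example.

```python
# Hypothetical in-memory stand-in for a space with one unique primary
# index (on field 0) and one non-unique secondary index (on field 1).

class Space:
    def __init__(self, space_id, engine="memtx"):
        self.id = space_id      # unique identifier of the space
        self.engine = engine    # designated storage engine
        self.primary = {}       # field 0 -> tuple (unique)
        self.secondary = {}     # field 1 -> set of primary keys (non-unique)

    def insert(self, tup):
        key = tup[0]
        if key in self.primary:
            raise KeyError(f"duplicate primary key {key!r}")
        self.primary[key] = tup
        if len(tup) > 1:
            self.secondary.setdefault(tup[1], set()).add(key)

    def select_secondary(self, value):
        """Return all tuples whose field 1 equals value, ordered by primary key."""
        keys = sorted(self.secondary.get(value, ()))
        return [self.primary[k] for k in keys]


users = Space(space_id=512)
# Tuples in one space may differ in length and in field types.
users.insert((1, "admin", ["read", "write"]))
users.insert((2, "user"))
users.insert((3, "admin", {"quota": 10}, 3.5))

admins = users.select_secondary("admin")
```

Here two tuples share the secondary key "admin", while each primary key identifies exactly one tuple.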

Foreign Keys

Not Supported

Tarantool is a NoSQL DBMS that does not support foreign keys.

Concurrency Control

Optimistic Concurrency Control (OCC)

Tarantool processes all transactions of a database instance on a single thread, called the transaction processor thread, so the design is lock-free. Transactions run in fibers on that thread. A fiber is a sequence of instructions that may contain yield points; a yield can be explicit or implicit (for example, a system call). The transaction processor thread executes a fiber's instructions until it reaches a yield and then switches to another fiber that is ready to run. This scheme is called cooperative scheduling: a running fiber cannot be preempted by other fibers unless it deliberately yields control. The author of a transaction is therefore responsible for not writing long-running computations without a yield.
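Cooperative scheduling on one thread can be mimicked with Python generators, where each generator plays the role of a fiber and the yield statement marks the point at which it gives up control. This is a minimal sketch of the general technique, not Tarantool's fiber implementation; the functions fiber and run and the round-robin ready queue are assumptions made for the illustration.

```python
from collections import deque


def fiber(name, steps, trace):
    """A toy fiber: does one unit of work, then yields control."""
    for i in range(steps):
        trace.append(f"{name}:{i}")
        yield  # explicit yield point; nothing can interrupt the code above it


def run(fibers):
    """Single-threaded round-robin scheduler over the ready fibers."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the fiber's next yield
            ready.append(current)  # still alive: back of the ready queue
        except StopIteration:
            pass                   # fiber finished; drop it


trace = []
run([fiber("a", 2, trace), fiber("b", 3, trace)])
```

The scheduler only regains control when a fiber yields, so a fiber that never yielded would keep every other fiber waiting.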

When a transaction commits, a yield happens and its changes are written to the WAL. A simple optimistic scheduler is used: the first transaction to commit wins, and any active transaction that has read a value modified by a committed transaction is aborted. Moreover, Tarantool's cooperative scheduler ensures that, in the absence of yields, a multi-statement transaction is not preempted and thus is never aborted.
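The first-committer-wins rule can be sketched with a small read-set validation scheme. This is a simplified model and not Tarantool's code: the Store, Transaction, and Conflict names are hypothetical, and the conflict is detected when the losing transaction tries to commit, whereas the text describes the active transaction being aborted once the other one commits.

```python
class Conflict(Exception):
    """Raised when a transaction read a key that was changed after it began."""


class Store:
    def __init__(self):
        self.data = {}
        self.version = {}  # key -> number of the commit that last wrote it
        self.commits = 0   # global commit counter


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start = store.commits  # commit counter when the transaction began
        self.reads = set()
        self.writes = {}

    def get(self, key):
        self.reads.add(key)
        if key in self.writes:
            return self.writes[key]
        return self.store.data.get(key)

    def put(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        # Validate: abort if any key we read was committed by someone else since we began.
        for key in self.reads:
            if self.store.version.get(key, 0) > self.start:
                raise Conflict(key)
        self.store.commits += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.commits


store = Store()
seed = Transaction(store)
seed.put("balance", 100)
seed.commit()

t1 = Transaction(store)
t2 = Transaction(store)
t1.put("balance", t1.get("balance") - 30)
t2.put("balance", t2.get("balance") - 50)

t1.commit()  # first to commit wins
try:
    t2.commit()
    t2_aborted = False
except Conflict:
    t2_aborted = True
```

Both transactions read the same balance, so once t1 commits, t2's read is stale and t2 is rejected; the stored balance reflects only t1's update.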

Query Interface

SQL, Stored Procedures, Command-line / Shell

Tarantool incorporates an application server and provides a command-line console. Its native language is Lua, but applications based on Tarantool can also be written in C/C++. It supports triggers and stored procedures written in Lua. Starting with version 2.0, Tarantool supports some basic SQL operations.

Checkpoints

Non-Blocking

Because Tarantool uses write-ahead logging (WAL), checkpoints are necessary. The documentation refers to checkpoints as snapshots. Users can either force the DBMS to take a snapshot or enable automatic creation of snapshot files, and they can control both the number of snapshots kept and the snapshot interval.

During a snapshot, copy-on-write and multi-version concurrency control are used. When the master process changes part of a primary key, the snapshot process obtains an old copy of the page.
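The copy-on-write idea behind a non-blocking snapshot can be shown with a generic page-table sketch. This illustrates the general technique only; it makes no claim about how Tarantool lays out pages internally, and the PagedStore class and its methods are hypothetical.

```python
class PagedStore:
    """Toy page store whose snapshot keeps seeing pages as they were at its start."""

    def __init__(self, pages):
        self.pages = [list(p) for p in pages]
        self.snapshot = None  # frozen page table while a snapshot is running
        self.copied = set()   # pages already copied during this snapshot

    def begin_snapshot(self):
        # Copy only the page table; the page objects themselves are shared.
        self.snapshot = list(self.pages)
        self.copied = set()

    def write(self, page_no, slot, value):
        if self.snapshot is not None and page_no not in self.copied:
            # First write to this page during the snapshot: the writer
            # switches to a private copy, the snapshot keeps the old page.
            self.pages[page_no] = list(self.pages[page_no])
            self.copied.add(page_no)
        self.pages[page_no][slot] = value

    def end_snapshot(self):
        image = [list(p) for p in self.snapshot]  # what would be written to disk
        self.snapshot = None
        return image


store = PagedStore([[1, 2], [3, 4]])
store.begin_snapshot()
store.write(0, 0, 99)  # the writer is not blocked by the snapshot
untouched_page_shared = store.snapshot[1] is store.pages[1]
image = store.end_snapshot()
```

Only the page that was modified during the snapshot is duplicated; the untouched page stays shared between the writer and the snapshot.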

Indexes

B+Tree, Hash Table, BitMap, R-Tree

Tarantool has two storage engines: (1) memtx, the in-memory storage engine, and (2) vinyl, the on-disk storage engine. Memtx is the default engine and was the first to be developed.

The memtx engine supports TREE, HASH, RTREE, and BITSET indexes.

Vinyl supports only TREE indexes. The underlying implementation is LSM trees.

Storage Model

Custom

Tuples in Tarantool are stored as MsgPack arrays. The storage engine maintains indexes that map keys to tuple pointers.

Isolation Levels

Serializable

Because Tarantool uses a single thread for transactions and the first transaction to commit always wins, serializable is the only isolation level supported. The release notes describe Tarantool as providing 'serializable snapshot isolation (SSI)'.

Website

http://tarantool.org/

Source Code

https://github.com/tarantool/tarantool

Tech Docs

https://tarantool.io/en/doc/1.10/

Developer

Tarantool

Country of Origin

RU

Start Year

2005

Project Type

Commercial, Open Source

Written in

C

Supported languages

Lua

Derived From

SQLite

Operating Systems

Linux

Licenses

BSD