Silo

Silo is an in-memory database system designed for multi-core machines. It avoids centralized contention points by combining optimistic concurrency control with an epoch-based commit protocol. This protocol lets Silo provide serializability while avoiding all writes to shared memory for read-only transactions, which is the main reason Silo achieves high performance as the number of cores grows.

System Architecture

Shared-Everything

Any worker has access to the entire database.

Concurrency Control

Optimistic Concurrency Control (OCC)

Silo uses optimistic concurrency control for its transactions, built around epochs. Silo maintains a global epoch number, and each Silo worker keeps a local copy of it. Epochs are used for conflict resolution, garbage collection, and serializable recovery. Each transaction that is allowed to commit is assigned a transaction ID (TID). The TID identifies the transaction and the version of every record it wrote. The word that stores a record's TID also carries a lock bit, so it doubles as the record's lock and is the value the commit protocol compares to detect conflicts.
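The idea of a single word that holds an epoch, a version, and a lock bit can be sketched as follows. This is a minimal illustration, not Silo's actual encoding: the field widths, the bit positions, and all function names are assumptions chosen for readability.

```python
# Hypothetical 64-bit TID word layout (an assumption for illustration):
#   bit 0        lock bit
#   bits 1..31   sequence number within the epoch
#   bits 32..63  epoch number
LOCK_BIT = 1
SEQ_SHIFT = 1
SEQ_BITS = 31
EPOCH_SHIFT = SEQ_SHIFT + SEQ_BITS
SEQ_MASK = (1 << SEQ_BITS) - 1


def make_tid(epoch, seq):
    """Pack an epoch and a sequence number into one unlocked TID word."""
    return (epoch << EPOCH_SHIFT) | ((seq & SEQ_MASK) << SEQ_SHIFT)


def epoch_of(word):
    return word >> EPOCH_SHIFT


def seq_of(word):
    return (word >> SEQ_SHIFT) & SEQ_MASK


def is_locked(word):
    return bool(word & LOCK_BIT)


def with_lock(word):
    return word | LOCK_BIT


def without_lock(word):
    return word & ~LOCK_BIT


def version_of(word):
    """The version is the word with the lock bit masked out."""
    return without_lock(word)
```

Because the epoch occupies the high bits in this layout, a TID from a later epoch always compares greater than a TID from an earlier one, and masking out the lock bit gives a value that can be compared directly to tell whether a record has changed.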

The commit protocol is divided into three phases.

Phase I: Lock all records in the write set. Then read the global epoch number and store its value as the worker's local epoch number.

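Phase II: Validate all records in the read set. If any record's transaction ID differs from the one observed when it was read, or any record is locked by another transaction, the record has been modified or is being modified since the read. This is a conflict between this transaction's read and another transaction's write, so the transaction releases all its locks and aborts.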

Phase III: If the transaction has not aborted, assign it a transaction ID, apply its changes, and release all locks.
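The three phases can be sketched in a few lines. This is a single-threaded illustration under stated assumptions, not Silo's implementation: `Record`, `Database`, and `try_commit` are hypothetical names, TIDs are modeled as `(epoch, sequence)` tuples, locking the write set in sorted key order is an assumed way to avoid deadlock, and a record that is already locked causes an immediate abort where a real worker would wait.

```python
class Record:
    def __init__(self, value, tid=(0, 0)):
        self.value = value
        self.tid = tid        # TID of the last transaction that wrote this record
        self.locked = False   # stands in for the lock bit of the TID word


class Database:
    def __init__(self):
        self.global_epoch = 1
        self.records = {}


def _release(held):
    for rec in held:
        rec.locked = False


def try_commit(db, read_set, write_set):
    """read_set maps key -> TID observed at read time; write_set maps key -> new value.
    Returns the new TID on commit, or None on abort."""
    # Phase I: lock the write set in a fixed key order, then snapshot the epoch.
    held = []
    for key in sorted(write_set):
        rec = db.records[key]
        if rec.locked:        # simplification: abort instead of waiting
            _release(held)
            return None
        rec.locked = True
        held.append(rec)
    epoch = db.global_epoch

    # Phase II: validate the read set.
    for key, seen_tid in read_set.items():
        rec = db.records[key]
        locked_by_other = rec.locked and key not in write_set
        if rec.tid != seen_tid or locked_by_other:
            _release(held)
            return None

    # Phase III: pick a TID newer than everything observed, install, unlock.
    observed = [rec.tid for rec in held] + list(read_set.values())
    newest = max(observed, default=(0, 0))
    seq = newest[1] + 1 if newest[0] == epoch else 1
    new_tid = (epoch, seq)
    for key, value in write_set.items():
        rec = db.records[key]
        rec.value = value
        rec.tid = new_tid
    _release(held)
    return new_tid
```

In this sketch a transaction that read `y` before another transaction committed a write to `y` fails validation in Phase II, because the TID stored on the record no longer matches the TID it observed.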

Indexes

Masstree

A Silo table is a collection of index trees: one primary index tree and zero or more secondary index trees. Each index tree is a Masstree. Masstree reads do not write to shared memory; instead, readers rely on version numbers and fence-based synchronization to detect concurrent updates. Masstree also borrows ideas from tries, which makes its key comparisons cheaper than those of a data structure such as a B-tree.
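One way to see why the trie-like structure makes comparisons cheap is to look at fixed-width key slices: each trie layer only has to compare one machine-word-sized integer instead of a variable-length byte string. The sketch below illustrates that idea only; `key_slices` is a hypothetical helper, and the 8-byte width and zero padding are assumptions of this example.

```python
def key_slices(key: bytes, width: int = 8):
    """Split a byte-string key into fixed-width slices.

    Each slice is encoded as an unsigned big-endian integer, so comparing
    two slices as integers gives the same order as comparing the bytes.
    A short final slice is padded with zero bytes; a real index would also
    have to record the key length to tell padded keys apart.
    """
    slices = []
    for i in range(0, len(key), width):
        chunk = key[i:i + width].ljust(width, b"\x00")
        slices.append(int.from_bytes(chunk, "big"))
    return slices
```

Two keys that share their first 8 bytes produce the same first slice and are distinguished only at the second layer, which is the point at which a trie would descend.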

Isolation Levels

Serializable

Silo's commit protocol provides serializable isolation, because its behavior can be shown to be equivalent to that of strict two-phase locking.

Logging

Physical Logging

Silo uses background logger threads for logging. Each time a transaction commits, the local worker creates a new log record containing the table, key, and value of every record the transaction modified. Log records are stored in the worker's local buffer. When the buffer is full or the worker enters the next epoch, the buffered log records are pushed to the logger thread's per-worker queue. Logger threads calculate the durable epoch (all transactions with epochs smaller than the durable epoch are durable) from the epoch information contained in the log records. Logging is at the record level, and a log record is created only after its transaction commits, so the log never contains changes from uncommitted transactions and no undo information is needed; recovery only has to reapply the logged values.
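The buffering and the durable-epoch calculation can be sketched as below. This is a simplified single-threaded model under stated assumptions, not Silo's logger: `Worker`, `Logger`, and their methods are hypothetical names, handing a record to the logger stands in for writing it to stable storage, and the durable epoch is modeled as the smallest epoch any worker has reported.

```python
class Logger:
    def __init__(self, worker_ids):
        self.queues = {w: [] for w in worker_ids}        # per-worker queues
        self.latest_epoch = {w: 0 for w in worker_ids}   # last epoch reported

    def push(self, worker_id, epoch, records):
        """Receive a worker's buffered records and its current epoch."""
        self.queues[worker_id].extend(records)
        self.latest_epoch[worker_id] = epoch

    def durable_epoch(self):
        """Transactions with epochs smaller than this value are durable."""
        return min(self.latest_epoch.values())


class Worker:
    def __init__(self, worker_id, logger, capacity=4):
        self.worker_id = worker_id
        self.logger = logger
        self.capacity = capacity
        self.buffer = []
        self.epoch = 1

    def log_commit(self, writes):
        """writes is a list of (table, key, value) for one committed transaction."""
        self.buffer.append((self.epoch, writes))
        if len(self.buffer) >= self.capacity:
            self.flush()

    def enter_epoch(self, epoch):
        """Move to a new epoch and hand over everything logged so far."""
        if epoch != self.epoch:
            self.epoch = epoch
            self.flush()

    def flush(self):
        # Always report the epoch, even with an empty buffer, so that an
        # idle worker does not hold back the durable epoch.
        self.logger.push(self.worker_id, self.epoch, self.buffer)
        self.buffer = []
```

In this model the durable epoch only advances once every worker has moved past an epoch, which is why a single slow worker delays durability for all transactions of that epoch.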

Query Compilation

Not Supported

Storage Architecture

In-Memory

Silo is an in-memory database.

Query Interface

Custom API

Silo only supports one-shot requests. Currently, a one-shot request can only be written in C++, which gives the request direct access to Silo's internals. One-shot requests written in SQL have not been implemented.

Data Model

Relational

Silo is a relational database.

Website

http://dl.acm.org/citation.cfm?id=2522713

Source Code

https://github.com/stephentu/silo

Developer

Stephen Tu, Eddie Kohler

Country of Origin

US

Start Year

2013

Project Type

Academic, Open Source

Written in

C++, Python

Licenses

MIT