Silo is an in-memory database system. It avoids centralized contention points by combining optimistic concurrency control with an epoch-based commit protocol. This protocol lets Silo provide serializability while avoiding all shared-memory writes for read-only transactions, which is why Silo achieves high performance on multi-core machines.
Silo uses epoch-based optimistic concurrency control for its transactions. It maintains a global epoch number, and each Silo worker keeps a local copy of it. The epoch number is used for conflict resolution, garbage collection, and serializable recovery. Every transaction that is allowed to commit is assigned a transaction ID, which carries several pieces of information: it identifies the transaction, serves as the version of every record the transaction writes, includes a lock bit so that it can act as the record's lock, and is used to detect conflicts. The commit protocol has three phases.
Phase I: Lock all records in the write set, then read the global epoch number and store it as the worker's local epoch number.
Phase II: Check all records in the read set. If any record has a different transaction ID than the one observed when it was read, or is locked by another transaction, the record has been or is being modified by a concurrent transaction. In that case, release all locks and abort.
Phase III: If the transaction has not aborted, assign it a transaction ID, apply its changes, and release all locks.
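The three phases can be sketched as a small single-threaded simulation. This is an illustrative sketch under stated assumptions, not Silo's actual code: the names `Record`, `Database`, `Transaction`, and `EPOCH_SHIFT` are hypothetical, the lock bit is modeled as a separate boolean, a contended lock causes an abort instead of a wait, the transaction ID layout (epoch in the high bits) is assumed, and inserts and deletes are left out.

```python
from dataclasses import dataclass, field

EPOCH_SHIFT = 32  # assumed layout: epoch in the high bits, sequence number below


@dataclass
class Record:
    value: object
    tid: int = 0          # transaction ID of the last committed writer
    locked: bool = False  # stands in for the lock bit of the TID word


@dataclass
class Database:
    global_epoch: int = 1
    records: dict = field(default_factory=dict)


class Transaction:
    def __init__(self, db):
        self.db = db
        self.read_set = {}   # key -> transaction ID observed at read time
        self.write_set = {}  # key -> new value, buffered until commit

    def read(self, key):
        if key in self.write_set:
            return self.write_set[key]
        rec = self.db.records[key]
        self.read_set.setdefault(key, rec.tid)
        return rec.value

    def write(self, key, value):
        self.write_set[key] = value

    def commit(self):
        recs = self.db.records
        held = []
        try:
            # Phase I: lock the write set in a fixed key order, then
            # snapshot the global epoch number.
            for key in sorted(self.write_set):
                rec = recs[key]
                if rec.locked:
                    return False  # simplification: abort instead of waiting
                rec.locked = True
                held.append(rec)
            epoch = self.db.global_epoch

            # Phase II: validate the read set.
            for key, seen_tid in self.read_set.items():
                rec = recs[key]
                locked_by_other = rec.locked and key not in self.write_set
                if rec.tid != seen_tid or locked_by_other:
                    return False

            # Phase III: pick a transaction ID in the snapshotted epoch that
            # is larger than every ID seen, then apply the buffered writes.
            touched = set(self.read_set) | set(self.write_set)
            newest = max((recs[k].tid for k in touched), default=0)
            tid = max(newest + 1, epoch << EPOCH_SHIFT)
            for key, value in self.write_set.items():
                recs[key].value = value
                recs[key].tid = tid
            return True
        finally:
            # Locks are released on both the commit and the abort path.
            for rec in held:
                rec.locked = False
```

In this sketch, a transaction that read a record before another transaction overwrote it fails validation in Phase II, because the record's transaction ID no longer matches the one stored in the read set. Reads never touch the record's lock or version, which mirrors the goal of keeping reads free of shared-memory writes.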
A Silo table is a collection of index trees: one primary index tree and several secondary index trees. Each index tree is a Masstree. Masstree reads do not write to shared memory; instead, they rely on version numbers and fence-based synchronization to detect concurrent updates. Masstree also borrows ideas from tries, so its key comparisons are cheaper than those of data structures such as a B-tree.
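The idea of reading without writing to shared memory can be illustrated with a version-validated read. The sketch below is a seqlock-style simplification and not Masstree's actual node layout or algorithm: `Node`, `read_begin`, `read_validate`, and `optimistic_read` are hypothetical names, an odd version is assumed to mean "a writer is mid-update", and Python has no memory fences, so the ordering that real fences would enforce is only implied.

```python
from typing import Optional


class Node:
    def __init__(self, value):
        self.version = 0  # even: stable, odd: a writer is mid-update
        self.value = value


def write(node, value):
    node.version += 1  # mark the node dirty before changing it
    node.value = value
    node.version += 1  # mark the node stable again


def read_begin(node) -> Optional[int]:
    """Return the current version, or None if a writer is mid-update."""
    seen = node.version
    return None if seen % 2 == 1 else seen


def read_validate(node, seen: int) -> bool:
    """True if no writer touched the node since read_begin returned seen."""
    return node.version == seen


def optimistic_read(node, max_attempts=3):
    # The reader only loads the version and the value; it never stores
    # anything into the node, so it causes no shared-memory writes.
    for _ in range(max_attempts):
        seen = read_begin(node)
        if seen is None:
            continue
        value = node.value
        if read_validate(node, seen):
            return value
    raise RuntimeError("node kept changing; giving up")
```

A reader that observes a changed version simply retries, so the cost of synchronization falls on writers rather than on readers.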
Silo uses background logger threads for logging. Each time a transaction commits, the local worker creates a new log record containing the table, key, and value of every record the transaction modified. Log records are stored in the worker's local buffer. When the buffer is full or the worker enters the next epoch, the buffered log records are pushed to the logger threads' per-worker queue. From the epoch information in the log records, the logger threads calculate the durable epoch: all transactions with epochs smaller than the durable epoch are durable. Logging is done at the record level, and a log record is created only after its transaction commits, so the log never contains uncommitted changes and recovery needs no undo operations.
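The buffering and durable-epoch calculation described above can be sketched as follows. This is a simplified single-threaded model, not Silo's implementation: `WorkerLog`, `Logger`, `enter_epoch`, `on_commit`, and `drain` are hypothetical names, a Python list stands in for the log file, and the sketch assumes that a worker publishes an empty batch when it enters an epoch so that the logger learns the worker has moved on.

```python
from collections import deque


class Logger:
    def __init__(self, worker_ids):
        self.queues = {w: deque() for w in worker_ids}  # per-worker queues
        self.seen_epoch = {w: 0 for w in worker_ids}    # newest epoch drained
        self.disk = []                                  # stand-in for the log file

    def drain(self):
        # Write out every queued batch and remember each worker's newest epoch.
        for worker_id, queue in self.queues.items():
            while queue:
                epoch, records = queue.popleft()
                self.disk.extend(records)
                self.seen_epoch[worker_id] = epoch

    def durable_epoch(self):
        # A worker may still be producing records in its newest epoch, so only
        # epochs strictly smaller than the slowest worker's epoch are durable.
        return min(self.seen_epoch.values())


class WorkerLog:
    def __init__(self, worker_id, logger, capacity=4):
        self.worker_id = worker_id
        self.logger = logger
        self.capacity = capacity
        self.epoch = None
        self.buffer = []

    def _publish(self):
        self.logger.queues[self.worker_id].append((self.epoch, self.buffer))
        self.buffer = []

    def enter_epoch(self, epoch):
        if self.epoch is not None:
            self._publish()  # push what is left from the old epoch
        self.epoch = epoch
        self._publish()      # empty batch: announces the new epoch

    def on_commit(self, writes):
        # writes: list of (table, key, value) for the committed transaction
        self.buffer.append((self.epoch, writes))
        if len(self.buffer) >= self.capacity:
            self._publish()  # buffer is full
```

With two workers, the durable epoch in this model only advances once both have entered the next epoch and their earlier batches have been drained, which matches the definition that every transaction in a smaller epoch is already on disk.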
Stephen Tu, Eddie Kohler
Academic, Open Source