RubatoDB is an academic database project started by Dr. Li-Yan Yuan at the University of Alberta, Canada. It is classified as a NewSQL system: it aims to provide scalability comparable to NoSQL systems while maintaining the ACID guarantees of traditional relational databases. SQL is the primary query language, with interfaces such as JDBC and ODBC. RubatoDB is implemented as a staged architecture consisting of a grid of staged modules connected through explicit queues. It implements a formula protocol for distributed concurrency control, a layer on top of Berkeley DB that provides three levels of consistency guarantees. All table partitions, files and indexes are stored as Berkeley DB files, with the transactional layer of Berkeley DB switched off.
The name Rubato derives from the Italian verb "rubare" (to steal). As a musical term, rubato denotes soft and subtle rhythmic changes in performance. This corresponds to RubatoDB's support for several types of consistency, which leaves the choice of consistency level to the application. RubatoDB was developed as part of the NewSQL movement that started in 2009. Traditional NoSQL systems were highly scalable horizontally, but they were schema-free and provided only relaxed consistency. NewSQL systems aim to achieve the availability and horizontal scalability of NoSQL systems while preserving the ACID guarantees of transactions and the functionality of a traditional relational database, such as tables and joins.
Multi-version Concurrency Control (MVCC)
Concurrency control in RubatoDB is provided through two different layers.
The BASE and BASIC models differ in which side of the CAP theorem trade-off they favour: one provides immediate availability with fast queries, the other provides consistent results at the cost of higher latency.
Each Berkeley DB node supports a write-ahead log (WAL). With a WAL, modified data does not have to be written to disk immediately; instead, transactions are made durable by appending log records to a log on disk. Berkeley DB nodes follow the traditional ARIES algorithm to persist the log on disk and to recover after a crash.
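The write-ahead principle can be illustrated with a minimal Python sketch. It is not Berkeley DB's implementation and it is far simpler than ARIES (there is no undo phase, no checkpointing and no handling of torn log records); the class `TinyWAL`, its methods and the JSON record layout are all hypothetical names chosen for this example. Updates are appended to a log file, the log is forced to disk at commit, and recovery replays only the transactions that have a commit record.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store with a write-ahead log (illustrative only)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}      # committed state, kept only in memory
        self.pending = {}   # txn id -> updates logged but not yet committed
        self._replay()
        self.log = open(log_path, "a", encoding="utf-8")

    def _apply(self, rec):
        """Apply one log record to the in-memory state."""
        if rec["op"] == "put":
            self.pending.setdefault(rec["txn"], []).append((rec["key"], rec["value"]))
        elif rec["op"] == "commit":
            for key, value in self.pending.pop(rec["txn"], []):
                self.data[key] = value

    def _replay(self):
        """Recovery: re-read the log and redo committed transactions."""
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))
        self.pending.clear()  # transactions without a commit record are dropped

    def _append(self, rec, sync=False):
        self.log.write(json.dumps(rec) + "\n")
        if sync:
            # Force the log to stable storage before acknowledging the commit.
            self.log.flush()
            os.fsync(self.log.fileno())
        self._apply(rec)

    def put(self, txn, key, value):
        self._append({"op": "put", "txn": txn, "key": key, "value": value})

    def commit(self, txn):
        self._append({"op": "commit", "txn": txn}, sync=True)

    def close(self):
        self.log.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyWAL(path)
    store.put("t1", "a", 1)
    store.commit("t1")
    store.put("t2", "b", 2)   # logged but never committed
    store.close()             # stands in for a crash before t2 commits
    return TinyWAL(path).data  # state rebuilt purely from the log
```

Calling `demo()` rebuilds the store from the log alone; only the update made by the committed transaction `t1` survives, while the uncommitted update of `t2` is discarded during replay.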
The SQL engine in RubatoDB is responsible for processing all queries. It is composed of a set of staged grid modules, each comprising a software module on a node with its own request queue. Threads pull requests one at a time from the input queue and invoke the appropriate component, such as the parser, query optimizer, query processor or update module. They then place the results on the output queue, from which the next stage consumes them. This structure supports both parallelism and pipelined execution.
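The queue-per-stage structure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not RubatoDB code: the functions `parse`, `optimize` and `execute` are hypothetical stand-ins for the real components, and each stage is given a single worker thread so that results keep their input order.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def stage(work, inbox, outbox):
    """Start one stage: pull a request, process it, push the result downstream."""
    def loop():
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)   # propagate shutdown to the next stage
                return
            outbox.put(work(item))
    worker = threading.Thread(target=loop)
    worker.start()
    return worker


# Hypothetical stand-ins for the parser, optimizer and executor components.
def parse(sql):
    return sql.strip().rstrip(";").split()


def optimize(tokens):
    return [t.upper() if t.lower() in {"select", "from"} else t for t in tokens]


def execute(plan):
    return " ".join(plan)


def run_pipeline(requests):
    q_in, q_parsed, q_planned, q_out = (queue.Queue() for _ in range(4))
    workers = [
        stage(parse, q_in, q_parsed),
        stage(optimize, q_parsed, q_planned),
        stage(execute, q_planned, q_out),
    ]
    for request in requests:
        q_in.put(request)
    q_in.put(STOP)

    results = []
    while (item := q_out.get()) is not STOP:
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

While the executor stage handles the first request, the parser stage can already work on the second one, which is the pipelining effect the paragraph refers to. Adding more worker threads to a stage would add parallelism within that stage, at the cost of no longer guaranteeing output order.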
RubatoDB employs a hybrid storage partition model that allows a table to be partitioned both horizontally and vertically, with the partitions stored separately across the network of grid nodes. All disk accesses go through Berkeley DB, since all partitions are stored as Berkeley DB files. The user can specify partitioning schemes in order to apply manual optimizations based on prior knowledge of the workload. Grid partitioning uses a tree-based schema: descendant tuples are partitioned according to the ancestor they descend from, i.e. for every row in the parent table there must be a group of rows in the descendant table.
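The combination of horizontal, vertical and tree-based partitioning can be sketched as follows. The table names (`customer`, `orders`), the column groups, the grid size and the CRC-based placement function are all assumptions made for this illustration; the source does not specify how RubatoDB maps partitions to nodes.

```python
import zlib

NUM_NODES = 4  # hypothetical grid size

# Hypothetical vertical split of a "customer" table into column groups.
COLUMN_GROUPS = {
    "profile": ("id", "name"),
    "billing": ("id", "balance"),
}


def node_for(root_key):
    """Horizontal placement: map the root ancestor's key to a grid node."""
    return zlib.crc32(str(root_key).encode()) % NUM_NODES


def split_row(row):
    """Vertical split: project the row onto each column group."""
    return {group: {c: row[c] for c in cols} for group, cols in COLUMN_GROUPS.items()}


def place(customers, orders):
    """Return {node: {partition name: [rows]}} for a parent and a child table."""
    grid = {n: {} for n in range(NUM_NODES)}
    for row in customers:
        n = node_for(row["id"])
        for group, part in split_row(row).items():
            grid[n].setdefault(f"customer.{group}", []).append(part)
    for order in orders:
        # Tree-based rule: a descendant row follows its ancestor's key,
        # not its own key, so it lands on the same node as its parent.
        n = node_for(order["customer_id"])
        grid[n].setdefault("orders", []).append(order)
    return grid
```

Because each order is placed by its `customer_id`, a join between a customer and its orders can be answered on a single node in this sketch.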
RubatoDB employs a hybrid storage model that partitions a table both horizontally and vertically and stores the partitions on different grid nodes. Each row of the table is stored on a separate node in the grid while, within a row, a range of columns is stored as
Indexed Sequential Access Method (ISAM)
IBM designed ISAM (Indexed Sequential Access Method) to support both sequential and random access to records. Sequential access is a sequential scan through the records. Random access is supported through indexes, where each index defines a different ordering of the records. The underlying Berkeley DB layer is based on an ISAM storage organization.
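The two access paths can be illustrated with a small in-memory Python sketch. It simplifies heavily: a real ISAM file uses a multi-level on-disk index with overflow areas, and nothing here reflects Berkeley DB's actual file format. The class `TinyISAM` and its method names are hypothetical.

```python
import bisect


class TinyISAM:
    """Records kept in primary-key order, plus optional secondary indexes."""

    def __init__(self, records, key):
        # Storing records in key order makes a sequential scan a plain iteration.
        self.records = sorted(records, key=key)
        self.keys = [key(r) for r in self.records]   # primary index
        self.secondary = {}

    def scan(self):
        """Sequential access in primary-key order."""
        return iter(self.records)

    def get(self, k):
        """Random access: binary search in the primary index."""
        i = bisect.bisect_left(self.keys, k)
        if i < len(self.keys) and self.keys[i] == k:
            return self.records[i]
        return None

    def add_index(self, name, key):
        """Build a secondary index: sorted (index key, record position) pairs."""
        self.secondary[name] = sorted(
            (key(r), pos) for pos, r in enumerate(self.records)
        )

    def scan_by(self, name):
        """Sequential access in the ordering defined by a secondary index."""
        return (self.records[pos] for _, pos in self.secondary[name])
```

For example, after building the file on an `id` key and calling `add_index("name", lambda r: r["name"])`, `scan()` yields the records in `id` order while `scan_by("name")` yields the same records in `name` order, without duplicating the record data.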
RubatoDB follows a Staged Event-Driven Architecture (SEDA). Individual tasks are modelled as finite state machines (FSMs), and the transitions between the states of an FSM are triggered by events. The architecture can be visualized as a network of nodes acting as staged modules connected by explicitly associated queues. SEDA breaks the execution plan into a series of stages, where each stage corresponds to a subset of the states of the FSM. Each stage is an independent entity with its own queue: it pulls tasks from its incoming queue, performs its operations and forwards the results to the respective output queues.
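The relationship between FSM states, events and stages can be made concrete with a single-threaded Python sketch. The state names, event names and stage names below are invented for this illustration and are not taken from RubatoDB; the round-robin scheduler is likewise an assumption, used here only to keep the example deterministic.

```python
from collections import deque

# Hypothetical FSM for one request: (state, event) -> next state.
TRANSITIONS = {
    ("received", "parse_ok"): "parsed",
    ("parsed", "plan_ok"): "planned",
    ("planned", "exec_ok"): "done",
}

# Each stage owns a subset of the FSM states (here, one state per stage).
STAGE_OF = {"received": "frontend", "parsed": "planner", "planned": "executor"}

# Event a stage emits once it has handled a task in the given state.
EVENT_FOR = {"received": "parse_ok", "parsed": "plan_ok", "planned": "exec_ok"}


def run(tasks):
    """Drive every task through the FSM, one stage queue at a time."""
    queues = {name: deque() for name in ("frontend", "planner", "executor")}
    trace, finished = [], []
    for task in tasks:
        queues["frontend"].append((task, "received"))

    while any(queues.values()):
        for name, q in queues.items():        # round-robin over the stages
            if not q:
                continue
            task, state = q.popleft()          # pull from the stage's own queue
            event = EVENT_FOR[state]           # the stage's work produces an event
            new_state = TRANSITIONS[(state, event)]
            trace.append((task, name, state, new_state))
            if new_state == "done":
                finished.append(task)
            else:
                # Forward the task to the queue of the stage owning the new state.
                queues[STAGE_OF[new_state]].append((task, new_state))
    return finished, trace
```

Each entry in the returned trace records which stage moved which task from one state to the next, so the path of a task through the stage network can be read off directly.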
http://webdocs.cs.ualberta.ca/~yuan/databases/rubatodb/rubatodb_dist.4.0.2.tar.gz
https://webdocs.cs.ualberta.ca/~yuan/databases/rubatodb/docs/rubatodb.html