ScyllaDB is an open-source distributed wide-column NoSQL database offering high availability, scalability and fault-tolerance, all while maintaining predictable low latencies and high throughput. ScyllaDB is compatible with both Apache Cassandra (CQL, SSTables) and Amazon DynamoDB interfaces. Written in C++, ScyllaDB uses the highly asynchronous shard-per-core, shared-nothing Seastar framework (http://seastar.io/), where each thread executes on its own CPU core, memory, and multi-queue network interface controller. Cross-core communication is carried out by explicit asynchronous, message passing.
ScyllaDB supports non-blocking checkpoints through per-node backup procedures, which include full backup/snapshots and incremental backup. Snapshots are taken by the snapshot operation provided by the nodetool utility, while the incremental backup option can be configured in the configuration file. Automatic unnecessary backup cleaning is not implemented.
Multi-version Concurrency Control (MVCC)
ScyllaDB does not support ACID transactions as in RDBMS. However, CQL has a BATCH
statement that allows multiple update statements belonging to a given partition key be applied in isolation (note that batches are not a full analogue for SQL transactions). Besides, in UPDATE
, INSERT
, and DELETE
statements, modifications belonging to the same partition key are performed atomically and in isolation. ScyllaDB implements Multi-Version Concurrency Control (MVCC) for partition mutation. Internally, versions are represented by an ordered list of states, where each state is a delta of current mutation.
Further, ScyllaDB supports compare-and-set (CAS) and strict linearizability of operations using Lightweight Transactions (LWT), which uses an underlying Paxos algorithm implementation. As such, it is "ACID" compliant so long as operations are limited to one data partition.
Column Family / Wide-Column Key/Value
ScyllaDB uses the same wide column NoSQL data model as Apache Cassandra, which represents data in a key-key-value format (like row in RDBMS). The first key is the partition key, with the second key being a clustering key used for sorting of rows within a partition. ScyllaDB organizes a collection of rows as a column family (like table in RDBMS). One or more column families are contained in a keyspace (like database in RDBMS). It is encouraged that one application should use one keyspace.
ScyllaDB supports both primary key and secondary key indexes. For primary index, ScyllaDB hashes the key and finds the corresponding partition in the consistent hashing ring; within the partition, ScyllaDB finds the row in a sorted data structure (SSTable).
For secondary indexes, ScyllaDB maintains an index table for the secondary index keys, where the value for each key is the (primary) partition keys associated with the secondary key. Whenever a secondary index is queried, ScyllaDB first retrieves the partition keys using the secondary index, then retrieves the records with those partition keys returned by the first step. ScyllaDB supports both local (per node) as well as global (per cluster) secondary indexes.
ScyllaDB does not support ACID transactions as in RDBMS. However, CQL has a BATCH
statement that allows multiple update statements belonging to a given partition key be applied in isolation (note that batches are not a full analogue for SQL transactions). Besides, in UPDATE
, INSERT
, and DELETE
statements, modifications belonging to the same partition key are performed atomically and in isolation.
As well, ScyllaDB supports CQL Light-Weight Transactions (LWT), which allow for compare-and-set (CAS) operations and strict linearizability using a Paxos consensus algorithm.
Furthermore, ScyllaDB also provides a DynamoDB-compatible interface, known as "Alternator." In Alternator, users can choose an isolation level per table.
Scylla uses a shared-nothing model. Nodes in the cluster are organized in a decentralized consistent hashing ring and data is partitioned into shards by the key across all nodes. Scylla uses a shard-per-core architecture, where each thread for a shard executes on its own CPU core, memory, and multi-queue network interface controller. Cross-core communication is carried out by explicit message passing. Scylla also uses replicas for fault-tolerance.
Scylla supports Materialized Views in version 2.0 as an experimental feature. Whenever the base table is updated, the materialized view table will be automatically updated. Materialized View tables are distributed as normal tables and scale as well as normal tables. However, there are still limitations in the current experimental release, including but not limited to lack of local locking and local batch log.
https://github.com/scylladb/scylla
ScyllaDB Inc.
2014
C#, C++, Go, Java, JavaScript, PHP, Python, Ruby, Rust