SequoiaDB is a distributed relational database with a storage layer and a computing layer.
The storage layer is a database storage engine that uses the Raft algorithm to achieve data consistency across distributed nodes.
The computing layer consists of relational database instances, each of which can be a MySQL instance, a semi-structured data access interface (for example, JSON APIs), or an unstructured data storage interface (for example, AWS S3).
Key features of SequoiaDB include distributed OLTP with availability and consistency guarantees, petabyte-level horizontal scalability, Hybrid Transactional / Analytical Processing (HTAP), and 2-region 3-data-center recovery mechanisms.
SequoiaDB implements transactions, but does not support concurrency control options. Transactions can include only record insert, delete, update, and query (read) operations. Other database operations, such as creating a new index, are not supported in transactions and will not be logged as part of the transaction.
The system implements logical logging and log replay to support data consistency across distributed replicas.
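The idea of logical logging and replay can be illustrated with a minimal sketch. It is not SequoiaDB's implementation: the LogEntry and Replica classes, the lsn field, and the operation names are hypothetical, chosen only to show how replaying the same ordered log of record-level operations brings replicas to the same state.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LogEntry:
    lsn: int                               # log sequence number (hypothetical name)
    op: str                                # "insert", "update", or "delete"
    key: str                               # record identifier
    value: Optional[Dict[str, Any]] = None


@dataclass
class Replica:
    records: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    applied_lsn: int = 0

    def replay(self, log: List[LogEntry]) -> None:
        """Apply entries in LSN order, skipping any already applied."""
        for entry in sorted(log, key=lambda e: e.lsn):
            if entry.lsn <= self.applied_lsn:
                continue  # already applied, so replaying twice is harmless
            if entry.op == "insert":
                self.records[entry.key] = dict(entry.value or {})
            elif entry.op == "update":
                self.records.setdefault(entry.key, {}).update(entry.value or {})
            elif entry.op == "delete":
                self.records.pop(entry.key, None)
            else:
                raise ValueError(f"unknown operation: {entry.op}")
            self.applied_lsn = entry.lsn


log = [
    LogEntry(1, "insert", "r1", {"name": "Ann", "age": 30}),
    LogEntry(2, "update", "r1", {"age": 31}),
    LogEntry(3, "insert", "r2", {"name": "Bob"}),
    LogEntry(4, "delete", "r2"),
]

primary, lagging = Replica(), Replica()
primary.replay(log)
lagging.replay(log[:2])  # this replica first receives only a prefix of the log
lagging.replay(log)      # catching up: entries 1 and 2 are skipped
```

Because the log describes operations on records rather than physical page changes, a replica that has fallen behind only needs the missing suffix of the log to converge.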
SequoiaDB uses a JSON-format record as its unit of data storage. Records are stored in collections, and collections are stored in "collection spaces".
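The three-level hierarchy of collection space, collection, and record can be pictured with a small in-memory model. This is only a sketch: the store dictionary, the insert helper, and the names "sample" and "employee" are hypothetical and do not reflect SequoiaDB's client API.

```python
import json
from typing import Any, Dict, List

# collection space -> collection -> list of JSON records
store: Dict[str, Dict[str, List[Dict[str, Any]]]] = {}


def insert(space: str, collection: str, record: Dict[str, Any]) -> None:
    """Add one record, creating the space and collection on first use."""
    json.dumps(record)  # raises TypeError if the record is not JSON-serializable
    store.setdefault(space, {}).setdefault(collection, []).append(record)


insert("sample", "employee", {"name": "Ann", "age": 30})
insert("sample", "employee", {"name": "Bob", "skills": ["sql", "go"]})
insert("sample", "department", {"title": "R&D"})
```

Records in the same collection need not share the same fields, which is what makes the JSON record a convenient unit for semi-structured data.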
SequoiaDB supports three levels of isolation: read uncommitted, read committed, and read stability. By default, the system is configured as read uncommitted.
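The practical difference between the first two levels can be shown with a toy model. The IsolationDemo class below is hypothetical and much simpler than a real storage engine: it keeps uncommitted writes in a per-transaction buffer and lets a reader choose whether to see them. Read stability, which additionally protects rows a transaction has already read, is not modeled.

```python
from typing import Any, Dict, Optional


class IsolationDemo:
    def __init__(self) -> None:
        self.committed: Dict[str, Any] = {}
        self.pending: Dict[str, Dict[str, Any]] = {}  # txn id -> uncommitted writes

    def write(self, txn: str, key: str, value: Any) -> None:
        self.pending.setdefault(txn, {})[key] = value

    def commit(self, txn: str) -> None:
        self.committed.update(self.pending.pop(txn, {}))

    def rollback(self, txn: str) -> None:
        self.pending.pop(txn, None)

    def read(self, key: str, isolation: str = "read uncommitted") -> Optional[Any]:
        if isolation == "read uncommitted":
            # Dirty read: uncommitted writes of other transactions are visible.
            for writes in self.pending.values():
                if key in writes:
                    return writes[key]
        elif isolation != "read committed":
            raise ValueError(f"unsupported isolation level: {isolation}")
        return self.committed.get(key)


db = IsolationDemo()
db.write("t1", "balance", 100)
dirty = db.read("balance", "read uncommitted")  # sees t1's uncommitted write
clean = db.read("balance", "read committed")    # sees nothing until t1 commits
db.rollback("t1")
after_rollback = db.read("balance", "read uncommitted")
```

Under the default level, a reader can observe a value that is later rolled back, which is the trade-off accepted in exchange for not waiting on writers.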
Indexes in SequoiaDB use conventional B-trees. Each index has a name that is unique within its data collection and a JSON object that defines the indexing criteria and direction. Indexes can be unique or non-unique. A unique index can be either nullable or non-nullable.
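To make the index definition concrete, the sketch below orders records according to a JSON-style key definition and optionally enforces uniqueness. It assumes the convention, common in JSON document stores, that 1 means ascending and -1 means descending; the function names are hypothetical, and a sorted list stands in for the B-tree.

```python
from functools import cmp_to_key
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def make_comparator(key_def: Dict[str, int]) -> Callable[[Record, Record], int]:
    """Build a comparator from a definition such as {"age": -1, "name": 1}."""
    def compare(a: Record, b: Record) -> int:
        for field_name, direction in key_def.items():
            av, bv = a[field_name], b[field_name]
            if av != bv:
                return direction if av > bv else -direction
        return 0
    return compare


def build_index(records: List[Record], key_def: Dict[str, int],
                unique: bool = False) -> List[Record]:
    """Return records in index order; reject duplicate keys if unique."""
    compare = make_comparator(key_def)
    ordered = sorted(records, key=cmp_to_key(compare))
    if unique:
        for prev, cur in zip(ordered, ordered[1:]):
            if compare(prev, cur) == 0:
                raise ValueError(f"duplicate key for unique index: {cur}")
    return ordered


employees = [
    {"name": "a", "age": 30},
    {"name": "b", "age": 25},
    {"name": "c", "age": 30},
]
by_age_desc = build_index(employees, {"age": -1, "name": 1})
```

Here the records are ordered by age descending and, within equal ages, by name ascending. Building a unique index on {"age": 1} over the same data would raise an error because two records share the age 30.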
In addition to regular indexes, SequoiaDB supports full-text searching via Elasticsearch.
If the system is deployed on distributed clusters, it can be configured with either range-based or hash-based sharding.
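The two sharding strategies differ in how a record's shard key is mapped to a node. The following sketch shows the general techniques only; the function names, the use of MD5, and the boundary values are illustrative assumptions, not SequoiaDB's actual partitioning functions.

```python
import bisect
import hashlib
from typing import List


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based: spread keys evenly, at the cost of losing key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(key: int, boundaries: List[int]) -> int:
    """Range-based: boundaries [100, 200] give shards (-inf, 100), [100, 200), [200, +inf)."""
    return bisect.bisect_right(boundaries, key)


boundaries = [100, 200]
placements = [range_shard(k, boundaries) for k in (50, 100, 150, 250)]
hashed = hash_shard("customer-42", 4)
```

Range-based sharding keeps neighboring keys on the same node, which helps range scans, while hash-based sharding tends to balance load better when keys are skewed or monotonically increasing.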
SequoiaDB supports relational, semi-structured (e.g. JSON), and unstructured (e.g. POSIX file) data models. The data model in storage is JSON.
The system is configurable to deploy on a single node or on multiple distributed nodes. In a distributed environment, coordination nodes and catalog nodes share disk space, while storage nodes are given separate disk space. The storage nodes can be configured either to share a disk or to each have its own non-shared disk.
SequoiaDB uses BSON (binary JSON) format to encode and store JSON format data.
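As an illustration of what BSON encoding looks like, the sketch below encodes a small subset of JSON values following the publicly documented BSON layout: a little-endian int32 total length, a sequence of typed elements, and a terminating zero byte. Whether SequoiaDB's stored bytes match this layout exactly is an assumption; the function names are hypothetical, and only booleans, 32-bit integers, doubles, strings, and nested documents are handled.

```python
import struct
from typing import Any, Dict


def _cstring(text: str) -> bytes:
    return text.encode("utf-8") + b"\x00"


def encode_element(name: str, value: Any) -> bytes:
    """Encode one field as type byte + field name + payload."""
    if isinstance(value, bool):  # must be checked before int: bool is an int subclass
        return b"\x08" + _cstring(name) + (b"\x01" if value else b"\x00")
    if isinstance(value, int):   # 32-bit only; struct raises an error if out of range
        return b"\x10" + _cstring(name) + struct.pack("<i", value)
    if isinstance(value, float):
        return b"\x01" + _cstring(name) + struct.pack("<d", value)
    if isinstance(value, str):
        data = _cstring(value)   # the length prefix counts the trailing zero byte
        return b"\x02" + _cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, dict):
        return b"\x03" + _cstring(name) + encode_document(value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc: Dict[str, Any]) -> bytes:
    """Encode a document as int32 total size + elements + zero terminator."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


encoded = encode_document({"hello": "world"})
```

The length prefixes are what let a reader skip over fields and nested documents without parsing them, which is one reason a binary encoding is preferred over JSON text for storage.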
Parallel query execution is supported and is configured on a per-query basis.