SequoiaDB is a distributed relational database with a storage layer and a computing layer.
The storage layer is a database storage engine that uses the Raft algorithm to achieve data consistency across distributed nodes.
The computing layer consists of relational database instances, which can be a MySQL instance, a semi-structured data access interface via, for example, JSON APIs, or an unstructured data storage interface with, for example, AWS S3.
Key features of SequoiaDB include distributed OLTP with availability and consistency guarantees, petabyte-level horizontal scalability, Hybrid Transactional / Analytical Processing (HTAP), and 2-region 3-data-center recovery mechanisms.
The system implements logical logging and log replay to support data consistency across distributed replicas.
SequoiaDB uses a JSON format record as a unit of data storage. These records are stored in collections. Collections are stored in "collection spaces".
SequoiaDB supports three levels of isolation - read uncommitted, read committed, and read stability. By default, the system is configured as read uncommitted.
Indexes in SequoiaDB use conventional B-trees. An index has a unique name for the index on the data collection and a JSON object that defines the indexing criteria and direction. Indexes can be unique or non-unique. If an index is unique, it can be null-able or non-null-able.
In addition to regular indexes, SequoiaDB supports full-text searching via Elasticsearch.
SequoiaDB supports relational, semi-structured (e.g. JSON), and unstructured (e.g. POSIX file) data models. Data model in storage is JSON.
The system is configurable to deploy on a single node or on multiple distributed nodes. In a distributed environment, coordination nodes and catalog nodes share disk space, while storage nodes are given separate disk space. The storage nodes can be configured to share disk or each to be given nonshared disk.
SequoiaDB uses BSON (binary JSON) format to encode and store JSON format data.