YugaByte DB

YugaByte DB is a transactional database management system that can scale up and down across multiple regions for planet-scale and geo-distributed applications. In terms of the [CAP theorem][cap], YugaByte DB is consistent and partition tolerant. Combining SQL and NoSQL in one platform, it supports distributed ACID transactions, auto-sharding, and auto-balancing. Besides a PostgreSQL-compatible SQL API, it provides two other APIs, one extended from Redis commands and one from the Apache Cassandra Query Language (CQL). Built on a customized version of RocksDB, YugaByte DB's storage engine, DocDB, is a log-structured merge-tree (LSM) based "key to object/document" store.

[cap]: https://en.wikipedia.org/wiki/CAP_theorem

History

YugaByte DB's first public beta release came out in November 2017. It was initially developed by the team that formerly built and ran Facebook's NoSQL platform, which supported a number of Facebook's real-time applications. They left Facebook and founded their own company, YugaByte Inc, aiming to build a database management system that unifies the data layer for such mission-critical applications. Companies with many in-house experts can offer complex DBaaS platforms that hide the internal details of the data layer from their app developers. For traditional enterprises and small startups, however, the data layer is mostly coupled to the applications. This is the gap YugaByte DB aims to fill.

Query Interface

Custom API, SQL

YugaByte DB offers the following three query APIs:

- **YCQL:** Cassandra-compatible API that supports DDL/DML statements, builtin functions, expression operators, and user-defined data types.
- **YEDIS:** Redis-compatible API that supports most Redis commands and data types.
- **YSQL (beta):** PostgreSQL-compatible API that supports DDL/DML statements, builtin functions, expression operators, and user-defined data types.

Views

Virtual Views

YugaByte DB supports non-materialized views in its YSQL API.

Concurrency Control

Multi-version Concurrency Control (MVCC), Optimistic Concurrency Control (OCC)

YugaByte DB uses MVCC for concurrency control. Although the documentation does not state it explicitly, it uses a variant of OCC to ensure atomicity. In a distributed environment, it uses Two-Phase Commit with Early Acknowledgement. When a transaction wants to modify a number of rows, it first writes a "provisional" record for each modified row into the tablet that stores the row. These records are not visible to clients unless the transaction commits. If a conflict occurs while writing these records, the transaction is aborted and restarted. Otherwise, the transaction commits and reports success to the client. After that, the "provisional" records are applied and cleaned up asynchronously.
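The provisional-record flow described above can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions, not YugaByte DB's implementation: the names `Tablet`, `run_transaction`, `apply_pending`, and the `committed_txns` status set are hypothetical, conflict detection is reduced to "another in-flight transaction already holds a provisional record for this key", and the restart after an abort is left to the caller.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Another in-flight transaction already wrote a provisional record for the key."""


@dataclass
class Tablet:
    committed: dict = field(default_factory=dict)    # key -> value, already applied
    provisional: dict = field(default_factory=dict)  # key -> (txn_id, value)

    def write_provisional(self, txn_id, key, value):
        owner = self.provisional.get(key)
        if owner is not None and owner[0] != txn_id:
            raise ConflictError(key)
        self.provisional[key] = (txn_id, value)

    def read(self, key, committed_txns):
        # A provisional record is visible only if its transaction has committed.
        record = self.provisional.get(key)
        if record is not None and record[0] in committed_txns:
            return record[1]
        return self.committed.get(key)

    def _keys_of(self, txn_id):
        return [k for k, (owner, _) in self.provisional.items() if owner == txn_id]

    def apply(self, txn_id):
        # Promote the transaction's provisional records to regular records.
        for key in self._keys_of(txn_id):
            self.committed[key] = self.provisional.pop(key)[1]

    def discard(self, txn_id):
        for key in self._keys_of(txn_id):
            del self.provisional[key]


def run_transaction(txn_id, writes, tablets, committed_txns, pending_applies):
    """writes is a list of (tablet_name, key, value). Returns True on commit."""
    touched = set()
    try:
        for tablet_name, key, value in writes:
            tablets[tablet_name].write_provisional(txn_id, key, value)
            touched.add(tablet_name)
    except ConflictError:
        for tablet_name in touched:
            tablets[tablet_name].discard(txn_id)
        return False  # aborted; the caller may restart the transaction
    committed_txns.add(txn_id)                  # commit point, acknowledged early
    pending_applies.append((txn_id, touched))   # applying is deferred
    return True


def apply_pending(tablets, pending_applies):
    """Stand-in for the asynchronous apply-and-clean-up step."""
    while pending_applies:
        txn_id, touched = pending_applies.pop(0)
        for tablet_name in touched:
            tablets[tablet_name].apply(txn_id)
```

In this model a caller gets `True` back as soon as the transaction is marked committed, while `apply_pending` can run later without changing what readers observe.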

Joins

Not Supported

Only YSQL, YugaByte DB's PostgreSQL-compatible API, supports join operations (i.e., inner, outer, left, and right joins). The other two APIs, YEDIS and YCQL, do not support joins. The documentation does not clearly state which join algorithms are used internally.

Storage Model

Custom

YugaByte DB's storage model depends on RocksDB, which uses Static Sorted Table (SST) format.

Isolation Levels

Snapshot Isolation

Currently, YugaByte DB supports only Snapshot Isolation; support for Serializable Isolation is still in development.
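As a generic illustration of what a snapshot read means under MVCC, the following sketch keeps every committed version of a key and serves a read from the newest version that is not newer than the reader's snapshot. It is an assumption-laden toy, not YugaByte DB code: `VersionedStore` and its integer logical clock are hypothetical, and the write-write conflict check that Snapshot Isolation also requires is omitted.

```python
import bisect


class VersionedStore:
    """Toy multi-version store: every commit adds a new version of each key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending by commit_ts
        self._clock = 0      # hypothetical logical clock

    def commit(self, writes):
        """Atomically publish a dict of writes under one new commit timestamp."""
        self._clock += 1
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """A snapshot is just the timestamp at which a reader starts."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        versions = self._versions.get(key, [])
        timestamps = [ts for ts, _ in versions]
        idx = bisect.bisect_right(timestamps, snapshot_ts)
        return versions[idx - 1][1] if idx else None
```

A reader that took its snapshot before a later commit keeps seeing the older value, which is the behavior Snapshot Isolation guarantees for reads.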

Storage Architecture

Disk-oriented

YugaByte DB is a disk-oriented database management system. However, because its storage engine is implemented as a log-structured merge-tree (LSM), recently written data is held in memory before being flushed to disk.
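The write path of an LSM tree can be sketched in a few lines. The toy below is a generic illustration and not RocksDB or DocDB code: `TinyLSM` is a hypothetical name, the "disk" is an in-memory list of sorted runs, and compaction and write-ahead logging are left out.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: writes land in a memtable, which is flushed as a sorted run."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}                 # in-memory buffer for recent writes
        self.memtable_limit = memtable_limit
        self.sorted_runs = []              # newest first; stand-in for on-disk files

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one immutable, sorted run."""
        if self.memtable:
            self.sorted_runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check memory first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sorted_runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Because newer runs are searched first, an updated key shadows its older versions without any in-place modification of earlier runs.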

Foreign Keys

Not Supported

YugaByte DB does not support foreign keys: none of its `CREATE TABLE` syntaxes offers a keyword for declaring foreign key constraints.

Data Model

Key/Value, Document / XML

YugaByte DB's storage engine, DocDB, is based on RocksDB. Unlike RocksDB, DocDB is a "key to object/document" store instead of a "key to value" store. Values in DocDB can be primitive types as well as object types (e.g., lists, sorted sets, and sorted maps) with arbitrary nesting.
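One common way to layer a "key to document" model on top of a plain key-value store is to flatten each nested document into one entry per leaf, keyed by the path to that leaf. The sketch below shows that idea only; the tuple-based key layout and the `flatten` helper are assumptions made for illustration, and the actual DocDB key encoding is not described here.

```python
def flatten(doc_key, value, path=()):
    """Yield (key_path, leaf_value) pairs for a nested document.

    Dicts contribute their field names and lists their indexes to the path,
    so every primitive leaf ends up under its own composite key.
    """
    if isinstance(value, dict):
        for field_name in sorted(value):
            yield from flatten(doc_key, value[field_name], path + (field_name,))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from flatten(doc_key, item, path + (index,))
    else:
        yield (doc_key,) + path, value


user = {"name": "Ann", "tags": ["a", "b"], "addr": {"city": "X"}}
pairs = list(flatten("user1", user))
# pairs == [
#     (("user1", "addr", "city"), "X"),
#     (("user1", "name"), "Ann"),
#     (("user1", "tags", 0), "a"),
#     (("user1", "tags", 1), "b"),
# ]
```

With such a layout, all entries of one document share the prefix `("user1", ...)`, so a sorted key-value store keeps them adjacent and a single nested field can be updated without rewriting the whole document.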

System Architecture

Shared-Nothing

YugaByte DB uses shared-nothing system architecture. A table will be split into multiple tablets. Depending on the replication factor, each tablet has its corresponding number of replicas (tablet peers) across different nodes.
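The relationship between tables, tablets, and tablet peers can be sketched as follows. This is an illustrative model only: hashing the row key with MD5, the modulo mapping, and the round-robin placement in `place_replicas` are assumptions chosen for simplicity, not YugaByte DB's actual partitioning or placement policy.

```python
import hashlib


def tablet_for_key(row_key, num_tablets):
    """Map a row key to a tablet index with a stable hash (assumed scheme)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_tablets


def place_replicas(num_tablets, nodes, replication_factor):
    """Assign each tablet `replication_factor` peers on distinct nodes."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for tablet in range(num_tablets):
        placement[tablet] = [
            nodes[(tablet + offset) % len(nodes)]
            for offset in range(replication_factor)
        ]
    return placement


placement = place_replicas(num_tablets=4, nodes=["n1", "n2", "n3", "n4"], replication_factor=3)
# placement[0] == ["n1", "n2", "n3"]; placement[1] == ["n2", "n3", "n4"]
```

Each node owns its tablet peers outright and shares no storage with the others, which is what makes the architecture shared-nothing.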

Compression

Dictionary Encoding

YugaByte DB's storage engine is responsible for converting every supported data format (i.e., documents, CQL rows, and Redis data) into key-value pairs and storing them in RocksDB. Data compression in YugaByte DB therefore depends on how it is done in RocksDB, which uses Dictionary Compression.
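The general idea of dictionary compression, priming the compressor with byte sequences that are expected to recur so that small values compress well, can be demonstrated with the preset-dictionary support in Python's standard `zlib` module. This is only an analogy for the technique, not RocksDB's implementation; the sample dictionary and payload are made up for the example.

```python
import zlib


def compress_with_dict(payload: bytes, dictionary: bytes) -> bytes:
    """Compress payload using a preset dictionary of expected substrings."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(payload) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress data produced by compress_with_dict with the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical sample data: the dictionary holds fragments shared by many rows.
dictionary = b'{"user_id": , "status": "active", "region": "us-west"}'
row = b'{"user_id": 42, "status": "active", "region": "us-west"}'

with_dict = compress_with_dict(row, dictionary)
without_dict = zlib.compress(row, 9)
restored = decompress_with_dict(with_dict, dictionary)
```

For a short row like this one, the dictionary-primed output is smaller than plain `zlib.compress`, because most of the row can be encoded as references into the dictionary; the same dictionary must be available at decompression time.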

Storage Organization

Log-structured

YugaByte DB's storage engine relies on RocksDB, which is implemented as a log-structured merge-tree (LSM).

Website

https://www.yugabyte.com/

Source Code

https://github.com/yugabyte/yugabyte-db

Tech Docs

https://docs.yugabyte.com/latest/

Developer

YugaByte, Inc.

Country of Origin

US

Start Year

2016

Project Type

Commercial, Open Source

Written in

C++

Supported languages

C, C#, C++, Go, Java, JavaScript, Python

Derived From

RocksDB

Inspired By

Cloud Spanner

Compatible With

Cassandra, PostgreSQL, Redis

Operating Systems

Linux, OS X

Licenses

Apache v2