YugaByte DB

YugaByte DB is a transactional database management system that can scale up and down across multiple regions for planet-scale and geo-distributed applications. Offering both SQL and NoSQL in one platform, it supports distributed ACID transactions, auto-sharding, and auto-balancing. It provides APIs that are compatible with, and extend, Redis commands and the Apache Cassandra Query Language (CQL), as well as a SQL API that is compatible with PostgreSQL. YugaByte DB's storage engine, DocDB, is built on a customized version of RocksDB and is a log-structured merge-tree (LSM) based "key to object/document" store.

History

YugaByte DB's first public beta release came out in November 2017. It was developed by the team that formerly built and ran Facebook's NoSQL platform, which supported a number of real-time applications. They left Facebook and founded their own company, YugaByte Inc, aiming to build a database management system that unifies the data layer for mission-critical applications. Companies like Amazon, Netflix, and Uber have enough in-house expertise to offer complex DBaaS platforms that benefit their app developers. For traditional enterprises and small startups, however, the data layer is still an unsolved problem. This is where YugaByte DB comes to the rescue.

Query Interface

Custom API SQL

YugaByte DB offers the following three query APIs:

- **YCQL:** Cassandra-compatible API that supports DDL/DML statements, built-in functions, expression operators, and user-defined data types.
- **YEDIS:** Redis-compatible API that supports data types including string, hash, set, sorted set, list, and time series (new in YugaByte DB).
- **YSQL (beta):** PostgreSQL-compatible API that supports DDL/DML statements, built-in functions, expression operators, and user-defined data types.

Concurrency Control

Multi-version Concurrency Control (MVCC) Optimistic Concurrency Control (OCC)

YugaByte DB uses MVCC for concurrency control. Although the documentation does not state this explicitly, it uses a variant of OCC to ensure atomicity. In a distributed environment, it uses two-phase commit with early acknowledgement. When a transaction wants to modify a number of rows, it first writes a "provisional" record for each modified row to the tablet that stores that row. These records are not visible to clients unless the transaction commits. If a conflict occurs while writing these records, the transaction is aborted and restarted. Restarts are retried a certain number of times and are transparent to the client. If no conflict occurs, the transaction commits and the client is notified of success. After that, the "provisional" records are applied and cleaned up asynchronously.
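The provisional-record flow described above can be illustrated with a minimal, single-process sketch. The `Tablet` class, the `Conflict` exception, and `run_transaction` are hypothetical names invented for this example and are not part of YugaByte DB. The conflict rule (a row that already holds another transaction's provisional record rejects the write) is also a simplifying assumption, and the retry loop and asynchronous cleanup are left out.

```python
class Conflict(Exception):
    """Raised when a row already holds another transaction's provisional record."""


class Tablet:
    """Hypothetical in-memory stand-in for a tablet; not YugaByte DB's API."""

    def __init__(self):
        self.committed = {}    # row key -> committed value, visible to readers
        self.provisional = {}  # row key -> (txn_id, value), hidden from readers

    def write_provisional(self, txn_id, key, value):
        owner = self.provisional.get(key)
        if owner is not None and owner[0] != txn_id:
            raise Conflict(key)
        self.provisional[key] = (txn_id, value)

    def read(self, key):
        # Readers only ever see committed data.
        return self.committed.get(key)

    def apply(self, txn_id):
        # Promote this transaction's provisional records to committed rows.
        for key in [k for k, (t, _) in self.provisional.items() if t == txn_id]:
            self.committed[key] = self.provisional.pop(key)[1]

    def discard(self, txn_id):
        # Drop this transaction's provisional records after an abort.
        for key in [k for k, (t, _) in self.provisional.items() if t == txn_id]:
            del self.provisional[key]


def run_transaction(txn_id, writes, tablets):
    """writes is a list of (tablet_name, key, value); returns True on commit."""
    touched = set()
    try:
        for name, key, value in writes:
            touched.add(name)
            tablets[name].write_provisional(txn_id, key, value)
    except Conflict:
        for name in touched:
            tablets[name].discard(txn_id)
        return False
    # Commit point: a real system could acknowledge the client here and
    # run the apply step below asynchronously.
    for name in touched:
        tablets[name].apply(txn_id)
    return True
```

In this sketch a caller would wrap `run_transaction` in a bounded retry loop to mimic the transparent restarts mentioned above.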

Joins

Not Supported

Only YugaByte DB's PostgreSQL-compatible API, YSQL (beta), supports join operations (i.e., inner, outer, left, and right join). The other two APIs, YEDIS and YCQL, do not support joins. The documentation does not clearly state which join algorithms are used internally.

Storage Model

Custom

YugaByte DB's storage model depends on a customized version of RocksDB, including its SST file format.

Isolation Levels

Snapshot Isolation

YugaByte DB currently only supports Snapshot Isolation and is still working on supporting Serializable Isolation.
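As a rough illustration of what Snapshot Isolation means for readers, the sketch below keeps several versions per key and serves each read from the newest version committed at or before the reader's snapshot. `VersionedStore` and its integer clock are hypothetical and chosen for brevity; how YugaByte DB assigns timestamps and detects write conflicts is not covered here.

```python
class VersionedStore:
    """Hypothetical multi-version store; timestamps are plain integers."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value) in commit order
        self.clock = 0

    def commit(self, writes):
        # Every commit gets the next timestamp and appends new versions.
        self.clock += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        # A snapshot is just the timestamp of the latest commit so far.
        return self.clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None
```

A reader that took its snapshot before a later commit keeps seeing the older value, which is the defining property of a snapshot read.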

Storage Architecture

Disk-oriented

YugaByte DB is a disk-based database management system. However, because its storage engine is implemented as a log-structured merge-tree (LSM), some of the data resides in memory before being flushed to disk.
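The memory-then-disk behaviour of an LSM tree can be sketched with a toy structure. `TinyLSM` is a hypothetical class written for this example, its sorted in-memory lists merely stand in for on-disk files, and the flush threshold of two entries is arbitrary. It does not reflect RocksDB's or DocDB's actual implementation, which also involves write-ahead logging and compaction.

```python
import bisect


class TinyLSM:
    """Hypothetical toy LSM tree; not RocksDB's or DocDB's API."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}  # recent writes, held in memory
        self.memtable_limit = memtable_limit
        self.runs = []      # sorted runs, newest last; stand-ins for disk files

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one immutable sorted run, then reset it.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check memory first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Because newer runs are searched first, an overwritten key resolves to its latest value even though older versions remain in earlier runs.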

Foreign Keys

Not Supported

YugaByte DB does not support foreign keys: none of its `CREATE TABLE` commands offers a keyword for declaring a foreign key constraint.

Data Model

Key/Value Document / XML

YugaByte DB's storage engine, DocDB, is based on RocksDB. Unlike RocksDB, DocDB is a "key to object/document" store instead of a "key to value" store. Values in DocDB can be primitive types as well as object types (e.g., lists, sorted sets, and sorted maps) with arbitrary nesting.
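One way to picture a "key to object/document" store on top of a key-value engine is to flatten each nested document into pairs whose keys carry the path to a primitive value. The `flatten` function and its tuple-based key layout are assumptions made for illustration only; DocDB's real key encoding is not described in this article.

```python
def flatten(doc_key, value, path=()):
    """Yield ((doc_key, *path), primitive) pairs for a nested document."""
    if isinstance(value, dict):
        # Visit map entries in sorted key order for a stable layout.
        for sub_key in sorted(value):
            yield from flatten(doc_key, value[sub_key], path + (sub_key,))
    elif isinstance(value, list):
        # List elements are addressed by their index.
        for index, item in enumerate(value):
            yield from flatten(doc_key, item, path + (index,))
    else:
        # Primitive value: emit one key-value pair.
        yield (doc_key, *path), value
```

For example, `list(flatten("user1", {"name": "Ada", "tags": ["x", "y"]}))` produces one pair for `name` and one pair per list element, each keyed by the document key plus its path.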

System Architecture

Shared-Nothing

YugaByte DB uses a shared-nothing system architecture. Each table is split into multiple tablets. Each tablet has as many replicas (tablet peers) as the replication factor specifies, placed on different nodes.
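The relationship between tablets, replication factor, and nodes can be sketched as follows. Both helper functions are hypothetical: mapping a row key to a tablet with an MD5-based hash and placing tablet peers round-robin across nodes are assumptions made for this example, not a description of YugaByte DB's actual partitioning or placement policy.

```python
import hashlib


def tablet_for(key, num_tablets):
    """Map a row key to a tablet index (assumed hash partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big") % num_tablets


def place_replicas(num_tablets, nodes, replication_factor):
    """Assign each tablet's peers to distinct nodes (assumed round-robin)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for tablet in range(num_tablets):
        placement[tablet] = [
            nodes[(tablet + offset) % len(nodes)]
            for offset in range(replication_factor)
        ]
    return placement
```

With four tablets, three nodes, and a replication factor of three, every tablet ends up with one peer on each node, so no two peers of the same tablet share a node.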

Compression

Dictionary Encoding

YugaByte DB's storage engine (DocDB) relies on RocksDB, so DocDB is responsible for converting every supported data format (i.e., documents, CQL rows, and Redis data) to key-value pairs and storing them in RocksDB. How data compression is performed in YugaByte DB therefore depends on how it is done in RocksDB, which uses dictionary compression.
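The general idea of dictionary compression can be demonstrated with Python's standard `zlib` module, which accepts a preset dictionary through the `zdict` parameter. This is only a stand-in for the technique: the helper names and sample data are made up for the example, and it says nothing about which compression library or dictionary settings RocksDB or YugaByte DB use.

```python
import zlib


def compress_with_dict(data, dictionary):
    """Compress data using a preset dictionary of expected byte sequences."""
    compressor = zlib.compressobj(zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob, dictionary):
    """Decompress data; the same dictionary must be supplied again."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()
```

The dictionary pays off most on small values that share structure, such as many short rows with the same field names, because each value can refer back to the dictionary instead of repeating those bytes.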

Storage Organization

Log-structured

YugaByte DB's storage engine relies on RocksDB, which is implemented as a log-structured merge-tree (LSM).

Website

https://www.yugabyte.com/

Source Code

https://github.com/yugabyte/yugabyte-db

Tech Docs

https://docs.yugabyte.com/latest/

Developer

YugaByte, Inc.

Country of Origin

US

Start Year

2016

Project Type

Commercial, Open Source

Written in

C++

Supported languages

C, C#, C++, Go, Java, JavaScript, Python

Derived From

RocksDB

Inspired By

Cloud Spanner

Compatible With

Cassandra, PostgreSQL, Redis

Operating Systems

Linux, OS X

Licenses

Apache v2