Hypertable

Hypertable is an open-source database modeled after Bigtable, Google's massively scalable database. It is used for real-time applications at Baidu. Hypertable runs on top of a distributed file system and supports HDFS, MapR, Ceph, KFS, and the local filesystem. It is written in C++.

Query Interface

Custom API, Command-line / Shell

Hypertable provides the Hypertable Query Language (HQL) to create, modify, and query tables. HQL can also be used to invoke administrative commands. HQL can be interpreted by the Hypertable command-line interface (ht shell), the Thrift API methods, and the Hypertable::HqlInterpreter C++ class.

Concurrency Control

Multi-version Concurrency Control (MVCC)

Hypertable uses Multi-Version Concurrency Control (MVCC). It uses auto-assigned timestamps as revision numbers.
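The idea of using auto-assigned revision numbers for multi-version reads can be sketched as follows. This is a minimal illustration in Python, not Hypertable's implementation: the `VersionedCell` class, its methods, and the use of a simple counter in place of a real timestamp source are all assumptions made for the example.

```python
import itertools
from bisect import bisect_right


class VersionedCell:
    """Hypothetical multi-version cell: every write adds a new revision."""

    def __init__(self, clock):
        self._clock = clock      # shared source of revision numbers
        self._revisions = []     # revision numbers in ascending order
        self._values = []        # _values[i] was written at _revisions[i]

    def write(self, value):
        # The revision is assigned by the system, never chosen by the caller.
        revision = next(self._clock)
        self._revisions.append(revision)
        self._values.append(value)
        return revision

    def read(self, as_of=None):
        """Return the newest value whose revision is <= as_of."""
        if as_of is None:
            return self._values[-1] if self._values else None
        idx = bisect_right(self._revisions, as_of)
        return self._values[idx - 1] if idx else None


clock = itertools.count(1)   # stand-in for a timestamp source (assumption)
cell = VersionedCell(clock)
r1 = cell.write("v1")
snapshot = r1                # a reader pins its view at this revision
r2 = cell.write("v2")        # a later write leaves that view untouched
```

A reader that calls `cell.read(as_of=snapshot)` keeps seeing the value written at or before its pinned revision, while `cell.read()` returns the latest write. Old revisions are never overwritten, which is what lets readers and writers proceed without blocking each other.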

Joins

Not Supported

Hypertable does not support joins.

Storage Model

Custom

Isolation Levels

Snapshot Isolation

Hypertable provides snapshot isolation for queries by means of 8-byte timestamps.

Stored Procedures

Not Supported

Storage Architecture

Disk-oriented

Hypertable is able to run on top of any filesystem. A File System (FS) broker process handles all filesystem requests. FS brokers currently support HDFS, MapR, Ceph, KFS, and local (for running on top of a local filesystem).
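The broker pattern described here can be sketched as a normalized interface with one backend per filesystem. The example below is only an illustration in Python: the `FsBroker` and `LocalBroker` classes, their method names, and the path used are hypothetical and do not reflect Hypertable's actual broker protocol.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class FsBroker(ABC):
    """Hypothetical normalized filesystem interface."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        ...

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class LocalBroker(FsBroker):
    """Translates normalized paths into paths under a local root directory."""

    def __init__(self, root):
        self._root = root

    def _native(self, path):
        # Map a normalized "/a/b" path onto the local directory tree.
        return os.path.join(self._root, path.lstrip("/"))

    def write(self, path, data):
        native = self._native(path)
        os.makedirs(os.path.dirname(native), exist_ok=True)
        with open(native, "wb") as f:
            f.write(data)

    def read(self, path):
        with open(self._native(path), "rb") as f:
            return f.read()


def store_and_load(broker: FsBroker, path, data):
    """Caller code depends only on the normalized interface."""
    broker.write(path, data)
    return broker.read(path)


root = tempfile.mkdtemp()
result = store_and_load(LocalBroker(root), "/tables/demo/cellstore0", b"payload")
```

Because `store_and_load` only sees `FsBroker`, supporting another filesystem in this sketch would mean adding another subclass, with no change to the calling code.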

Checkpoints

Consistent

Hypertable creates backups by outputting table data in random order. Restoring from a backup always brings Hypertable back to a consistent and operational state as of the checkpoint.

Foreign Keys

Not Supported

Data Model

Column Family

Hypertable groups sets of related columns into column families.
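One way to picture a column-family layout is as a map from cell keys to values. The sketch below assumes a Bigtable-style cell key made of row, column family, column qualifier, and timestamp; the `CellKey` type, the `scan_family` helper, and the sample data are illustrative only and are not part of Hypertable's API.

```python
from typing import NamedTuple


class CellKey(NamedTuple):
    """Hypothetical cell key in a Bigtable-style data model."""
    row: str
    family: str
    qualifier: str
    timestamp: int


def scan_family(table, row, family):
    """Return the cells of one column family in one row, sorted by key."""
    return sorted(
        (key, value)
        for key, value in table.items()
        if key.row == row and key.family == family
    )


# Made-up sample data: two column families, "profile" and "activity".
table = {
    CellKey("user42", "profile", "name", 1): "Ada",
    CellKey("user42", "profile", "email", 2): "ada@example.com",
    CellKey("user42", "activity", "last_login", 3): "2024-01-01",
    CellKey("user7", "profile", "name", 4): "Bob",
}
profile = scan_family(table, "user42", "profile")
```

Here `profile` holds only the `email` and `name` cells of row `user42`, which shows how a family lets related columns be read together without touching the rest of the row.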

System Architecture

Shared-Disk

The diagram below provides a high-level overview of the Hypertable system followed by a brief description of each system component.

System Architecture of Hypertable

Hyperspace - This is Hypertable's equivalent to Google's Chubby service. Hyperspace is a lock manager and provides a filesystem for storing small amounts of metadata.

Master - The master handles all meta operations such as creating and deleting tables. The master is also responsible for detecting range server failures and re-assigning ranges if necessary.

Range Server - Range servers are responsible for managing ranges of table data, handling all reading and writing of data.

FS Broker - Hypertable is capable of running on top of any filesystem. To achieve this, the system abstracts the interface to the filesystem by sending all filesystem requests through a File System (FS) broker process. The FS broker provides a normalized filesystem interface and translates normalized filesystem requests into native filesystem requests and vice versa. FS brokers have been developed for HDFS, MapR, Ceph, KFS, and local (for running on top of a local filesystem).

ThriftBroker - Provides an interface for applications written in any high-level language to communicate with Hypertable. The ThriftBroker is implemented with Apache Thrift and provides bindings for applications written in Java, PHP, Ruby, Python, Perl, and C++.
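The division of labor between the master and the range servers can be sketched as a lookup table from row-key ranges to servers, plus a reassignment step when a server fails. This is a simplified illustration in Python under stated assumptions: the `RangeMap` class, the end-row boundaries, and the server names are hypothetical, and the real system tracks this information in its own metadata rather than in a structure like this.

```python
from bisect import bisect_left


class RangeMap:
    """Hypothetical map from row-key ranges to range servers."""

    def __init__(self, end_rows, servers):
        # end_rows[i] is the inclusive last row of range i;
        # the final range is open-ended, so servers has one more entry.
        self._end_rows = list(end_rows)
        self._servers = list(servers)

    def server_for(self, row):
        """Find the server that holds the range containing this row key."""
        return self._servers[bisect_left(self._end_rows, row)]

    def reassign(self, failed, replacement):
        """Master-style step: move every range off a failed server."""
        self._servers = [
            replacement if server == failed else server
            for server in self._servers
        ]


ranges = RangeMap(end_rows=["g", "p"], servers=["rs1", "rs2", "rs3"])
before = ranges.server_for("house")   # falls in the range ("g", "p"]
ranges.reassign(failed="rs2", replacement="rs1")
after = ranges.server_for("house")    # same range, now served elsewhere
```

In this sketch a client only needs the row key to find the responsible server, and a failure changes the mapping without changing how the ranges themselves are defined.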