RocksDB is an embedded key-value database developed by Facebook for high performance.
RocksDB was forked from Google's LevelDB and reworked to exploit many CPU cores and fast storage such as SSDs for I/O-bound workloads. It is built on a log-structured merge-tree (LSM tree), which gives it very high performance and lets it be adapted to different workloads and data needs. Beyond basic key-value operations, RocksDB supports advanced features such as merge operators and compaction filters.
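To make the log-structured merge-tree idea concrete, here is a minimal in-memory sketch in Python. It is not RocksDB code and does not use the RocksDB API: the names ToyLSM, memtable_limit, and keep are hypothetical, and real LSM trees add a write-ahead log, multiple levels, bloom filters, and background compaction threads. The sketch only shows the core pattern: writes go to a mutable memtable, full memtables are frozen into immutable sorted runs, reads check the newest data first, and compaction merges runs while dropping stale or filtered entries.

```python
import bisect

TOMBSTONE = object()  # marks a deleted key until compaction drops it


class ToyLSM:
    """Toy LSM tree: one mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # newest writes, mutable
        self.runs = []       # lists of (key, value) sorted by key, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone marker.
        self.put(key, TOMBSTONE)

    def flush(self):
        """Freeze the memtable into a new sorted run (the newest one)."""
        if self.memtable:
            run = sorted(self.memtable.items(), key=lambda kv: kv[0])
            self.runs.insert(0, run)
            self.memtable = {}

    def get(self, key):
        """Look in the memtable first, then in runs from newest to oldest."""
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in self.runs:
            # (key,) sorts just before any (key, value) pair with the same key.
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                value = run[i][1]
                return None if value is TOMBSTONE else value
        return None

    def compact(self, keep=lambda key, value: True):
        """Merge all runs into one, keeping the newest version of each key.

        `keep` stands in for a compaction filter: entries for which it
        returns False are dropped during the merge.
        """
        merged = {}
        for run in reversed(self.runs):   # oldest first, newer runs overwrite
            merged.update(run)
        survivors = [
            (k, v)
            for k, v in sorted(merged.items(), key=lambda kv: kv[0])
            if v is not TOMBSTONE and keep(k, v)
        ]
        self.runs = [survivors] if survivors else []
```

In this sketch every write is a cheap in-memory update followed by an occasional bulk flush, which is why the structure suits write-heavy workloads. The cost moves to reads, which may have to consult several runs, and to compaction, which keeps the number of runs bounded.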
RocksDB is written in C++. It provides APIs for C++, C, and Java, and third-party bindings exist for many other languages, including Python and PHP. RocksDB is used in production at several large companies, including Facebook, Yahoo!, and LinkedIn.
In an attempt to extend HDFS's success from data analysis to query-serving workloads, which require low latency, Dhruba Borthakur enhanced HBase until its latencies were only about twice those of a MySQL server. When flash storage became available, it became clear that a new storage engine was needed to serve random workloads efficiently.
He began looking for new techniques to build a next-generation key-value store, especially for serving data that resides on fast storage. With flash storage, accessing data over the network carried about 50% more overhead than accessing it locally, which meant that a database embedded within an application could achieve much lower latency than an application that accesses data across the network.
At that time, there were several existing embedded databases: BerkeleyDB, SQLite3, and LevelDB. LevelDB was the fastest according to open-source benchmarks and was written in C++, so it became the first choice for his benchmarking.
He soon found that LevelDB was not suitable for query-serving workloads. LevelDB only works well when the database is smaller than RAM. Additionally, LevelDB's single-threaded compaction process was insufficient to drive server workloads, and it could not consume all the I/O capacity offered by the underlying flash storage.
Finally, he decided that the best path was to fork the LevelDB code and change its architecture to produce a database that can drive fast storage hardware. Thus RocksDB was born.
Optimistic Concurrency Control (OCC)
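RocksDB supports two kinds of transactions, provided by TransactionDB and OptimisticTransactionDB. Transactions have BEGIN, COMMIT, and ROLLBACK APIs, allowing applications to modify data concurrently while RocksDB checks for conflicts. RocksDB thus supports both pessimistic concurrency control (TransactionDB, which locks keys as they are written) and optimistic concurrency control (OptimisticTransactionDB, which detects conflicts at commit time).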
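The optimistic approach can be illustrated with a small, generic sketch in Python. It is not the RocksDB API and does not reproduce RocksDB's exact conflict rules: ToyStore, OptimisticTxn, and ConflictError are hypothetical names, the store is single-threaded, and conflicts are detected by comparing per-key version counters for every key the transaction read. The point is only to show the begin, commit, and rollback life cycle, with no locks taken and validation deferred to commit.

```python
class ConflictError(Exception):
    """Raised at commit when a key read by the transaction has since changed."""


class ToyStore:
    """In-memory key-value store with a version counter per key."""

    def __init__(self):
        self.data = {}      # key -> committed value
        self.version = {}   # key -> number of committed writes to that key

    def begin(self):
        return OptimisticTxn(self)


class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}   # key -> version observed on first read
        self.writes = {}          # buffered writes, invisible until commit

    def get(self, key):
        if key in self.writes:    # read-your-own-writes
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def put(self, key, value):
        self.writes[key] = value  # no lock is taken here

    def commit(self):
        # Validation phase: abort if any key we read was rewritten meanwhile.
        for key, seen in self.read_versions.items():
            if self.store.version.get(key, 0) != seen:
                self.rollback()
                raise ConflictError(key)
        # Write phase: publish buffered writes and bump their versions.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        self.writes = {}
        self.read_versions = {}

    def rollback(self):
        # Nothing was published, so discarding the buffers is enough.
        self.writes = {}
        self.read_versions = {}
```

If two transactions both read the same key and then write it, the first to commit succeeds and the second raises ConflictError and must be retried by the application. A pessimistic design would instead make the second transaction wait for, or fail to acquire, a lock at write time. In a multi-threaded setting the validation and write phases would also need to run atomically, which this sketch leaves out.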
Disk-oriented In-Memory Hybrid
RocksDB is designed to take full advantage of the fast access speeds of flash storage. However, it also supports purely in-memory storage.
https://github.com/facebook/rocksdb/
https://github.com/facebook/rocksdb/wiki
2012
C, C++, Go, Java, Perl, PHP, Python, Ruby