FoundationDB

FoundationDB is a distributed non-relational database that supports ACID transactions and OLTP workloads. It decouples its data storage technology from its data model: all data is stored in an ordered key-value structure, which user-written layers can remap to custom data models or indexes. FoundationDB has no separate query language; it exposes only an API for accessing data.

FoundationDB is known for rigorous, thorough testing of its fault tolerance. Its developers built their own [deterministic testing](https://www.youtube.com/watch?v=4fFDFbi3toc) framework while developing the system to verify that the implementation behaves correctly. The simulation models real-life scenarios, such as transactions executing during network failures, database configuration changes, and operator mistakes. Jepsen reportedly did not test FoundationDB because its own simulation testing was already so rigorous.
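The core idea of deterministic simulation can be illustrated with a toy sketch. This is not FoundationDB's simulator: the `simulate` function, the two-copy counter, and the `drop_rate` parameter are all hypothetical. The point it shows is that when every source of nondeterminism (here, simulated message loss) is drawn from one seeded generator, any run can be replayed exactly from its seed.

```python
import random


def simulate(seed, steps=50, drop_rate=0.3):
    """Run a toy primary/replica counter under simulated message loss.

    All randomness comes from a single seeded generator, so the
    returned event trace is fully determined by `seed`.
    """
    rng = random.Random(seed)
    primary = 0
    replica = 0
    trace = []
    for step in range(steps):
        primary += 1  # the primary accepts one write per step
        if rng.random() < drop_rate:
            # Simulated network failure: the replica misses this update.
            trace.append((step, "drop", primary, replica))
        else:
            # Delivery carries the primary's full state, so the replica catches up.
            replica = primary
            trace.append((step, "deliver", primary, replica))
        # Invariant checked on every step: the replica never runs ahead.
        assert replica <= primary
    return trace


if __name__ == "__main__":
    # The same seed always yields the same sequence of faults.
    print(simulate(42) == simulate(42))
```

If an invariant fails for some seed, rerunning with that seed reproduces the same fault schedule, which is what makes failures found this way debuggable.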

History

FoundationDB was built for high-load, high-performance transaction processing with strong (ACID) guarantees.

FoundationDB was founded in 2009 by three co-founders: Dave Rosenthal, Dave Scherer, and Nick Lavezzo. The founders previously worked at Visual Sciences, an analytics company that is now part of Adobe. The company acquired Akiban in 2013, and FoundationDB itself was acquired by Apple in 2015. In 2018, Apple open-sourced FoundationDB under the Apache License 2.0.

Isolation Levels

Serializable

FoundationDB uses optimistic concurrency control to provide the serializable isolation level. This is possible because every modification to the key-value store is made through a transaction.

System Architecture

Shared-Nothing

FoundationDB uses a shared-nothing architecture. Written data is partitioned by key range, and the pieces are distributed across different nodes.

FoundationDB has several components to handle scalability:
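Range-based distribution of keys can be sketched as a lookup in a sorted list of range boundaries. The boundaries, server names, and the `server_for` helper below are hypothetical and only illustrate the routing idea, not FoundationDB's actual data distribution algorithm.

```python
import bisect

# Hypothetical shard map: BOUNDARIES[i] is the first key owned by SERVERS[i].
BOUNDARIES = [b"", b"g", b"n", b"t"]
SERVERS = ["storage-1", "storage-2", "storage-3", "storage-4"]


def server_for(key: bytes) -> str:
    """Return the server whose key range contains `key`."""
    # bisect_right finds the first boundary greater than the key;
    # the range that owns the key starts one position earlier.
    index = bisect.bisect_right(BOUNDARIES, key) - 1
    return SERVERS[index]


if __name__ == "__main__":
    for key in (b"apple", b"grape", b"zebra"):
        print(key, "->", server_for(key))
```

Because keys are ordered, neighbouring keys usually land on the same server, so a range read touches few nodes.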

  1. Coordinators: Store a small amount of cluster state for fault tolerance. Coordinators are not involved in transactions.

  2. Cluster Controller: The entry point for all processes in the cluster. It is elected by the coordinators.

  3. Master: Coordinates proxies, transaction logs, and resolvers. The master also runs the data distribution algorithm and the ratekeeper.

  4. Proxies: Track storage servers, provide read versions, and commit transactions.

  5. Transaction Logs: Receive commits from the proxy, write and fsync the data to append-only logs on disk, and then respond to the proxy. Once storage servers have durably applied the data, they pop it from the log.

  6. Resolvers: Hold the last 5 seconds of committed transactions in order to detect conflicts. Resolvers make sure that each transaction's reads are still valid according to MVCC.

  7. Storage Servers: Store data for their assigned key ranges. Storage servers keep the most recent data (less than 5 seconds old) in memory; the rest of the data is on disk.

Committing a transaction is a sequence of steps carried out by different components:

  - Master: Provides a commit version to the proxies.
  - Resolvers: Check whether the current transaction conflicts with previously committed transactions.
  - Proxies: Send valid commits to the transaction logs and wait until the transaction logs have logged the transaction.
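The commit steps can be sketched as a single-process toy model. The `Resolver` and `Cluster` classes below are hypothetical stand-ins for the roles described above; they ignore the 5-second window, batching, and distribution across machines.

```python
class Resolver:
    """Remembers, for each key, the version of its latest committed write."""

    def __init__(self):
        self.last_write = {}

    def has_conflict(self, read_version, read_keys):
        # A read is stale if the key was rewritten after the reader's snapshot.
        return any(self.last_write.get(k, 0) > read_version for k in read_keys)

    def record(self, commit_version, write_keys):
        for k in write_keys:
            self.last_write[k] = commit_version


class Cluster:
    """Toy model: one version counter, one resolver, one in-memory log."""

    def __init__(self):
        self.version = 0        # version counter (master role)
        self.resolver = Resolver()
        self.log = []           # stands in for the transaction logs

    def read_version(self):
        return self.version

    def commit(self, read_version, read_keys, writes):
        # Step 1 (master): obtain the next commit version.
        self.version += 1
        commit_version = self.version
        # Step 2 (resolver): reject the transaction if its reads are stale.
        if self.resolver.has_conflict(read_version, read_keys):
            return None
        self.resolver.record(commit_version, writes.keys())
        # Step 3 (proxy): append the valid commit to the log before replying.
        self.log.append((commit_version, dict(writes)))
        return commit_version


if __name__ == "__main__":
    cluster = Cluster()
    rv_a = cluster.read_version()
    rv_b = cluster.read_version()
    print(cluster.commit(rv_a, {b"x"}, {b"x": b"1"}))  # first writer succeeds
    print(cluster.commit(rv_b, {b"x"}, {b"x": b"2"}))  # second writer read a stale x
```

Two transactions that read the same key at the same snapshot cannot both commit a write to it: the second one is rejected and nothing from it reaches the log.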

Foreign Keys

Not Supported

Data Model

Key/Value

FoundationDB exposes a single data model: an ordered key-value store. Both keys and values are byte strings. To support a richer data model or an index, users can write their own layer that remaps the key-value model.
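As a rough illustration of what a layer does, the sketch below maps simple records onto an ordered byte-string store. Everything here is hypothetical: `OrderedKV` is an in-memory stand-in rather than a FoundationDB client, and `pack` is a simplistic key encoding that assumes key parts never contain a NUL character.

```python
import bisect


class OrderedKV:
    """In-memory stand-in for an ordered key-value store (bytes to bytes)."""

    def __init__(self):
        self._keys = []   # kept sorted so range reads are cheap
        self._data = {}

    def set(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get_range(self, begin: bytes, end: bytes):
        """Return (key, value) pairs with begin <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]


def pack(*parts: str) -> bytes:
    """Hypothetical encoding: UTF-8 parts, each terminated by a 0x00 byte."""
    return b"".join(p.encode("utf-8") + b"\x00" for p in parts)


def put_record(kv, table, primary_key, record):
    # One key per field: (table, primary key, field name) -> field value.
    for field, value in record.items():
        kv.set(pack(table, primary_key, field), value.encode("utf-8"))


def get_record(kv, table, primary_key):
    # All fields of a record share a key prefix, so one range read fetches them.
    prefix = pack(table, primary_key)
    rows = kv.get_range(prefix, prefix + b"\xff")
    return {k[len(prefix):-1].decode("utf-8"): v.decode("utf-8") for k, v in rows}


if __name__ == "__main__":
    kv = OrderedKV()
    put_record(kv, "user", "42", {"name": "Ada", "city": "London"})
    put_record(kv, "user", "420", {"name": "Bob"})
    print(get_record(kv, "user", "42"))
```

The layer relies only on key ordering: because the terminator byte sorts before any text byte, the record for key "42" does not pick up fields belonging to key "420".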

Storage Model

Custom

FoundationDB stores data in a Key-Value model.

Storage Architecture

Hybrid

FoundationDB has two storage options, ssd and memory. With the memory option, all data to be read must reside in memory, and all writes are also written to disk, with the number of copies determined by the redundancy mode. The default DBMS configuration is memory, and the default maximum size of data in memory is 1 GB.

Concurrency Control

Multi-version Concurrency Control (MVCC), Optimistic Concurrency Control (OCC)

FoundationDB uses Optimistic Concurrency Control (OCC) for writes and Multi-version Concurrency Control (MVCC) for reads. The DBMS maintains conflict information only for a five-second window, so it does not support long-running read/write transactions. A conflicting transaction fails at commit, and the client is responsible for retrying it.
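The client-side retry responsibility can be sketched as a loop around the transaction body. The `Store`, `Txn`, `ConflictError`, and `run_transaction` names below are hypothetical and form a toy in-memory model, not the FoundationDB client; the official Python binding offers its own retry helper, the `@fdb.transactional` decorator. For brevity the toy validates reads at commit but does not keep old versions, so it models the OCC side only.

```python
class ConflictError(Exception):
    """Raised when commit-time validation fails (hypothetical name)."""


class Store:
    """Toy store: each key maps to (value, version of the write)."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version  # snapshot version at start
        self.reads = set()
        self.writes = {}

    def get(self, key, default=0):
        self.reads.add(key)
        if key in self.writes:
            return self.writes[key]
        return self.store.data.get(key, (default, 0))[0]

    def set(self, key, value):
        self.writes[key] = value  # buffered locally until commit

    def commit(self):
        # Optimistic validation: fail if any key read has since been rewritten.
        for key in self.reads:
            if self.store.data.get(key, (None, 0))[1] > self.read_version:
                raise ConflictError(key)
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)


def run_transaction(store, body, max_retries=10):
    """Run `body` in a fresh transaction, retrying when a conflict occurs."""
    for _ in range(max_retries):
        txn = store.begin()
        result = body(txn)
        try:
            txn.commit()
            return result
        except ConflictError:
            continue  # start over with a new read version
    raise RuntimeError("transaction failed after too many conflicts")
```

Since the body may run more than once, it should be free of side effects outside the transaction, or at least safe to repeat.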

Query Interface

Custom API

The only way to model and query the data is to write a layer. FoundationDB lets users interact with the data only through its API, which is available in Python, Ruby, Java, Go, and C.

The DBMS supported a SQL layer in 2014, but that layer is no longer actively supported or maintained.