Bedrock is a networking and distributed transaction layer built on top of SQLite. It is a distributed relational database management system designed for geo-replication. It was initially built for, and is owned by, Expensify, the expense management company.
Bedrock is the system that backs Expensify. It had been in use for eight years before it was launched. It was originally created as an in-house solution to the strict database requirements of financial institutions: response times within milliseconds, transaction logging and authentication, and replication across multiple servers.
Read Uncommitted Serializable Snapshot Isolation
Bedrock inherits isolation level support from SQLite. The default behaviour is Serializable. Snapshot Isolation can be enabled by setting PRAGMA journal_mode = WAL. If PRAGMA read_uncommitted = True is set along with the setting for Snapshot Isolation, the isolation level becomes Read Uncommitted.
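The snapshot behaviour of WAL mode can be observed with plain SQLite, here through Python's built-in sqlite3 module rather than through Bedrock itself. This is a minimal sketch: the function name, the table t, and the temporary file path are illustrative assumptions, not part of Bedrock.

```python
import os
import sqlite3
import tempfile


def scalar(conn, sql):
    # Run a query and return its single value, fully consuming the cursor.
    return conn.execute(sql).fetchall()[0][0]


def snapshot_demo():
    # Hypothetical demo database in a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None disables implicit transactions, so BEGIN/COMMIT
    # below are issued explicitly.
    writer = sqlite3.connect(path, isolation_level=None)
    mode = scalar(writer, "PRAGMA journal_mode = WAL")
    writer.execute("CREATE TABLE t (v INTEGER)")
    writer.execute("INSERT INTO t VALUES (1)")

    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    # The first read inside the transaction fixes the reader's snapshot.
    before = scalar(reader, "SELECT COUNT(*) FROM t")

    # The writer commits a new row while the reader's transaction is open.
    writer.execute("INSERT INTO t VALUES (2)")

    # Still inside the same transaction: the reader sees its old snapshot.
    during = scalar(reader, "SELECT COUNT(*) FROM t")
    reader.execute("COMMIT")

    # A new read after COMMIT sees the writer's row.
    after = scalar(reader, "SELECT COUNT(*) FROM t")

    reader.close()
    writer.close()
    return mode, before, during, after
```

The reader's second count is unchanged even though the writer has committed, which is the property Snapshot Isolation refers to.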
Nested Loop Join Sort-Merge Join
Bedrock inherits join support from SQLite. SQLite uses nested loop joins, and has previously been criticised for slow join performance. SQLite supports Sort-Merge joins over unique keys.
Multi-version Concurrency Control (MVCC) Two-Phase Locking (Deadlock Prevention) Two-Phase Locking (Deadlock Detection)
Bedrock inherits concurrency control from SQLite. SQLite maintains page locks using Two-Phase Locking.
However, Bedrock has its own proprietary synchronization engine to support concurrency over multiple servers.
Bedrock's synchronization engine is a private distributed general ledger, i.e., a private blockchain. Each thread has an internal table called journal, which has three columns: id, query, and hash. Each time a query is committed to the database, a new row is inserted into the journal. The new row records the query and a new incremental hash calculated from the previous row. When a server connects to a cluster, it broadcasts its most recent id and hash. If two servers disagree on the hash corresponding to an id, they know that they have "forked" at some point and stop communicating with each other. A Paxos-based election scheme decides which fork becomes the new master.
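The journal's hash chain and the fork check can be illustrated with a short sketch. The hash function (SHA-256 over the previous hash concatenated with the query) and the names Journal, next_hash, and has_forked are assumptions made for illustration; the text above does not specify how Bedrock computes the hash.

```python
import hashlib


def next_hash(prev_hash, query):
    # Assumed scheme: chain each commit to the previous row's hash.
    return hashlib.sha256((prev_hash + query).encode("utf-8")).hexdigest()


class Journal:
    """In-memory stand-in for a journal table with columns id, query, hash."""

    def __init__(self):
        self.rows = []  # list of (id, query, hash) tuples

    def commit(self, query):
        prev = self.rows[-1][2] if self.rows else ""
        row = (len(self.rows) + 1, query, next_hash(prev, query))
        self.rows.append(row)
        return row

    def hash_at(self, commit_id):
        # Commit ids start at 1 and are contiguous in this sketch.
        return self.rows[commit_id - 1][2]


def has_forked(a, b):
    # Compare hashes at the highest commit id both journals contain.
    common = min(len(a.rows), len(b.rows))
    return common > 0 and a.hash_at(common) != b.hash_at(common)
```

Because every hash depends on all earlier rows, two journals that diverge at any commit keep disagreeing on every later id, even if they subsequently apply identical queries.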
Since Bedrock supports multi-threaded writes, it is prone to write conflicts. This is addressed by "sharding" the journal into multiple tables and querying all the journal tables in a UNION whenever they are to be viewed as one.
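A rough sketch of reading sharded journal tables as one, using an in-memory SQLite database. The table names journal0 and journal1, the sample rows, and the helper functions are hypothetical and chosen only to illustrate the UNION query.

```python
import sqlite3


def build_demo():
    # Two hypothetical journal shards, each holding part of the commit history.
    conn = sqlite3.connect(":memory:")
    for n in range(2):
        conn.execute(
            f"CREATE TABLE journal{n} (id INTEGER PRIMARY KEY, query TEXT, hash TEXT)"
        )
    conn.execute(
        "INSERT INTO journal0 VALUES (1, 'INSERT A', 'h1'), (3, 'INSERT C', 'h3')"
    )
    conn.execute(
        "INSERT INTO journal1 VALUES (2, 'INSERT B', 'h2'), (4, 'INSERT D', 'h4')"
    )
    return conn


def merged_journal(conn, shard_count):
    # Combine every shard with UNION and order by commit id,
    # so the result reads like a single journal table.
    parts = [f"SELECT id, query, hash FROM journal{n}" for n in range(shard_count)]
    sql = " UNION ".join(parts) + " ORDER BY id"
    return conn.execute(sql).fetchall()
```

Calling merged_journal(build_demo(), 2) yields the four rows interleaved by id, regardless of which shard each commit was written to.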
Bedrock queries can be any SQLite-compatible query. The result is returned in an HTTP-like or JSON format, as requested by the user. Bedrock also supports the MySQL protocol, so users can continue using the MySQL client of their choice. It also provides a PHP binding that can be used to work with it from the shell.
N-ary Storage Model (Row/Record)
Bedrock stores data in a SQLite database. SQLite stores data row-wise, where each row is referred to as a tuple. The tuples are stored contiguously on each page, and can span multiple pages.
https://github.com/Expensify/Bedrock
https://github.com/Expensify/Bedrock/tree/master/docs
Expensify
2016