XTDB is a bitemporal document DBMS that uses Apache Kafka for the primary storage of transactions and documents, and RocksDB or LMDB to host indexes for rich query support. Its bitemporal support allows the system to store and query data along two time axes: valid time (when a fact holds in the modelled domain) and system time (when the database recorded it).
- Website
- https://xtdb.com[01]
- Source Code
- https://github.com/xtdb/xtdb[02]
- Tech Docs
- https://xtdb.com/docs[03]
- @xtdb_com
- Developer
- Country of Origin
- GB
- Start Year
- 2018 [08]
- Former Name
- Crux
- Project Types
- Commercial, Open Source
- Written in
- Clojure
- License
- MIT License
XTDB does not enforce any schema for the documents it stores. It supports a Datalog query interface for reading data and traversing relationships across all documents, and it streams query results lazily. Additionally, even though the main transaction log is immutable, XTDB still supports the eviction of active as well as historical data.
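To make the two time axes concrete, here is a minimal sketch of a bitemporal lookup in Python. It is a toy model only: the `BitemporalStore` class, its `put` and `entity` methods, and the in-memory list are hypothetical names chosen for illustration and do not reflect XTDB's actual API or index layout.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    entity_id: str
    valid_time: datetime   # when the fact holds in the modelled domain
    system_time: datetime  # when the store recorded the fact
    doc: dict


class BitemporalStore:
    """Toy append-only store (hypothetical; not XTDB's real implementation)."""

    def __init__(self) -> None:
        self._versions: List[Version] = []

    def put(self, entity_id: str, doc: dict,
            valid_time: datetime, system_time: datetime) -> None:
        # Versions are only ever appended, never overwritten.
        self._versions.append(Version(entity_id, valid_time, system_time, doc))

    def entity(self, entity_id: str, valid_time: datetime,
               system_time: datetime) -> Optional[dict]:
        """Return the document visible at the given (valid, system) time pair."""
        candidates = [
            v for v in self._versions
            if v.entity_id == entity_id
            and v.valid_time <= valid_time
            and v.system_time <= system_time
        ]
        if not candidates:
            return None
        # Latest valid time wins; ties are broken by latest system time.
        best = max(candidates, key=lambda v: (v.valid_time, v.system_time))
        return best.doc
```

With this model, a correction recorded late (a high system time) for an earlier valid time changes what a query sees "as of now", while a query pinned to an older system time still returns what was known back then.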
History[04]
XTDB has been available as a Public Alpha since April 19th 2019. The Public Alpha period will continue until XTDB is released as a Generally Available open source software product by JUXT later in 2019.
XTDB was renamed from Crux in October 2021.
Data Model[05]
The documents in XTDB are all stored as Extensible Data Notation (EDN) documents. The fields within these documents are represented as triples, each with an entity, an attribute, and a value. This data model enables XTDB to efficiently execute graph queries in Datalog.
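The document-to-triple mapping can be sketched as follows. This is an illustrative Python approximation, not XTDB code: the function name `document_to_triples` is hypothetical, the `"xt/id"` key stands in for the document's identifier attribute, and expanding collection values into one triple per element is an assumption of this sketch.

```python
from typing import Any, List, Tuple

Triple = Tuple[Any, str, Any]


def document_to_triples(doc: dict, id_key: str = "xt/id") -> List[Triple]:
    """Flatten one document into (entity, attribute, value) triples."""
    entity = doc[id_key]
    triples: List[Triple] = []
    for attribute, value in doc.items():
        if attribute == id_key:
            continue  # the id identifies the entity; it is not a triple itself
        # Assumption: a collection value yields one triple per element.
        values = value if isinstance(value, (list, tuple, set)) else [value]
        for v in values:
            triples.append((entity, attribute, v))
    return triples
```

For example, a document with the id `"ivan"`, a `name`, and a list of two `friends` would yield three triples, all sharing the entity `"ivan"`.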
Indexes[06]
XTDB uses RocksDB or LMDB in order to host its indexes.
- RocksDB uses two different formats for its indexes: block-based table and plain table. In a block-based table, it is easier to compress the data into blocks, but queries take longer to execute. In a plain table, the data is stored in a hash table, so it takes more space to store the data, but queries execute faster.
- LMDB uses two different B+ trees for its index format. One of the B+ trees stores pages with data, and the other stores pages that are freed up by deletes.
Joins[07]
The language that XTDB uses to execute queries, Datalog, offers functionality comparable to SQL, but allows for more efficient joins. It uses nested loop joins and sorted merge joins, as SQL systems do, but it also uses joins over granular indexes. As a result, the DBMS does not depend on the data being normalized or having a particular shape.
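As a reminder of how a sorted merge join works in general, the following Python sketch joins two inputs that are already sorted by key. It is a textbook version written for illustration; the function name `sorted_merge_join` and the sample data shape are assumptions, and it does not reproduce XTDB's internal join code.

```python
from typing import Any, List, Tuple

Pair = Tuple[Any, Any]


def sorted_merge_join(left: List[Pair], right: List[Pair]) -> List[Tuple[Any, Any, Any]]:
    """Join two lists of (key, value) pairs, each sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key is too small, advance left
        elif lk > rk:
            j += 1          # right key is too small, advance right
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out
```

Because both inputs are consumed in key order, each side is scanned once, which is why this join style pairs naturally with sorted indexes.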
Logging[07]
XTDB stores a transaction log that contains high-level information about the transactions that have been executed, but doesn't contain specific information such as what tuples have been changed.
Query Execution[07]
When XTDB runs a Datalog query, it outputs a lazy sequence of all of the tuples that satisfy all of the clauses in the query. This means that as the database finds tuples that satisfy the predicate, it outputs the tuples one at a time. Therefore, query execution is done using the Tuple-at-a-Time Model.
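The tuple-at-a-time behaviour described above can be illustrated with a Python generator. This is only an analogy for a lazy result sequence: `scan` and `lazy_query` are hypothetical helper names, and the `seen` list exists purely to show how many tuples have been read so far.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

Row = Tuple[Any, ...]


def scan(rows: Iterable[Row], seen: List[Row]) -> Iterator[Row]:
    """Yield rows one by one, recording each row as it is actually read."""
    for row in rows:
        seen.append(row)
        yield row


def lazy_query(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Yield each matching row as soon as it is found, one tuple at a time."""
    for row in rows:
        if predicate(row):
            yield row
```

Nothing is read until the consumer asks for the first result, and reading stops as soon as the consumer stops asking, so a caller that needs only the first few matches never pays for a full scan.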
Query Interface[05]
XTDB supports SQL and Datalog. The latter allows XTDB to read data and explore relationships across different documents. XTDB's Datalog interface supports most SQL-like join operations and recursive graph traversals.
Storage Architecture[07]
XTDB does not support node-level sharding, so every XTDB node has the same data, and this is the same data that is stored on disk. When a change is made, all of the nodes must incorporate this change, so that the nodes are all consistent with each other. XTDB may add functionality for node-level sharding in the future.
Storage Model[07]
Each XTDB node has a local key/value store, which behaves as a map from each key to its corresponding value.
Storage Organization[07]
XTDB uses Apache Kafka as a means of storing the transaction and document logs. These logs are semi-immutable and decoupled from the actual XTDB compute node.
An alternative method of storage organization that XTDB can use instead of Kafka is a local log store that operates within an XTDB standalone node.
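The relationship between a shared log and fully replicated nodes can be sketched in Python as follows. This is a simplified model under stated assumptions: `TransactionLog` is an in-memory stand-in for a log such as a Kafka topic, `Node` and `catch_up` are hypothetical names, and the `"put"`/`"delete"` operations are illustrative rather than XTDB's actual transaction format.

```python
from typing import Any, Dict, List, Optional, Tuple

Entry = Tuple[str, str, Optional[Any]]


class TransactionLog:
    """In-memory stand-in for a shared, append-only log."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def append(self, op: str, key: str, value: Optional[Any] = None) -> int:
        self._entries.append((op, key, value))
        return len(self._entries) - 1  # offset of the new entry

    def read_from(self, offset: int) -> List[Entry]:
        return self._entries[offset:]


class Node:
    """A node that rebuilds its local key/value index by replaying the log."""

    def __init__(self, log: TransactionLog) -> None:
        self._log = log
        self._offset = 0                 # next log offset to consume
        self.index: Dict[str, Any] = {}  # local key/value store

    def catch_up(self) -> None:
        for op, key, value in self._log.read_from(self._offset):
            if op == "put":
                self.index[key] = value
            elif op == "delete":
                self.index.pop(key, None)
            self._offset += 1
```

Because every node replays the same log in the same order, a newly started node and a long-running node converge on identical indexes once both have caught up, which matches the unsharded design described above.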
System Architecture[07]
Each XTDB node reads from and writes to disk, and each node should always store the full database. This is because XTDB does not support sharding, so each node has to keep track of all of the data. When data is updated on a node, the change is automatically written to disk.
Citations
- XTDB xtdb.com
- GitHub - xtdb/xtdb: An immutable SQL database for application development, time-travel reporting and data compliance. Developed by @juxt · GitHub github.com
- Documentation and Resources · XTDB xtdb.com
- "crux" => "xtdb" in CLA; switching to PDF * RTF is still available for users who want the source github.com
- xtdb/README.md at 1.x · xtdb/xtdb · GitHub github.com
- A Tutorial of RocksDB SST formats · facebook/rocksdb Wiki · GitHub github.com
- https://opencrux.com/docs#_introduction opencrux.com
- Initial Commit github.com
- https://github.com/xtdb/xtdb/commit/6b074935939e2da9f882ff0c147acd80b31779ea github.com
- https://github.com/xtdb/xtdb/commit/a375c63b5ff02da097412014f34db61ef725c903 github.com