XTDB

XTDB is a bitemporal document DBMS that uses Apache Kafka for the primary storage of transactions and documents, and RocksDB or LMDB to host indexes for rich query support. Its bitemporal support allows the system to store and query data along two time axes: valid time (when a fact holds in the modelled domain) and system time (when the database recorded it).
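As a rough illustration of what querying along both time axes means, the following Python sketch resolves an entity "as of" a given valid time and system time. It is a simplified model, not XTDB's implementation: the `Version` record, the `as_of` function, and the sample account data are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One recorded version of an entity (hypothetical structure)."""
    entity_id: str
    doc: dict
    valid_from: datetime   # when the fact becomes true in the modelled domain
    system_time: datetime  # when the database recorded the fact


def as_of(versions, entity_id, valid_time, system_time) -> Optional[dict]:
    """Return the document visible at (valid_time, system_time), or None."""
    visible = [
        v for v in versions
        if v.entity_id == entity_id
        and v.valid_from <= valid_time
        and v.system_time <= system_time
    ]
    if not visible:
        return None
    # Prefer the latest valid time; break ties with the latest system time,
    # so a later correction overrides an earlier record for the same period.
    return max(visible, key=lambda v: (v.valid_from, v.system_time)).doc


history = [
    Version("acct-1", {"balance": 100}, datetime(2019, 1, 1), datetime(2019, 1, 1)),
    # Retroactive correction: recorded on Jan 10, but valid from Jan 1.
    Version("acct-1", {"balance": 80}, datetime(2019, 1, 1), datetime(2019, 1, 10)),
    Version("acct-1", {"balance": 120}, datetime(2019, 2, 1), datetime(2019, 2, 1)),
]

# What did the database believe on Jan 5 about Jan 15?
before_fix = as_of(history, "acct-1", datetime(2019, 1, 15), datetime(2019, 1, 5))
# What does it believe on Jan 20 about the same Jan 15?
after_fix = as_of(history, "acct-1", datetime(2019, 1, 15), datetime(2019, 1, 20))
```

Asking the same valid-time question at two different system times returns the uncorrected and the corrected balance respectively, which is the distinction a single-timeline store cannot express.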

XTDB does not enforce any schema for the documents it stores. It supports a Datalog query interface for reading data and traversing relationships across all documents, where queries are executed so that the results are lazily streamed. Additionally, even though the main transaction log is immutable, XTDB still supports the eviction of active as well as historical data.

History

XTDB has been available as a Public Alpha since April 19th 2019. The Public Alpha period will continue until XTDB is released as a Generally Available open source software product by JUXT later in 2019.

XTDB was renamed from Crux in October 2021.

Data Model

Document / XML

The documents in XTDB are all stored as Extensible Data Notation (EDN) documents. The fields within these documents are triples, each with an entity, an attribute, and a value. This data model enables XTDB to efficiently execute graph queries in Datalog.
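To make the triple view concrete, here is a minimal Python sketch that flattens a document into (entity, attribute, value) triples. The function name `document_to_triples`, the `"id"` key, and the sample document are assumptions for illustration; a Python dict stands in for an EDN map, and this is not how XTDB itself encodes documents.

```python
def document_to_triples(doc: dict, id_key: str = "id"):
    """Flatten one document into (entity, attribute, value) triples.

    The value under `id_key` (a hypothetical key name) identifies the
    entity; every other top-level field becomes one triple.
    """
    entity = doc[id_key]
    return [
        (entity, attribute, value)
        for attribute, value in doc.items()
        if attribute != id_key
    ]


person = {"id": "ivan", "name": "Ivan", "age": 30}
triples = document_to_triples(person)
# triples == [("ivan", "name", "Ivan"), ("ivan", "age", 30)]
```

Once every field is a triple, a query clause can match on any combination of entity, attribute, and value, which is what lets one query follow references from document to document.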

Indexes

B+Tree, Hash Table

XTDB uses RocksDB or LMDB in order to host its indexes.

RocksDB uses two different formats for its indexes: block-based table and plain table. A block-based table makes it easier to compress the data into blocks, but queries take longer to execute. A plain table stores the data in a hash table, which takes more space but lets queries execute faster.

LMDB uses two B+ trees for its index format. One of the B+ trees stores pages with data, and the other tracks free pages that become available after deletes.

Joins

Nested Loop Join, Sort-Merge Join

Datalog, the language that XTDB uses to execute queries, offers functionality comparable to SQL but allows for more efficient joins. Like SQL systems, it uses nested loop joins and sort-merge joins, but it also uses joins over granular indexes. As a result, the DBMS does not depend on the data being normalized or having a particular shape.
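For readers unfamiliar with the sort-merge technique named above, the following Python sketch shows the general algorithm on two lists of (key, payload) pairs. It is a textbook version under simplifying assumptions (in-memory lists, a single join key), not XTDB's join code; `sort_merge_join` and the sample data are hypothetical.

```python
def sort_merge_join(left, right):
    """Equi-join two lists of (key, payload) pairs on their keys."""
    # Sort both inputs by key; a sorted index would provide this order for free.
    left = sorted(left, key=lambda pair: pair[0])
    right = sorted(right, key=lambda pair: pair[0])

    i = j = 0
    joined = []
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Emit every pairing of the two runs, then skip past both.
            for _, left_payload in left[i:i_end]:
                for _, right_payload in right[j:j_end]:
                    joined.append((left_key, left_payload, right_payload))
            i, j = i_end, j_end
    return joined


orders = [(2, "order-b"), (1, "order-a"), (2, "order-c")]
customers = [(2, "Olga"), (3, "Petr"), (1, "Ivan")]
result = sort_merge_join(orders, customers)
# result == [(1, "order-a", "Ivan"), (2, "order-b", "Olga"), (2, "order-c", "Olga")]
```

Because both inputs are consumed in key order, each side is scanned once after sorting, whereas a nested loop join compares every row of one input against every row of the other.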

Logging

Logical Logging

XTDB stores a transaction log that contains high-level information about the transactions that have been executed, but doesn't contain specific information such as what tuples have been changed.
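The idea of logical logging can be sketched as follows: each log record captures the operations a transaction requested, and the resulting state is derived by replaying those operations rather than by reading recorded changes. The record layout, the `put` and `delete` operation names, and the helper functions are assumptions made for this example and do not reflect XTDB's actual log format.

```python
import json


def make_log_record(tx_id, operations):
    """Serialize a transaction as the high-level operations it requested."""
    return json.dumps({"tx_id": tx_id, "ops": operations})


def replay(records):
    """Rebuild the current state by re-executing every logged operation."""
    state = {}
    for raw in records:
        record = json.loads(raw)
        for op in record["ops"]:
            if op["op"] == "put":
                state[op["id"]] = op["doc"]
            elif op["op"] == "delete":
                state.pop(op["id"], None)
    return state


log_records = [
    make_log_record(1, [{"op": "put", "id": "a", "doc": {"n": 1}}]),
    make_log_record(2, [
        {"op": "put", "id": "b", "doc": {"n": 2}},
        {"op": "delete", "id": "a"},
    ]),
]
current_state = replay(log_records)
# current_state == {"b": {"n": 2}}
```

Note that no record says which stored tuples were modified; that information only exists after the operations have been re-executed.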

Query Execution

Tuple-at-a-Time Model

When XTDB runs a Datalog query, it outputs a lazy sequence of all of the tuples that satisfy all of the clauses in the query. This means that as the database finds tuples that satisfy the predicate, it outputs the tuples one at a time. Therefore, query execution is done using the Tuple-at-a-Time Model.
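Lazy, one-at-a-time output can be illustrated with a Python generator. This is only an analogy for the behavior described above, not XTDB code; `scan`, `matching_tuples`, and the sample rows are hypothetical, and the `examined` list exists solely to show how much input has been read.

```python
examined = []  # records which rows the scan has touched so far


def scan(rows):
    """Yield rows one at a time, noting each row as it is read."""
    for row in rows:
        examined.append(row)
        yield row


def matching_tuples(rows, predicate):
    """Lazily yield each row that satisfies the predicate."""
    for row in rows:
        if predicate(row):
            yield row


people = [("ivan", 30), ("petr", 17), ("olga", 42), ("anna", 25)]
adults = matching_tuples(scan(people), lambda row: row[1] >= 18)

first = next(adults)                  # pulls only as much input as needed
examined_after_first = len(examined)  # one row read so far
rest = list(adults)                   # draining the sequence reads the remainder
```

The consumer controls how far the query runs: after the first result, only one input row has been read, and the remaining rows are examined only when more results are requested.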

Query Interface

SQL, Datalog

XTDB supports SQL and Datalog. The latter allows XTDB to read data and explore relationships across different documents. XTDB's Datalog interface supports most SQL-like join operations and recursive graph traversals.

Storage Architecture

Disk-oriented

XTDB does not support node-level sharding, so every XTDB node has the same data, and this is the same data that is stored on disk. When a change is made, all of the nodes must incorporate this change, so that the nodes are all consistent with each other. XTDB may add functionality for node-level sharding in the future.

Storage Model

N-ary Storage Model (Row/Record)

XTDB has a local key/value store, organized as a map in which each key maps to its corresponding value.

Storage Organization

Log-structured

XTDB uses Apache Kafka as a means of storing the transaction and document logs. These logs are semi-immutable and decoupled from the actual XTDB compute node.

As an alternative to Kafka, XTDB can use a local log store that operates within an XTDB standalone node.
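The separation between a shared append-only log and the nodes that consume it can be sketched in Python as below. This is a toy in-memory model under stated assumptions, not Kafka's or XTDB's API: `TransactionLog`, `Node`, and their methods are hypothetical names, and each log entry is simplified to a (key, value) pair.

```python
class TransactionLog:
    """Append-only log; entries are never modified once written."""

    def __init__(self):
        self._entries = []

    def append(self, entry) -> int:
        """Add an entry and return its offset in the log."""
        self._entries.append(entry)
        return len(self._entries) - 1

    def read_from(self, offset):
        """Return all entries at or after the given offset."""
        return self._entries[offset:]


class Node:
    """A compute node that builds its own local index from the shared log."""

    def __init__(self, log):
        self.log = log
        self.next_offset = 0  # first log offset this node has not yet indexed
        self.index = {}

    def catch_up(self):
        """Apply every log entry this node has not yet seen."""
        for key, value in self.log.read_from(self.next_offset):
            self.index[key] = value
            self.next_offset += 1


log = TransactionLog()
node_a = Node(log)
log.append(("x", 1))
node_a.catch_up()

log.append(("x", 2))
log.append(("y", 3))
node_b = Node(log)  # a node that joins later starts from offset 0
node_a.catch_up()
node_b.catch_up()
# Both nodes now hold {"x": 2, "y": 3}.
```

Because the log is the source of truth and each node only tracks an offset into it, a node's local index can be discarded and rebuilt by replaying from the start.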

System Architecture

Shared-Disk

Each XTDB node reads from and writes to disk, and each node should always store the full database that is stored on disk. This is because XTDB does not support sharding, so each node has to keep track of all of the data. When data is updated on a node, the update is automatically written to disk.

Revision #12 | Updated 06/27/2022 12:38 a.m.
