TrailDB

TrailDB is an easy-to-use, portable C library for storing and querying series of events. Unlike most relational DBMSs, the databases created by TrailDB are immutable files. This lets TrailDB achieve better compression and greater scalability, and allows an arbitrary number of consumers to read a database in parallel.

However, because each TrailDB is an immutable file, it cannot be modified once its creation is finalized, so it cannot replace the DBMSs that handle transactions in production environments. Instead, it works well as a complement to existing relational databases and key-value stores. For example, data can be gathered at regular intervals from all producers and encoded into new TrailDBs, which are then pushed to an object-based storage system such as AWS S3 or PureStorage FlashBlade. Consumers can then use the TrailDB library to read these immutable files in parallel with high performance. This makes TrailDB a good fit for OLAP over non-real-time records.

Immutability also makes TrailDB convenient for developers: at a high level it supports only CREATE and READ operations. The API is therefore small, and there is no need to worry about the anomalous states that can arise in relational databases or key-value stores.

Stored Procedures

Not Supported

Isolation Levels

Serializable

Each database is a read-only, immutable file, so every reader always sees the same data. This is equivalent to Serializable.

Query Interface

Custom API

System Architecture

Embedded

Logging

Not Supported

Concurrency Control

Not Supported

Because each TrailDB is an immutable file, modifications are not allowed. In addition, a database is produced by a single process, and no reads can be issued before its creation is finalized. Thus, there are no conflicting concurrent operations for TrailDB to control.

Compression

Delta Encoding, Run-Length Encoding, Prefix Compression

First, events within a trail are always sorted by time, so TrailDB uses Delta Encoding to compress the 64-bit timestamps.
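
As a rough illustration of why sorted timestamps compress well under Delta Encoding, the sketch below stores the first timestamp and then only the gaps between consecutive events. The function names and the sample timestamps are hypothetical, and this is not TrailDB's actual on-disk format.

```python
def delta_encode(timestamps):
    """Return the first timestamp followed by the gaps between neighbours."""
    deltas = []
    previous = 0
    for ts in timestamps:
        deltas.append(ts - previous)  # small number when events are close in time
        previous = ts
    return deltas


def delta_decode(deltas):
    """Rebuild the original timestamps with a running sum."""
    timestamps = []
    current = 0
    for delta in deltas:
        current += delta
        timestamps.append(current)
    return timestamps


# Hypothetical trail: four events, already sorted by time (seconds).
trail_timestamps = [1462000000, 1462000005, 1462000005, 1462000065]
encoded = delta_encode(trail_timestamps)
print(encoded)
print(delta_decode(encoded) == trail_timestamps)
```

Because the events are sorted, every gap after the first entry is non-negative and usually far smaller than a full 64-bit timestamp, so it can be stored in fewer bits.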

Second, events are grouped by UUID, which usually represents a logical entity such as an online shopping customer. The events within a trail therefore tend to be predictable, and TrailDB encodes only the changes from one event to the next. This is similar to, but not exactly the same as, Run-Length Encoding.
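
The idea of encoding only what changes between consecutive events can be sketched as follows. This is a simplified model that assumes every event in a trail carries the same set of fields; the field names and helper functions are hypothetical and do not reflect TrailDB's internal representation.

```python
_MISSING = object()  # sentinel: field not seen yet in this trail


def encode_changes(events):
    """For each event, keep only the fields whose value differs from the previous event."""
    previous = {}
    encoded = []
    for event in events:
        changed = {
            field: value
            for field, value in event.items()
            if previous.get(field, _MISSING) != value
        }
        encoded.append(changed)
        previous = event
    return encoded


def decode_changes(encoded):
    """Rebuild full events by carrying unchanged fields forward."""
    current = {}
    events = []
    for changed in encoded:
        current = {**current, **changed}
        events.append(current)
    return events


# Hypothetical trail for one customer.
trail = [
    {"action": "view", "country": "US"},
    {"action": "view", "country": "US"},
    {"action": "buy", "country": "US"},
]
compact = encode_changes(trail)
print(compact)
print(decode_changes(compact) == trail)
```

An event that repeats the previous one costs nothing beyond its position in the trail, which is where the resemblance to Run-Length Encoding comes from.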

Third, Huffman Coding, a kind of Prefix Compression method, is used to encode the skewed, low-entropy distributions of values.
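
To show why Huffman Coding suits skewed value distributions, the sketch below builds a textbook Huffman code in which frequent values receive shorter bit strings. It is a generic implementation with hypothetical sample data, not the encoder TrailDB itself uses.

```python
import heapq
from collections import Counter


def huffman_code(values):
    """Map each distinct value to a prefix-free bit string based on its frequency."""
    counts = Counter(values)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct value still needs one bit to be representable.
        return {next(iter(counts)): "0"}

    # Heap entries: (frequency, tie-breaker, {value: code-so-far}).
    heap = [
        (count, index, {value: ""})
        for index, (value, count) in enumerate(counts.items())
    ]
    heapq.heapify(heap)
    next_index = len(heap)

    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)  # two rarest subtrees
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {value: "0" + code for value, code in codes_a.items()}
        merged.update({value: "1" + code for value, code in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_index, merged))
        next_index += 1

    return heap[0][2]


# Hypothetical skewed field: most events are "view".
actions = ["view"] * 8 + ["click"] * 3 + ["buy"] * 1
codes = huffman_code(actions)
total_bits = sum(len(codes[action]) for action in actions)
print(codes)
print(total_bits, "bits, versus", 2 * len(actions), "with a fixed 2-bit code")
```

The more skewed the distribution, the larger the saving over a fixed-width code, since the dominant value is encoded with the shortest bit string.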

Indexes

Hash Table

This feature was introduced in TrailDB 0.6. [TODO]

Website

http://traildb.io/

Source Code

https://github.com/traildb/traildb

Tech Docs

http://traildb.io/docs

Developer

AdRoll Inc.

Country of Origin

US

Start Year

2014

Project Type

Commercial, Open Source

Written in

C

Supported languages

C, D, Go, Haskell, Python, R

Licenses

MIT