RisingLight

RisingLight is an educational OLAP DBMS written in Rust.

History

The RisingLight project was initiated by Mingji Han, a PhD student at Boston University. The motivation was to build, with modern programming technologies, an OLAP database system that is simple enough to learn from.

Query Interface

SQL

RisingLight accepts SQL in the PostgreSQL dialect as its query interface. It provides an interactive shell in which users can issue SQL queries.

Concurrency Control

Not Supported

RisingLight doesn't support transactions.

Logging

Not Supported

Joins

Nested Loop Join, Hash Join

RisingLight supports nested loop join and hash join. When a join has at least one equality condition and is an inner join, RisingLight uses hash join. Otherwise, it uses nested loop join.
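The selection rule described above can be sketched in a few lines of Rust. This is only an illustration of the stated rule: the type and function names (`JoinType`, `CompareOp`, `choose_join_algorithm`) are hypothetical and are not taken from the RisingLight code base.

```rust
/// Join types considered by this sketch (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinType {
    Inner,
    LeftOuter,
    RightOuter,
    FullOuter,
}

/// Comparison operator of one join condition (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompareOp {
    Eq,
    NotEq,
    Lt,
    Gt,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinAlgorithm {
    HashJoin,
    NestedLoopJoin,
}

/// Hash join is chosen only for an inner join with at least one
/// equality condition; every other case falls back to nested loop join.
pub fn choose_join_algorithm(join_type: JoinType, conditions: &[CompareOp]) -> JoinAlgorithm {
    let has_equality = conditions.iter().any(|op| *op == CompareOp::Eq);
    if join_type == JoinType::Inner && has_equality {
        JoinAlgorithm::HashJoin
    } else {
        JoinAlgorithm::NestedLoopJoin
    }
}
```

Under this rule, an inner join on `a.id = b.id AND a.x < b.y` would use hash join, while a left outer join on the same conditions would use nested loop join.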

Storage Model

Decomposition Storage Model (Columnar)

A user table is split horizontally into RowSets, and each RowSet is then split vertically into column files. The values of a column are stored contiguously, spread over multiple column files.
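As a rough in-memory illustration of this two-step split, the sketch below cuts a hypothetical two-column table (`id`, `score`) into RowSets of a fixed number of rows and stores each RowSet column by column. The `Row` and `RowSet` types, the fixed schema, and the `rows_per_rowset` parameter are assumptions made for the example; the real system writes column files to disk.

```rust
/// One row of a hypothetical table with columns (id, score).
pub type Row = (i32, f64);

/// One RowSet: a horizontal slice of the table, held column by column.
/// Each Vec stands in for one column file.
#[derive(Debug, Default, PartialEq)]
pub struct RowSet {
    pub ids: Vec<i32>,
    pub scores: Vec<f64>,
}

/// Split rows horizontally into RowSets, then vertically into columns.
pub fn decompose(rows: &[Row], rows_per_rowset: usize) -> Vec<RowSet> {
    assert!(rows_per_rowset > 0, "a RowSet must hold at least one row");
    rows.chunks(rows_per_rowset)
        .map(|chunk| RowSet {
            ids: chunk.iter().map(|r| r.0).collect(),
            scores: chunk.iter().map(|r| r.1).collect(),
        })
        .collect()
}
```

With three rows and `rows_per_rowset = 2`, the result is two RowSets: the first holds two values per column, the second holds one.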

Query Execution

Vectorized Model

RisingLight uses a vectorized Volcano model (a.k.a. chunk-at-a-time) for query execution. In addition, some arithmetic expressions have specialized SIMD implementations to speed up execution.
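A minimal sketch of chunk-at-a-time execution follows. It assumes a pull-based `Executor` trait whose `next_chunk` method returns a batch of values instead of a single row. The trait, the `Scan` and `AddConst` operators, and the single-column `Chunk` type are hypothetical simplifications, not RisingLight's actual executor API, and the sketch relies on a plain loop rather than explicit SIMD.

```rust
/// A batch of values from one column (a "chunk"); simplified to i64 only.
pub type Chunk = Vec<i64>;

/// Pull-based operator: each call yields the next chunk, or None when done.
pub trait Executor {
    fn next_chunk(&mut self) -> Option<Chunk>;
}

/// Scans an in-memory column in fixed-size chunks.
pub struct Scan {
    data: Vec<i64>,
    pos: usize,
    chunk_size: usize,
}

impl Scan {
    pub fn new(data: Vec<i64>, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be positive");
        Scan { data, pos: 0, chunk_size }
    }
}

impl Executor for Scan {
    fn next_chunk(&mut self) -> Option<Chunk> {
        if self.pos >= self.data.len() {
            return None;
        }
        let end = (self.pos + self.chunk_size).min(self.data.len());
        let chunk = self.data[self.pos..end].to_vec();
        self.pos = end;
        Some(chunk)
    }
}

/// Adds a constant to every value, processing one whole chunk per call.
pub struct AddConst<E: Executor> {
    pub child: E,
    pub addend: i64,
}

impl<E: Executor> Executor for AddConst<E> {
    fn next_chunk(&mut self) -> Option<Chunk> {
        let mut chunk = self.child.next_chunk()?;
        // A tight loop over a contiguous buffer; this is the kind of loop
        // that a SIMD implementation would replace.
        for v in chunk.iter_mut() {
            *v += self.addend;
        }
        Some(chunk)
    }
}

/// Drains an operator tree from its root and collects every chunk.
pub fn collect_all<E: Executor>(mut root: E) -> Vec<Chunk> {
    let mut out = Vec::new();
    while let Some(chunk) = root.next_chunk() {
        out.push(chunk);
    }
    out
}
```

The per-call overhead of the operator interface is paid once per chunk rather than once per row, which is the main benefit of the vectorized model over the classic tuple-at-a-time Volcano model.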

Stored Procedures

Not Supported

Indexes

Log-Structured Merge Tree

In RisingLight, each table corresponds to a merge-tree on disk. Each merge-tree contains multiple RowSets, and each RowSet is composed of multiple column files. Each column file contains multiple blocks; the block is the minimal unit of caching. RowSets can be sorted or unsorted. For sorted RowSets, a sparse index file records the first cell of each block to support efficient range scans.
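The following sketch shows how a sparse index that records the first key of each block can narrow a range scan to a few blocks. It assumes integer keys, non-empty blocks, and keys sorted across the whole RowSet. The `SparseIndex` type and its methods are hypothetical and say nothing about RisingLight's on-disk index format.

```rust
use std::ops::Range;

/// first_keys[i] is the first (smallest) key of block i in a sorted RowSet.
pub struct SparseIndex {
    pub first_keys: Vec<i64>,
}

impl SparseIndex {
    /// Build the index from sorted blocks. Assumes every block is non-empty.
    pub fn build(blocks: &[Vec<i64>]) -> Self {
        SparseIndex {
            first_keys: blocks.iter().map(|b| b[0]).collect(),
        }
    }

    /// Block indices that may contain keys in the closed interval [lo, hi].
    /// The answer is conservative: it may include a block with no match,
    /// but it never skips a block that has one.
    pub fn blocks_for_range(&self, lo: i64, hi: i64) -> Range<usize> {
        if lo > hi || self.first_keys.is_empty() {
            return 0..0;
        }
        // Blocks whose first key is < lo come first; the last of them may
        // still contain lo (also when lo is duplicated across blocks).
        let start = self.first_keys.partition_point(|&k| k < lo).saturating_sub(1);
        // Blocks whose first key is > hi cannot contain any key <= hi.
        let end = self.first_keys.partition_point(|&k| k <= hi);
        start..end
    }
}
```

Only the blocks in the returned range need to be loaded and scanned; the index itself holds one key per block and so stays small.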

Storage Architecture

Disk-oriented

RisingLight stores data on disk, organized in a merge-tree structure. The RisingLight database directory contains multiple RowSet directories; each RowSet belongs to one user table and is part of that table's merge-tree. Each RowSet directory contains multiple column files and column index files.

Checkpoints

Non-Blocking Consistent

RisingLight provides consistent snapshots through the merge-tree structure of its storage engine. The manifest file stores the full list of files in the database. Every user write produces a group of files called a RowSet, and the RowSet's information is added to the manifest file.
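One way to see why this gives non-blocking, consistent reads is the in-memory model below. It assumes that RowSets are immutable once written and that a write publishes a new RowSet list instead of modifying the old one. The `Manifest` type, the `u32` RowSet ids, and the copy-on-write scheme are assumptions of this sketch; the real manifest is a file on disk and its format is not shown here.

```rust
use std::sync::{Arc, RwLock};

/// Immutable list of RowSet ids visible at one point in time.
pub type Snapshot = Arc<Vec<u32>>;

/// In-memory stand-in for the manifest (hypothetical).
pub struct Manifest {
    current: RwLock<Snapshot>,
}

impl Manifest {
    pub fn new() -> Self {
        Manifest {
            current: RwLock::new(Arc::new(Vec::new())),
        }
    }

    /// A reader takes the current list. Later writes never change it.
    pub fn snapshot(&self) -> Snapshot {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// A write publishes a new list that includes the new RowSet.
    /// The old list stays alive for readers that still hold it.
    pub fn add_rowset(&self, rowset_id: u32) {
        let mut guard = self.current.write().unwrap();
        let mut next = (**guard).clone();
        next.push(rowset_id);
        *guard = Arc::new(next);
    }
}
```

A reader that took its snapshot before a write keeps seeing the old RowSet list, while new readers see the new one. The lock is held only while the list pointer is read or swapped, not for the duration of a scan.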

Foreign Keys

Not Supported

Data Model

Relational

RisingLight stores data in a relational way.

System Architecture

Embedded

Query Compilation

Not Supported

Compression

Run-Length Encoding

The minimal encoding unit in RisingLight is the data block, about 64 KB in size, which contains multiple cells of a table. RisingLight supports run-length encoding of blocks.
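Run-length encoding replaces each run of equal consecutive values with a single (value, count) pair. The sketch below shows the general technique on `i32` values; the function names and the pair representation are illustrative and do not describe RisingLight's block layout.

```rust
/// Encode a slice as (value, run length) pairs.
pub fn rle_encode(values: &[i32]) -> Vec<(i32, u32)> {
    let mut runs: Vec<(i32, u32)> = Vec::new();
    for &v in values {
        // Extend the current run if the value repeats.
        if let Some((last, count)) = runs.last_mut() {
            if *last == v {
                *count += 1;
                continue;
            }
        }
        // Otherwise start a new run.
        runs.push((v, 1));
    }
    runs
}

/// Expand (value, run length) pairs back into the original sequence.
pub fn rle_decode(runs: &[(i32, u32)]) -> Vec<i32> {
    runs.iter()
        .flat_map(|&(v, n)| std::iter::repeat(v).take(n as usize))
        .collect()
}
```

The encoding pays off when a column contains long runs, for example a sorted or low-cardinality column; on data without repeats it is larger than the raw values.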

Parallel Execution

Inter-Operator (Vertical)

In RisingLight, different executors can run on different threads.
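The idea of inter-operator parallelism can be sketched with two operators that run on separate threads and exchange chunks through a bounded channel. The example uses only the Rust standard library; the `run_pipeline` function, the channel capacity of 4, and the scan-then-sum plan are assumptions for illustration and do not reflect how RisingLight schedules its executors.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs a two-operator pipeline (scan -> sum), one thread per operator.
pub fn run_pipeline(input: Vec<i64>, chunk_size: usize) -> i64 {
    assert!(chunk_size > 0, "chunk size must be positive");
    // Bounded channel: the scan blocks once 4 chunks are waiting.
    let (tx, rx) = mpsc::sync_channel::<Vec<i64>>(4);

    // Producer thread: a scan operator that emits chunks.
    let scan = thread::spawn(move || {
        for chunk in input.chunks(chunk_size) {
            if tx.send(chunk.to_vec()).is_err() {
                break; // the consumer has gone away
            }
        }
        // tx is dropped here, which closes the channel.
    });

    // Consumer thread: an aggregation operator that sums every chunk.
    let agg = thread::spawn(move || {
        rx.iter().map(|chunk| chunk.iter().sum::<i64>()).sum::<i64>()
    });

    scan.join().expect("scan thread panicked");
    agg.join().expect("aggregate thread panicked")
}
```

While the aggregation thread sums one chunk, the scan thread can already prepare the next, so the two operators overlap in time even though each one is single-threaded.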