PosDB is a distributed disk-based column-store that leverages the idea of late materialization to handle complex analytical workloads.
Development of PosDB started in 2017 as a research project. The goals were 1) to explore the prospects of late materialization in a distributed environment and 2) to study its impact on the processing of complex queries containing large join chains, subqueries, and nontrivial aggregation such as window functions.
PosDB supports a subset of SQL consisting of selection, projection, join, ordering, aggregation, and window functions; queries are planned by a naive rule-based optimizer. It does not support nested queries, updates, deletions, or data definition language (DDL).
It also provides a C++ interface for constructing query plans manually, with explicit control over the algorithms used and the distribution schema.
Since PosDB is a disk-based system, all data resides on disk. Disk subsystem efficiency is achieved through columnar storage and a buffer manager. Columnar storage reduces disk load and makes the processed data more homogeneous. The buffer manager keeps important data in memory longer, including when several queries run concurrently.
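The role of a buffer manager can be illustrated with a minimal sketch. The source does not state which replacement policy PosDB uses, so least-recently-used (LRU) eviction is an assumption here, and the names `BufferPool`, `read_page`, and the `disk` dictionary are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Toy page cache with LRU eviction (the policy is an assumption)."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # callback that fetches a page from "disk"
        self.pages = OrderedDict()   # page_id -> contents, least recent first
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        page = self.read_page(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        return page


# Hypothetical "disk": a page id is (column name, block number).
disk = {("price", 0): [10, 20], ("price", 1): [30, 40], ("qty", 0): [1, 2]}
pool = BufferPool(capacity=2, read_page=disk.__getitem__)

pool.get(("price", 0))   # miss: read from disk
pool.get(("price", 0))   # hit: served from memory
pool.get(("price", 1))   # miss
pool.get(("qty", 0))     # miss, evicts ("price", 0)
```

Because pages are keyed by column, a query that touches only a few columns caches only those columns' pages, which is one way columnar storage and buffering reinforce each other.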
PosDB does not currently support transactions. Data is mostly treated as read-only, although cold bulk loading is supported.
Logical level: relational model. Physical level: columnar model.
PosDB supports both local and distributed joins with arbitrary partitioning and replication schemas. Hash, nested-loop, and sort-merge join algorithms are all available. Experimental branches also support some exotic joins such as the band join.
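As an illustration only, the sketch below shows a hash join and a nested-loop band join that return pairs of positions rather than combined rows, in the spirit of a position-based engine. It is not PosDB's implementation; the function names and sample columns are hypothetical, and the band join uses the common definition of matching keys whose difference lies within a fixed width.

```python
from collections import defaultdict


def positional_hash_join(left_keys, right_keys):
    """Return (left_pos, right_pos) pairs whose join keys are equal."""
    table = defaultdict(list)
    for pos, key in enumerate(left_keys):      # build phase
        table[key].append(pos)
    result = []
    for rpos, key in enumerate(right_keys):    # probe phase
        for lpos in table.get(key, ()):
            result.append((lpos, rpos))
    return result


def band_join(left_keys, right_keys, width):
    """Nested-loop band join: pairs whose keys differ by at most `width`."""
    return [
        (lpos, rpos)
        for lpos, lkey in enumerate(left_keys)
        for rpos, rkey in enumerate(right_keys)
        if abs(lkey - rkey) <= width
    ]


# Hypothetical join-key columns of two tables.
equi_pairs = positional_hash_join([7, 3, 7], [3, 7, 9])
band_pairs = band_join([10, 20], [12, 25, 19], width=2)
# equi_pairs == [(1, 0), (0, 1), (2, 1)]
# band_pairs == [(0, 0), (1, 2)]
```

Returning positions keeps the join output narrow: other columns of either table can be fetched later, and only for the rows that survive.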
PosDB currently doesn't support query compilation.
PosDB is a column-oriented DBMS: it stores data column by column.
PosDB utilizes a blocked version of the Volcano query execution model and introduces two phases of query execution: before and after the so-called materialization point. Each of these phases has different operators that work with different kinds of data: positions (row ids) and tuples, respectively.
Positional operators work with columnar data and are specifically designed for massive filtering, filtering joins and network communication. Lightweight intermediates and good cache locality are important here.
On the other hand, tuple-based operators target aggregation, which needs multiple operations for each wide row. The first (lowest) of these operators is a materialization point: positional data is transformed into tuples. The transformation is sometimes coupled with grouping and window functions to reduce the amount of materialized data.
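The two-phase execution described above can be sketched with Python generators standing in for block-at-a-time Volcano iterators. This is a simplified illustration under stated assumptions, not PosDB code: the operator names, the in-memory `columns` table, and the block size are all hypothetical.

```python
BLOCK_SIZE = 2  # tiny block size so the intermediate blocks stay readable

# Hypothetical column-wise table: one list per column.
columns = {
    "region": ["EU", "US", "EU", "EU", "US"],
    "amount": [5, 40, 25, 30, 10],
}


def scan_positions(row_count):
    """Leaf positional operator: yields blocks of row ids."""
    for start in range(0, row_count, BLOCK_SIZE):
        yield list(range(start, min(start + BLOCK_SIZE, row_count)))


def filter_positions(child, column, predicate):
    """Positional filter: reads one column, passes on surviving positions."""
    for block in child:
        kept = [pos for pos in block if predicate(column[pos])]
        if kept:
            yield kept


def materialize(child, names):
    """Materialization point: turns position blocks into blocks of tuples."""
    for block in child:
        yield [tuple(columns[name][pos] for name in names) for pos in block]


def sum_by(child, key_index, value_index):
    """Tuple-based operator: grouped sum over materialized rows."""
    totals = {}
    for block in child:
        for row in block:
            key = row[key_index]
            totals[key] = totals.get(key, 0) + row[value_index]
    return totals


plan = scan_positions(5)
plan = filter_positions(plan, columns["amount"], lambda v: v >= 15)
plan = materialize(plan, ["region", "amount"])
totals = sum_by(plan, 0, 1)
# totals == {"US": 40, "EU": 55}
```

Only the three rows that pass the filter are ever assembled into tuples; everything below `materialize` moves nothing but lists of integers.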
PosDB supports inter- and intra-query parallelism. To implement the latter, PosDB uses two special operators. Asynchronizer executes a single operator tree in a separate thread, and UnionAll collects data from several subtrees that are executed in their own threads.
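One way to picture these two operators is as thread-backed iterators connected by bounded queues. The sketch below is an assumption about their general shape, not PosDB's actual design: the function names, the queue-based hand-off, and the `scan` helper are hypothetical, and error propagation from worker threads is omitted.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of one child's output


def _produce(child, buffer):
    """Worker body: drain `child` into `buffer`, then signal completion."""
    try:
        for block in child:
            buffer.put(block)
    finally:
        buffer.put(_DONE)


def asynchronizer(child, capacity=4):
    """Run `child` (an iterator of blocks) in its own thread."""
    buffer = queue.Queue(maxsize=capacity)
    threading.Thread(target=_produce, args=(child, buffer), daemon=True).start()
    while True:
        block = buffer.get()
        if block is _DONE:
            return
        yield block


def union_all(children, capacity=4):
    """Run every child in its own thread; yield blocks in arrival order."""
    buffer = queue.Queue(maxsize=capacity)
    for child in children:
        threading.Thread(target=_produce, args=(child, buffer), daemon=True).start()
    remaining = len(children)
    while remaining:
        block = buffer.get()
        if block is _DONE:
            remaining -= 1
        else:
            yield block


def scan(values, block_size=2):
    """Hypothetical leaf operator producing blocks of values."""
    for start in range(0, len(values), block_size):
        yield values[start:start + block_size]


single = list(asynchronizer(scan([1, 2, 3])))
merged = list(union_all([scan([1, 2, 3, 4]), scan([10, 20, 30])]))
# single == [[1, 2], [3]]; the order of blocks in `merged` depends on scheduling
```

The bounded queue provides back-pressure: a fast subtree blocks once its consumer falls behind, so intermediate data cannot grow without limit.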
PosDB is a natively distributed DBMS in terms of both data and query execution. Each table may be fragmented and replicated across multiple nodes. A number of table-level fragmentation strategies are supported: round-robin, hash and range partitioning. Distributed query execution allows PosDB to run a query on multiple nodes, with each node processing an arbitrary part of the query plan. Both positional and tuple operators can be executed on arbitrary nodes, regardless of where their children reside. Also, several operators support internal distribution embedded into their core algorithms, like distributed join and aggregation.
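The three fragmentation strategies named above can be sketched as functions that map a row to a node number. These are generic textbook formulations rather than PosDB's code; the function names, the use of CRC32 as the hash, and the split points are assumptions made for illustration.

```python
import zlib
from bisect import bisect_right


def round_robin(row_index, node_count):
    """Assign rows to nodes cyclically by their arrival order."""
    return row_index % node_count


def hash_partition(key, node_count):
    """Assign by a hash of the key (CRC32 is stable across processes)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def range_partition(key, split_points):
    """Assign by key range; node i holds keys in [split[i-1], split[i])."""
    return bisect_right(split_points, key)


def fragment(keys, node_count, assign):
    """Split `keys` into per-node fragments using `assign(row_index, key)`."""
    fragments = [[] for _ in range(node_count)]
    for row_index, key in enumerate(keys):
        fragments[assign(row_index, key)].append(key)
    return fragments


keys = [50, 100, 150, 200, 250]
by_round_robin = fragment(keys, 3, lambda i, k: round_robin(i, 3))
by_hash = fragment(keys, 3, lambda i, k: hash_partition(k, 3))
by_range = fragment(keys, 3, lambda i, k: range_partition(k, [100, 200]))
# by_round_robin == [[50, 200], [100, 250], [150]]
# by_range == [[50], [100, 150], [200, 250]]
```

Round-robin balances fragment sizes regardless of the data, hash partitioning places equal keys on the same node (useful for joins and grouping), and range partitioning lets a range predicate skip whole nodes.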