Development of PosDB started in 2017 as a research project. The goals were to explore the perspectives of late materialization in a distributed environment and to study its impact on the processing of complex queries containing a large number of joins, subqueries and nontrivial aggregation (e.g., window functions).
Nested Loop Join Hash Join Sort-Merge Join Broadcast Join Shuffle Join Semi Join
PosDB supports both local and distributed joins with arbitrary partitioning and replication schemas. The system supports hash, nested-loop, and sort-merge algorithms. Also, experimental branches support some exotic joins like band-join.
PosDB utilizes a blocked version of the Volcano query execution model and introduces two phases of query execution: before and after the so-called materialization point. Each of these phases has different operators that work with different kinds of data: positions (row ids) and tuples correspondingly.
Positional operators work with columnar data and are specifically designed for massive filtering, filtering joins and network communication. Lightweight intermediates and good cache locality are important here.
On the other hand, tuple-based operators target aggregation, which needs multiple operations for each wide row. The first (lowest) of these operators is a materialization point: positional data is transformed into tuples. The transformation is sometimes coupled with grouping and window functions to reduce the amount of materialized data.
PosDB supports inter- and intra- query parallelism. To implement the latter PosDB uses two special operators. Asynchronizer allows it to execute a single operator tree in a separate thread and UnionAll is used to collect data from several subtrees that are executed in their own threads.
PosDB is a natively distributed DBMS in terms of both data and query execution. Each table may be fragmented and replicated across multiple nodes. A number of table-level fragmentation strategies are supported: round-robin, hash and range partitioning. Distributed query execution allows PosDB to run a query on multiple nodes, with each node processing an arbitrary part of the query plan. Both positional and tuple operators can be executed on arbitrary nodes, regardless of where their children reside. Also, several operators support internal distribution embedded into their core algorithms, like distributed join and aggregation.
PosDB supports a subset of SQL which consists of selection, projection, join, ordering, aggregation, and window functions (a naive rule-based optimizer is run). It does not support nested queries, updates, deletion, and data definition language (DDL).
It also provides a C++ interface that allows manual query plan construction with exact details of used algorithms and distributed schema.
Since PosDB is a disk-based system all data resides on disk. Disk subsystem efficiency is provided via columnar storage and a buffer manager. The former reduces disk load and helps in homogenizing processed data. A buffer manager allows important data to stay in memory for longer, including the case of several concurrently running queries.
https://research.jetbrains.org/groups/information_lab/projects?project_id=38
PosDB team
2016