PosDB

View Current Viewing Revision #10 from 05/01/2023 10:31 p.m.

PosDB is a distributed disk-based column-store that leverages late materialization idea to deal with complex analytical workloads. Posdb stands for "Positional database", which refers to its unique indexing technique. It uses the positional indexing technique to efficiently store and retrieve data. PosDB is a columnar DBMS with SQL support designed for analytic workloads. Unlike classic columnar systems, query execution plans in column stores can include operators that work not only with values but also with positions. In this way, PosDB features fluent data-position handling and produces two types of intermediate representations that can be used during query processing. This leads to unconventional query plans which significantly improve query performance by conserving disk bandwidth and enabling sophisticated query processing techniques. It also features other natural benefits of column-stores, such as efficient column-level compression.

History

The PosDB project started in 2017 as a research effort to study late materialization-oriented query processing. The developers focused on query processing issues not studied in PosDB’s predecessors, namely: LM and processing of aggregation queries (including window function processing), distributed processing in LM environments, subqueries and LM, etc.

Checkpoints

Not Supported

According to George Chernishev, the main developer of PosDB, currently, PosDB focuses only on read-only workloads with bulk loading of data. Therefore, there are no checkpoints for transactions now. As for reliability, they have some homebrew mechanisms for handling network failures, but it is rather simple.

Compression

Bit Packing / Mostly Encoding

It has implemented a generalized I/O-RAM compression scheme in which the compression algorithm is a parameter that can be changed. They now support both light-weight and heavy-weight algorithms including PFOR, VByte, SIMD-FastPFOR128, SIMD-BinaryPacking128 and Brotli (default configuration). The pages are decompressed as soon as they arrive from disk.

Concurrency Control

Not Supported

The system target read-only workloads.

Data Model

Relational

Data model is relational with tables stored on disk as separate columns.

Foreign Keys

Not Supported

Isolation Levels

Not Supported

PosDB does not support isolation since currently the developers focus only on read-only workloads with bulk loading of data.

Joins

Nested Loop Join Hash Join Sort-Merge Join Broadcast Join Shuffle Join Semi Join

PosDB supports both local and distributed joins with arbitrary partitioning and replication schemas. The system supports hash, nested-loop, and sort-merge algorithms. Also, experimental branches support some exotic joins like band-join.

Logging

Not Supported

PosDB only supports equi-join and includes NestedLoopJoin, MergeJoin, and HashJoin operators.

Parallel Execution

Inter-Operator (Vertical)

PosDB can store data on several machines, as well as process each query using several threads and a set of machines (inter- and intra-query parallelism).

Query Compilation

Not Supported

PosDB currently doesn't support query compilation.

Query Execution

Vectorized Model

PosDB utilizes a blocked version of the Volcano query execution model and introduces two phases of query execution: before and after the so-called materialization point. Each of these phases has different operators that work with different kinds of data: positions (row ids) and tuples correspondingly.

Positional operators work with columnar data and are specifically designed for massive filtering, filtering joins and network communication. Lightweight intermediates and good cache locality are important here.

Using the Volcano execution model, PosDB represents each query by a tree of operators that support the iterator interface. Other than iterators, PosDB also supports vectorized processing. PosDB supports parallel operators including SyncReader. PosDB has two operators for supporting intraquery parallelism: Asynchronizer and UnionAll. Asynchronizer is a unary operator that only executes the child operator in a thread while UnionAll is a polyadic operator that executes all child operators and then merges the results.

On the other hand, tuple-based operators target aggregation, which needs multiple operations for each wide row. The first (lowest) of these operators is a materialization point: positional data is transformed into tuples. The transformation is sometimes coupled with grouping and window functions to reduce the amount of materialized data.

PosDB supports inter- and intra- query parallelism. To implement the latter PosDB uses two special operators. Asynchronizer allows it to execute a single operator tree in a separate thread and UnionAll is used to collect data from several subtrees that are executed in their own threads.

PosDB is a natively distributed DBMS in terms of both data and query execution. Each table may be fragmented and replicated across multiple nodes. A number of table-level fragmentation strategies are supported: round-robin, hash and range partitioning. Distributed query execution allows PosDB to run a query on multiple nodes, with each node processing an arbitrary part of the query plan. Both positional and tuple operators can be executed on arbitrary nodes, regardless of where their children reside. Also, several operators support internal distribution embedded into their core algorithms, like distributed join and aggregation.

Query Interface

Custom API SQL

PosDB supports a subset of SQL which consists of selection, projection, join, ordering, aggregation, and window functions (a naive rule-based optimizer is run). It does not support nested queries, updates, deletion, and data definition language (DDL).

It also provides a C++ interface that allows manual query plan construction with exact details of used algorithms and distributed schema.

Storage Architecture

Disk-oriented

Since PosDB is a disk-based system all data resides on disk. Disk subsystem efficiency is provided via columnar storage and a buffer manager. The former reduces disk load and helps in homogenizing processed data. A buffer manager allows important data to stay in memory for longer, including the case of several concurrently running queries.

Storage Model

Decomposition Storage Model (Columnar)

PosDB is a column-oriented DBMS.

Storage Organization

Indexed Sequential Access Method (ISAM)

The data is stored on disk and the system has a storage manager which manages the data and provides access to them. Each replica is stored in a separate file without paging and the buffer manager can help with efficient query execution.

PosDB supports materialized views and utilize the late materialization strategy. It also implements other optimizations including multi-columns and ultra-late materialization. It can support a wide range of queries including window functions.

Revision #10 | Updated 05/01/2023 10:31 p.m.