Brytlyt

Brytlyt is a GPU-accelerated DBMS built on top of Postgres.

History

Brytlyt is a GPU-accelerated database based on PostgreSQL 9.4. It uses a Massively Parallel Processing (MPP) architecture to scale out horizontally when handling large amounts of data. Brytlyt was first released in 2016, reached milestone 2.0 in December 2018, and has a long-term roadmap to version 5.0.

More recently, Brytlyt has rearchitected the GPU database as a storage engine that sits underneath MariaDB, the fork of MySQL controlled by its original creator, Monty Widenius.

This wiki entry is based on the PostgreSQL version, which has been released publicly.

Foreign Keys

Supported

Storage Model

N-ary Storage Model (Row/Record)

Note that Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change the storage model: the DBMS stores all of the attributes of a single tuple contiguously, as in an N-ary storage model.
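The contiguous layout described above can be illustrated with a minimal sketch. The three-column schema and the helper names are hypothetical and chosen only for illustration; this is not PostgreSQL's or Brytlyt's actual page format.

```python
import struct

# Hypothetical fixed-width schema: (id: int32, price: float64, qty: int32).
# "<" selects little-endian standard sizes with no padding between fields.
ROW = struct.Struct("<idi")

def pack_rows(rows):
    """Lay rows out back to back, keeping all attributes of a tuple together."""
    return b"".join(ROW.pack(*r) for r in rows)

def read_row(page, n):
    """Fetch the n-th tuple from one contiguous region of the page."""
    return ROW.unpack_from(page, n * ROW.size)

page = pack_rows([(1, 9.5, 3), (2, 4.25, 7)])
print(read_row(page, 1))
```

Reading a whole tuple touches one contiguous region, which is the property that makes a row layout convenient for tuple-oriented access.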

Data Model

Object-Relational

Storage Architecture

Hybrid

Brytlyt is a disk-oriented database: tables and indexes are stored mainly on disk. It also manages data placement across CPU host memory and GPU memory to accelerate queries and avoid the overhead of data transfers.

GPU Data Source

GPUs improve performance only when the data is already in main system memory, so hot data should be kept in main memory rather than on disk.

Optimized GPU-CPU Memory Control

The performance of a GPU-accelerated database depends heavily on data transfer efficiency. Transfer costs can be reduced significantly by keeping hot data in GPU RAM and "semi-hot" data in host memory. But since GPU RAM (GBs) is much smaller than host memory (TBs), data still has to be transferred over the x16 PCIe bus.

PCIe Data Transfer Topic

To avoid PCIe bottlenecks and use the full capabilities of both CPU and GPU, it is better to have a 1:1 ratio between CPUs and GPUs, which allows optimal processing for a given operation. An "N-to-1" relation, by contrast, leads to a transfer bottleneck.

System Architecture

Shared-Everything

Brytlyt is a fork of PostgreSQL with an additional Brytlyt GPU Manager and Nvidia GPUs embedded in the database engine. It does not support multi-master shared storage, only cold-standby failover for shared storage.

Brytlyt supports four levels of parallelism:

Level 1: Coordinating multiple machines
Level 2: Coordinating multiple GPUs
Level 3: Data streaming on and off the device
Level 4: Each GPU is a parallel machine

Logging

Physical Logging

Note that Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change the logic at the logging level.

Archiving the WAL data makes it possible to revert to any point in time covered by the available WAL: install a prior physical backup of the database and replay the WAL just as far as the desired time. Moreover, the physical backup does not have to be an instantaneous snapshot of the database state. If it was made over some period of time, replaying the WAL for that period fixes any internal inconsistencies.
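The restore-and-replay idea can be sketched in a few lines. This is a toy model under stated assumptions: the backup is a dictionary of page contents, each WAL record is a hypothetical (timestamp, page, contents) triple, and `restore_to` is an illustrative name, not a PostgreSQL or Brytlyt function.

```python
def restore_to(base_backup, wal_records, target_time):
    """Replay WAL records onto a copy of the backup, stopping at target_time.

    wal_records is a sequence of (timestamp, page, contents) tuples that is
    assumed to be ordered by timestamp.
    """
    state = dict(base_backup)            # install the prior physical backup
    for ts, page, contents in wal_records:
        if ts > target_time:             # stop once past the desired instant
            break
        state[page] = contents           # redo the logged change
    return state

backup = {"page_a": 1}
wal = [(10, "page_a", 2), (20, "page_b", 5), (30, "page_a", 9)]
print(restore_to(backup, wal, 20))
```

Because replay overwrites each page with its logged contents, a backup taken while changes were in flight converges to a consistent state once the corresponding WAL range has been applied.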

Concurrency Control

Multi-version Concurrency Control (MVCC)

Note that Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change the logic at the concurrency control level.

Brytlyt uses Multi-version Concurrency Control to maintain data consistency. Under MVCC, each transaction sees a snapshot of the data as it was at some earlier point, so previous versions of a row remain visible alongside the current one, which provides transaction isolation. The primary advantage of MVCC over locking is that writes do not conflict with reads of the same data. MVCC therefore reduces lock contention and achieves higher throughput.
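A minimal sketch of version visibility follows. It is heavily simplified: it assumes every transaction has committed and that transaction ids order events in time. The `xmin`/`xmax` field names follow PostgreSQL's system columns, but the visibility rule here is an illustration, not the real PostgreSQL check.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    xmin: int                    # id of the transaction that created this version
    xmax: Optional[int] = None   # id of the transaction that superseded it, if any

def visible(version, snapshot_xid):
    """Visible if created at or before the snapshot and not superseded by then."""
    created = version.xmin <= snapshot_xid
    superseded = version.xmax is not None and version.xmax <= snapshot_xid
    return created and not superseded

def read(chain, snapshot_xid):
    """Return the value of the first version visible to the snapshot, else None."""
    for v in chain:
        if visible(v, snapshot_xid):
            return v.value
    return None

# Transaction 5 replaced "old" with "new"; an older snapshot still reads "old".
chain = [Version("old", xmin=1, xmax=5), Version("new", xmin=5)]
print(read(chain, 3), read(chain, 5))
```

The writer adds a new version instead of overwriting the old one, so the reader with the older snapshot never has to wait for it.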

Query Interface

PL/SQL

Brytlyt has adopted the open-source database PostgreSQL as the platform in which to implement its intellectual property. With PostgreSQL comes the full suite of SQL and procedural SQL functionality. SQL is the lingua franca of the data processing world.

Joins

Hash Join

Challenge

The traditional approach to running joins on a CPU is not well suited to the hundreds of thousands of cores in a GPU system. Since GPUs have cores grouped in chunks, with each chunk running the same instructions, most GPU databases have a tough time with join operations.

Approach

Brytlyt has approached the parallelism challenge by devising a patent-pending method that recursively separates rows containing a hit from rows that do not. It breaks the data into blocks and then distributes the blocks to the many cores used for searching.

Example

For example, on a 2,000-core GPU a dataset of 400,000 rows would be broken into 2,000 blocks of 200 rows each. Each GPU core then searches its own block of data in parallel with all the other cores, giving a huge boost in performance over a traditional CPU database.

Empty blocks are discarded, and the process is repeated with the remaining blocks, over and over, until only the relevant blocks remain. This process scales easily, and the importance of that cannot be overstated: 10 billion rows distributed over 100 GPUs could achieve the same cycle time as 1 billion rows on 10 GPUs.
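The block-and-discard loop can be sketched as follows. This is a sequential toy under stated assumptions, not Brytlyt's patent-pending method: the function names are hypothetical, each "core" is simulated by one iteration of a list comprehension, and surviving blocks are simply re-split into smaller blocks on every round.

```python
def split(rows, n_blocks):
    """Split rows into at most n_blocks contiguous blocks of near-equal size."""
    size = max(1, -(-len(rows) // n_blocks))   # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def block_search(rows, keys, n_cores=4):
    """Return the rows found in keys by repeatedly discarding hit-free blocks."""
    if n_cores < 2:
        raise ValueError("n_cores must be at least 2 so blocks keep shrinking")
    blocks = split(rows, n_cores)
    while True:
        # On a GPU each block would be scanned by its own core in parallel.
        survivors = [b for b in blocks if any(r in keys for r in b)]
        if all(len(b) == 1 for b in survivors):
            return [b[0] for b in survivors]
        # Re-split the surviving blocks and repeat the search on the pieces.
        blocks = [piece for b in survivors for piece in split(b, n_cores)]

print(block_search(list(range(20)), {3, 17}))
```

Each round shrinks the data that still has to be examined, and every block in a round is independent of the others, which is why the work maps naturally onto many cores.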

Query Execution

Tuple-at-a-Time Model, Vectorized Model

In Brytlyt, not everything is executed on the GPU. The tuple-at-a-time model is better suited to CPU execution, whereas GPU acceleration relies on the vectorized model to increase parallelism. See the Joins section for more information about the vectorization method used in Brytlyt.
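The difference between the two models can be sketched with a simple filter operator. The function names and the batch size are hypothetical; the sketch only contrasts the unit of work, one tuple versus one batch, and does not reflect Brytlyt's executor.

```python
def scan(rows):
    """Leaf operator: emit one tuple per next() call."""
    for row in rows:
        yield row

def filter_tuple_at_a_time(child, pred):
    """Each next() call pulls exactly one tuple through the operator."""
    for row in child:
        if pred(row):
            yield row

def filter_vectorized(rows, pred, batch_size=4):
    """Each step handles a whole batch, the unit a GPU kernel would work on."""
    for i in range(0, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        yield [row for row in batch if pred(row)]

rows = list(range(10))
is_even = lambda x: x % 2 == 0
print(list(filter_tuple_at_a_time(scan(rows), is_even)))
print(list(filter_vectorized(rows, is_even)))
```

Both versions select the same rows; the vectorized one amortizes per-call overhead across a batch and exposes independent elements that can be processed in parallel.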

Views

Virtual Views

In Brytlyt, "CREATE VIEW" defines a view of a query. The view is not physically materialized. Instead, the query is run every time the view is referenced in a query. Also, the name of the view must be distinct from the name of any other view, table, sequence, index or foreign table in the same schema.
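The behaviour of a non-materialized view can be demonstrated with a short sketch. It uses Python's built-in `sqlite3` module as a stand-in, since SQLite views are also re-evaluated on every reference; the table and view names are hypothetical, and nothing here was run against Brytlyt or PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (id INTEGER, fare REAL)")
# The view stores only the query text, not its result.
conn.execute(
    "CREATE VIEW expensive AS SELECT id FROM trips WHERE fare > 20 ORDER BY id"
)

conn.execute("INSERT INTO trips VALUES (1, 35.0)")
first = conn.execute("SELECT id FROM expensive").fetchall()

# A later insert shows up without refreshing anything: the query is re-run.
conn.execute("INSERT INTO trips VALUES (2, 50.0)")
second = conn.execute("SELECT id FROM expensive").fetchall()

print(first, second)
```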

Hardware Acceleration

GPU

A database operation accelerated by the GPU must be parallelizable, and in many cases parallelizing an operation is not trivial. Relational operations such as filtering, sorting, aggregating, grouping and even joining tables are all possible on a GPU.

Data usually resides in CPU memory in vectorized columns to optimize parallel processing across all available GPUs. The data is moved to GPU memory as needed for both mathematical and spatial calculations, and the results are then returned to the CPU.

Brytlyt has developed a unique approach to accelerating joins on the GPU: a patent-pending method that recursively separates rows containing a hit from rows that do not. It breaks the data into blocks and then distributes the blocks to the many cores used for searching.

The disk I/O bottleneck and the PCIe bottleneck are two challenges in developing a GPU-accelerated database. To avoid PCIe bottlenecks and use the full capabilities of both CPU and GPU, Brytlyt suggests a ratio of 1:1.

Isolation Levels

Read Uncommitted, Read Committed, Repeatable Read, Serializable

Note that Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change the logic at the isolation level.

The SQL standard defines four levels of transaction isolation. The most strict is Serializable, which is defined by the standard in a paragraph which says that any concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order. The other three levels are defined in terms of phenomena, resulting from interaction between concurrent transactions, which must not occur at each level. The standard notes that due to the definition of Serializable, none of these phenomena are possible at that level.
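The phenomena-based definition can be written down as a small lookup table. The table reflects the SQL standard's definitions (dirty read, non-repeatable read, phantom read), not the stricter behaviour a particular engine may implement; the `ALLOWED` and `prevents` names are illustrative.

```python
DIRTY, NON_REPEATABLE, PHANTOM = "dirty read", "non-repeatable read", "phantom read"

# Phenomena the SQL standard permits at each isolation level.
ALLOWED = {
    "Read Uncommitted": {DIRTY, NON_REPEATABLE, PHANTOM},
    "Read Committed": {NON_REPEATABLE, PHANTOM},
    "Repeatable Read": {PHANTOM},
    "Serializable": set(),
}

def prevents(level, phenomenon):
    """True if the standard forbids the phenomenon at the given level."""
    return phenomenon not in ALLOWED[level]

for level, allowed in ALLOWED.items():
    print(level, "->", sorted(allowed) or "none")
```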

Revision #20 | Updated 12/05/2018 8:29 p.m.
