Brytlyt is a GPU-accelerated DBMS built on top of Postgres.
Brytlyt is a GPU-accelerated database based on PostgreSQL 9.4. It uses a Massively Parallel Processing (MPP) architecture to scale out horizontally when handling large amounts of data. Brytlyt was first released in 2016.
Brytlyt uses GPU acceleration for filtering, sorting, aggregating, grouping, and joining tables. It stores data in CPU memory in vectorized columns (not in tuples/rows) to optimize parallel processing across all available GPUs. When an operation starts, the data is moved from CPU memory to GPU memory, and the result is returned to CPU memory when the operation completes. JOIN operations run directly on the GPU: Brytlyt breaks the data into blocks and then distributes the blocks across the GPU cores used for searching (horizontal partitioning).
Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change PostgreSQL's isolation-level logic. The SQL standard, which PostgreSQL follows, defines four levels of transaction isolation, of which serializable is the strictest.
## Challenge

The traditional approach to running joins on a CPU is not well suited to the hundreds of thousands of cores in a GPU system, because nested loops parallelize poorly. GPU cores are grouped into chunks, and every core in a chunk executes the same instructions, so a nested-loop join cannot fully exploit the GPU's parallelism.

## Approach

Brytlyt claims to address this challenge with a patent-pending method that recursively separates rows containing a hit from rows that do not. In detail, Brytlyt breaks the data into blocks and then distributes the blocks across the many cores used for searching. It starts with coarse-grained blocks and then drills down into progressively finer-grained blocks until the join operation is finished.

## Example

For example, on a 2000-core GPU, a dataset of 400,000 rows would be broken into 2000 blocks of 200 rows each. Each GPU core then runs its own search on its own block of data in parallel with all the other cores. A traditional single-CPU environment requires 2000 loops to cover all the blocks, whereas the GPU takes roughly the time of one loop. After the search stage, the empty blocks (no hit) are discarded, and the process is repeated on the remaining blocks (at least one hit). The process is repeated until only the relevant blocks remain. According to the experiment, 10 billion rows distributed over 100 GPUs achieve exactly the same cycle time as 1 billion rows on 10 GPUs.
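
The coarse-to-fine search described in the example can be sketched in plain Python. This is only an illustration under stated assumptions, not Brytlyt's patent-pending implementation: the function name `block_search`, the `num_cores` parameter, and the equality-match predicate are all hypothetical, and the per-block scans that a GPU would run in parallel are executed here in a sequential loop.

```python
def block_search(column, key, num_cores=4):
    """Return (matching row indices, number of rounds) using block pruning.

    Hypothetical sketch: each round splits every surviving block into up to
    `num_cores` smaller blocks, scans them, and discards blocks with no hit.
    """
    if num_cores < 2:
        raise ValueError("num_cores must be at least 2 so blocks shrink")

    blocks = [(0, len(column))]  # half-open index ranges [start, stop)
    rounds = 0
    while blocks:
        # Split stage: cut each surviving block into finer blocks.
        finer = []
        for start, stop in blocks:
            size = max(1, -(-(stop - start) // num_cores))  # ceiling division
            finer.extend((s, min(s + size, stop)) for s in range(start, stop, size))

        # Search stage: on a GPU, each block would be scanned by its own core.
        # Blocks without a hit are dropped before the next round.
        blocks = [(s, e) for s, e in finer if key in column[s:e]]
        rounds += 1

        # Stop once every surviving block is a single matching row.
        if all(e - s == 1 for s, e in blocks):
            break
    return [s for s, _ in blocks], rounds


hits, rounds = block_search([7, 3, 9, 3, 1, 8, 3, 2], key=3, num_cores=2)
print(hits, rounds)
```

With 400,000 rows and `num_cores=2000`, the first round of this sketch produces 2000 blocks of 200 rows, which matches the block sizes in the example above.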
Brytlyt is derived from PostgreSQL. According to a lecture given by Richard Heyns, the CEO of Brytlyt, the Brytlyt team did not change PostgreSQL's concurrency control logic. PostgreSQL's write-ahead log (WAL) supports point-in-time recovery: the database can be restored to any instant covered by the available WAL data.
In Brytlyt, not everything is executed on the GPU. The tuple-at-a-time model is better suited to CPU execution, whereas GPU acceleration relies on vectorized processing to increase parallelism. Refer to the join section for more information about the vectorization method used in Brytlyt.
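
The difference between the two processing models can be shown with a small filter written both ways. This is a minimal sketch, assuming a made-up two-column table; the names `filter_tuple_at_a_time` and `filter_vectorized` are hypothetical and do not correspond to Brytlyt or PostgreSQL internals.

```python
# The same illustrative table in two layouts.
rows = [(1, 10.0), (2, 25.0), (3, 40.0)]                  # one tuple per row
columns = {"id": [1, 2, 3], "price": [10.0, 25.0, 40.0]}  # one list per column


def filter_tuple_at_a_time(rows, threshold):
    """Iterator style: pull one tuple, test it, emit it, then move on."""
    for row in rows:
        if row[1] > threshold:
            yield row


def filter_vectorized(columns, threshold):
    """Column style: evaluate the predicate over the whole column at once,
    then apply the resulting mask to every column."""
    mask = [price > threshold for price in columns["price"]]
    return {
        name: [value for value, keep in zip(values, mask) if keep]
        for name, values in columns.items()
    }


print(list(filter_tuple_at_a_time(rows, 20.0)))
print(filter_vectorized(columns, 20.0))
```

In the vectorized version, the predicate over `price` is one uniform operation over many values, which is the shape of work that maps naturally onto groups of GPU cores executing the same instruction.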
Brytlyt has adopted the open-source database PostgreSQL as the platform within which to implement its intellectual property. With PostgreSQL comes the full suite of SQL and programmatic SQL functionality. SQL is the lingua franca of the data processing world.
Brytlyt is a disk-oriented database: tables and indexes are stored primarily on disk. It also manages the placement of data between CPU host memory and GPU memory to accelerate queries and avoid the extra overhead of data transfers. GPUs improve performance only when the data is already in main system memory, so it is much better to keep hot data in main memory than on disk. The performance of a GPU-accelerated database depends heavily on data transfer efficiency, and transfers can be significantly accelerated by keeping ‘semi-hot data’ in host memory and hot data in GPU RAM. However, when a relation does not fit entirely into GPU memory, Brytlyt still has to transfer data from CPU to GPU frequently while processing it.
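
The cost of a relation that does not fit in GPU memory can be illustrated by counting transfers in a chunked aggregation. This is a rough sketch under assumptions: `chunked_sum` and `gpu_capacity_rows` are hypothetical names, a list slice stands in for a host-to-GPU copy, and Python's built-in `sum` stands in for a GPU kernel.

```python
def chunked_sum(column, gpu_capacity_rows):
    """Sum a column one chunk at a time and count the simulated transfers."""
    if gpu_capacity_rows < 1:
        raise ValueError("gpu_capacity_rows must be positive")

    total = 0
    transfers = 0
    for start in range(0, len(column), gpu_capacity_rows):
        chunk = column[start:start + gpu_capacity_rows]  # simulated host-to-GPU copy
        total += sum(chunk)                              # simulated GPU kernel
        transfers += 1
    return total, transfers


# A 10-row column with room for only 4 rows on the device needs 3 transfers.
print(chunked_sum(list(range(10)), gpu_capacity_rows=4))
```

When the whole column fits in device memory, the same call needs a single transfer, which is why keeping hot data resident in GPU RAM pays off.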
Brytlyt is derived from PostgreSQL, and the Brytlyt team did not change the storage model: the DBMS stores all of the attributes of a single tuple contiguously, which is known as the N-ary storage model (NSM).
Brytlyt is a fork of PostgreSQL with the additional Brytlyt GPU Manager and Nvidia GPUs integrated into the database engine. It does not support multi-master shared storage, but it does support cold-standby failover for shared storage.
In Brytlyt, "CREATE VIEW" defines a view of a query. The view is not physically materialized. Instead, the query is run every time the view is referenced in a query. Also, the name of the view must be distinct from the name of any other view, table, sequence, index or foreign table in the same schema.
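
The non-materialized behaviour of a view can be demonstrated with a short script. As an assumption made purely for runnability, the sketch uses SQLite from the Python standard library as a stand-in rather than a Brytlyt or PostgreSQL server; the table `orders` and the view `big_orders` are made-up names. In this respect SQLite views behave the same way: the defining query is re-evaluated on every reference.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption, not Brytlyt).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 50.0), (2, 150.0)")

# The view stores only the query definition, not its result.
conn.execute(
    "CREATE VIEW big_orders AS SELECT id, amount FROM orders WHERE amount > 100"
)

before = conn.execute("SELECT id FROM big_orders ORDER BY id").fetchall()

# A row added after the view was created still shows up, because the
# view's query runs again each time the view is referenced.
conn.execute("INSERT INTO orders VALUES (3, 300.0)")
after = conn.execute("SELECT id FROM big_orders ORDER BY id").fetchall()

print(before, after)
conn.close()
```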
C, C++, Delphi, Java, Perl, Python, Tcl
Linux, OS X, Windows