BlazingSQL

Viewing Revision #15 from 2018-12-13 01:04 View Current

BlazingSQL is a distributed GPU-accelerated SQL engine with data lake integration, where a data lake is a very large collection of raw data stored in a flat architecture. It is ACID-compliant. BlazingSQL targets ETL workloads and aims to perform efficient read IO and OLAP querying. BlazingDB refers to the company and BlazingSQL refers to the product. The product is currently under active development, and the company has 15 employees. BlazingDB has offices in San Francisco and Peru.[03][04][05]

Website: https://blazingdb.com[01]
Tech Docs: https://docs.blazingdb.com[02]
Developer: BlazingDB
Country of Origin: PE
Start Year: 2015 [28]
Project Type: Commercial
Written in: C++
Supported Languages: SQL
Operating System: Linux
License: Proprietary

History[06][07][08][09]

BlazingSQL started as a GPU table joiner for multi-terabyte databases. The Aramburu brothers, Rodrigo and Felipe, founded a company in 2013 that provided analytical solutions and needed to speed up joins for pension fraud detection. The system is closed-source with a free community binary. It integrates with RAPIDS, an open-source GPU data science initiative that relies on NVIDIA GPUs.

Checkpoints

It is unclear if BlazingSQL supports checkpointing.

Compression[10][11][12][13][14]

Dictionary Encoding, Delta Encoding, Run-Length Encoding, Bit Packing / Mostly Encoding

Historically, BlazingSQL supported compression and decompression on the GPU with bit-packing, delta encoding, dictionary encoding, and run-length encoding. This is currently disabled alongside its custom Simpatico file format. As of November 2018, it operates directly on Apache Parquet, CSV, and ORC. BlazingSQL does not currently write data and instead reads it from the data lake. It is able to operate directly on compressed data.
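
As an illustration of three of the encodings named above, here is a minimal CPU-side sketch in plain Python. It is not BlazingSQL's GPU implementation or its Simpatico format; the function names are hypothetical and chosen only for this example.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


def delta_encode(values):
    """Keep the first value, then store each difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the column with a running sum over the stored deltas."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def dictionary_encode(values):
    """Replace each value with an integer code into a list of distinct values."""
    dictionary = []  # distinct values in first-seen order
    index = {}       # value -> position in `dictionary`
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes
```

Run-length encoding pays off on sorted or low-cardinality columns, delta encoding on slowly changing numeric columns such as timestamps, and dictionary encoding on repeated strings. The small integers these schemes produce are what bit packing would then store in fewer bits.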

Concurrency Control[04]

Not Supported

BlazingSQL does not write data. It reads directly from the data lake, loading it into GPU data frames that can be shared with other BlazingSQL worker nodes through interprocess communication. Worker nodes do not have to be on the same machine; they can run on different machines and use different GPUs. BlazingSQL handles concurrency for the generation of result sets. However, the user is responsible for ensuring that the data in the data lake is internally consistent and free of corruption when it is queried.

Data Model[15]

Relational

BlazingSQL is a relational database. It accepts multiple file formats (e.g. Apache Parquet) and provides a SQL interface for querying the data.

Foreign Keys

It is unclear if foreign keys are supported by BlazingSQL.

Hardware Acceleration[16][17]

GPU

BlazingSQL is hardware-accelerated with NVIDIA GPUs. Relevant columnar data is compressed, cached, and sent to the GPU. The GPUs are used to speed up transforms and predicate evaluation, including predicates that rely on metadata to skip data, and to perform accelerated joins.
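
One plausible reading of the metadata-related step is statistics-based skipping: columnar file formats such as Apache Parquet keep per-chunk minimum and maximum values, so a scan can rule out whole chunks without reading them. The sketch below shows the idea on the CPU in plain Python. The `Chunk` class and `scan_greater_than` function are hypothetical names for this example, and whether BlazingSQL works this way is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical column chunk with min/max statistics kept as metadata."""
    min_value: int
    max_value: int
    values: list


def scan_greater_than(chunks, threshold):
    """Return values > threshold, reading only chunks whose stats allow a match.

    Also returns how many chunks were actually read, to show the saving.
    """
    matches = []
    chunks_read = 0
    for chunk in chunks:
        if chunk.max_value <= threshold:
            continue  # metadata proves no row can match, so skip the chunk
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read
```

For example, with chunks covering the ranges 1 to 5, 4 to 9, and 10 to 12, a scan for values greater than 6 never touches the first chunk.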

Indexes

Not Supported

BlazingSQL does not appear to support indexes.

Isolation Levels[04]

BlazingSQL reads directly from immutable files.

Joins[18][19]

Hash Join

BlazingSQL supports transformations and hash joins (left, left-outer, full-outer) on all the column types supported by rapids.ai.
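
For readers unfamiliar with the technique, a hash join has a build phase (hash one input on the join key) and a probe phase (look up each row of the other input). The following is a minimal single-threaded sketch in plain Python, not BlazingSQL's GPU implementation. The `hash_join` function and its parameters are hypothetical, and only the inner and left-outer variants are shown.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key, how="inner"):
    """Join two lists of dict rows where left[left_key] == right[right_key].

    how: "inner" or "left" (left outer). Assumes the two inputs share no
    column names apart from, possibly, the join key.
    """
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")

    # Build phase: hash every right-side row by its join key.
    table = defaultdict(list)
    for row in right:
        table[row[right_key]].append(row)

    # Columns used to pad unmatched left rows in a left outer join.
    right_columns = list(right[0].keys()) if right else []

    # Probe phase: look up each left-side row in the hash table.
    result = []
    for row in left:
        matches = table.get(row[left_key], [])
        if matches:
            for match in matches:
                result.append({**row, **match})
        elif how == "left":
            padded = {column: None for column in right_columns}
            padded.update(row)  # left values win over the None padding
            result.append(padded)
    return result
```

Building on the smaller input keeps the hash table small. A full outer join would additionally emit the right-side rows that were never matched during the probe.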

Logging[04]

Not Supported

BlazingSQL does not write data.

Query Compilation

Not Supported

BlazingSQL does not appear to currently do query compilation.

Query Execution[20]

Vectorized Model

BlazingSQL operations are vectorized on the GPU (SIMD).
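
To illustrate the vectorized model in general terms, the sketch below evaluates a query one whole column at a time instead of one row at a time. It is plain Python with hypothetical function names. On a GPU each whole-column step would map to a data-parallel kernel, whereas here each step is an ordinary loop, so this shows only the structure of the execution, not its performance.

```python
def filter_mask(column, predicate):
    """Evaluate a predicate over an entire column, producing a boolean mask."""
    return [predicate(value) for value in column]


def apply_mask(column, mask):
    """Keep only the entries of a column whose mask bit is set."""
    return [value for value, keep in zip(column, mask) if keep]


def sum_revenue_where(price, quantity, min_quantity):
    """SELECT SUM(price * quantity) WHERE quantity >= min_quantity.

    Each step consumes and produces whole columns, so the per-row
    interpretation overhead of a tuple-at-a-time engine is avoided.
    """
    mask = filter_mask(quantity, lambda q: q >= min_quantity)
    kept_price = apply_mask(price, mask)
    kept_quantity = apply_mask(quantity, mask)
    revenue = [p * q for p, q in zip(kept_price, kept_quantity)]
    return sum(revenue)
```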

Query Interface[10]

SQL

BlazingSQL exposes a Python connector for executing SQL commands.

Storage Architecture[04][05]

In-Memory

BlazingSQL caches the data that it reads from the data lake. The cache is cascading, storing data in GPU memory, then CPU memory, and finally SSD/NVMe.
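
A cascading cache can be pictured as a chain of tiers in which an entry evicted from a faster tier falls into the next slower one. The sketch below is a minimal model of that idea in plain Python. The `CascadingCache` class, the least-recently-used eviction policy, and the tier names and capacities are all illustrative assumptions, not details confirmed for BlazingSQL.

```python
from collections import OrderedDict


class CascadingCache:
    """Hypothetical multi-tier LRU cache: evictions fall to the next slower tier."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity) pairs ordered fastest to slowest.
        self.names = [name for name, _ in tiers]
        self.capacities = [capacity for _, capacity in tiers]
        self.levels = [OrderedDict() for _ in tiers]

    def put(self, key, value):
        # Drop any older copy so a key lives in exactly one tier.
        for level in self.levels:
            level.pop(key, None)
        self._insert(0, key, value)

    def _insert(self, i, key, value):
        if i >= len(self.levels):
            return  # fell off the slowest tier; re-read from the source later
        level = self.levels[i]
        level[key] = value
        if len(level) > self.capacities[i]:
            # Evict the least recently used entry and push it one tier down.
            old_key, old_value = level.popitem(last=False)
            self._insert(i + 1, old_key, old_value)

    def get(self, key):
        """Return (value, tier_name) and promote the entry to the fastest tier."""
        for i, level in enumerate(self.levels):
            if key in level:
                value = level.pop(key)
                self._insert(0, key, value)
                return value, self.names[i]
        return None, None
```

A usage example: `CascadingCache([("gpu", 1), ("cpu", 1), ("ssd", 1)])` holds at most three entries, and a fourth `put` pushes the oldest entry out of the cache entirely.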

Storage Model[14][21]

Decomposition Storage Model (Columnar)

BlazingSQL does not write data. It reads compressed data directly from the data lake and transmits relevant columns to the GPU. On the GPU, data is represented as a GPU DataFrame (GDF). GDFs are built on top of Apache Arrow, which is a columnar in-memory format. Ultimately, it relies on external storage.

Storage Organization[22]

BlazingSQL does not write data.

Stored Procedures[18]

Not Supported

BlazingSQL does not support stored procedures.

System Architecture[23][24][25][26]

Shared-Nothing

BlazingSQL can utilize multiple GPUs distributed across different servers. BlazingSQL worker nodes push information to each other whenever required.

Views[27]

Virtual Views, Materialized Views

BlazingSQL supports both virtual and materialized views. Materialized views are currently not persistent.

Citations

28 sources

[01] https://blazingdb.com (dead link). Accessed: 2026-06-03
[02] https://docs.blazingdb.com (dead link). Accessed: 2026-06-05
[03] Technology Wallpaper, livewallpapers.com. Accessed: 2026-06-07
[04] In all honesty we get very few questions about ACID compliance from users and cu... | Hacker News, ycombinator.com (dead link). Accessed: 2026-06-07
[05] https://blog.blazingdb.com/announcing-blazingsql-a-gpu-sql-engine-for-rapids-open-source-software-from-nvidia-11e115ba7dd7 (dead link). Accessed: 2026-05-23
[06] https://blog.blazingdb.com/blazingdb-origins-oh-and-we-just-raised-2-9m-from-nvidia-and-samsung-99cd581e66c7 (dead link). Accessed: 2026-05-23
[07] https://blog.blazingdb.com/tcdrisupt-the-database-dabf044178ce (dead link). Accessed: 2026-05-23
[08] https://www.linkedin.com/in/roaramburu/. Accessed: 2026-05-23
[09] https://www.linkedin.com/in/felipe-aramburu-707a5b48/. Accessed: 2026-05-23
[10] https://blazingdb.atlassian.net/wiki/spaces/BlazPub/overview (dead link). Accessed: 2026-05-23
[11] https://news.ycombinator.com/item?id=15840900 (dead link). Accessed: 2026-05-23
[12] SolidWorks 2013 Solution Overview, nvidia.com. Modified: 2017-08-30. Accessed: 2026-06-07
[13] https://news.ycombinator.com/item?id=15820091 (dead link). Accessed: 2026-05-23
[14] https://news.ycombinator.com/item?id=12485967 (dead link). Accessed: 2026-05-23
[15] https://blog.blazingdb.com/blazingdb-2-0-gpu-fast-sql-on-apache-parquet-f2e8eff1f77a (dead link). Accessed: 2026-05-23
[16] https://news.ycombinator.com/item?id=18201604 (dead link). Accessed: 2026-05-23
[17] https://news.ycombinator.com/item?id=13992328 (dead link). Accessed: 2026-05-23
[18] https://docs.blazingdb.com/docs/blazingdb-sql-guide (dead link). Accessed: 2026-05-23
[19] https://news.ycombinator.com/item?id=12486060 (dead link). Accessed: 2026-05-23
[20] https://news.ycombinator.com/item?id=12488062 (dead link). Accessed: 2026-05-23
[21] GitHub - rapidsai/cudf: cuDF - GPU DataFrame Library · GitHub, github.com. Accessed: 2026-05-23
[22] https://news.ycombinator.com/item?id=13990901 (dead link). Accessed: 2026-05-23
[23] https://docs.blazingdb.com/discuss/57e2544bcda3750e0054a7e8 (dead link). Accessed: 2026-05-25
[24] https://youtu.be/tUIrR_mj9fQ?t=2194. Accessed: 2026-05-23
[25] https://news.ycombinator.com/item?id=18201881 (dead link). Accessed: 2026-05-23
[26] https://news.ycombinator.com/item?id=18201738 (dead link). Accessed: 2026-05-23
[27] https://docs.blazingdb.com/docs/database-administration. Accessed: 2026-05-23
[28] https://www.crunchbase.com/organization/blazing-db. Accessed: 2026-05-20

Revision #15 Last Updated: 2018-12-12 20:04