BlazingSQL is a distributed, GPU-accelerated SQL engine with data-lake integration. It is ACID-compliant. BlazingSQL targets ETL workloads and aims to provide efficient read I/O and OLAP querying. BlazingDB is the name of the company, and BlazingSQL is the name of its product. The product is under active development, and the company has 15 employees, with offices in San Francisco and Peru.
BlazingSQL started as a GPU table joiner for multi-terabyte databases. In 2013 the Aramburu brothers, Rodrigo and Felipe, founded a company that provided analytical solutions and needed faster joins for pension fraud detection. The system is closed-source, with a free community binary. It integrates with RAPIDS, an open-source GPU data science initiative that relies on NVIDIA GPUs.
Decomposition Storage Model (Columnar)
BlazingSQL does not write data. It reads compressed data directly from the data lake and transmits relevant columns to the GPU. On the GPU, data is represented as a GPU DataFrame (GDF). GDFs are built on top of Apache Arrow, which is a columnar in-memory format.
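The idea of a decomposed (columnar) layout, where only the relevant columns are handed on for processing, can be sketched in plain Python. This is an illustration only: it does not use BlazingSQL, Apache Arrow, or a GPU, and the names `rows`, `to_columns`, and `project` are hypothetical helpers invented for this example.

```python
# Illustrative sketch of a columnar layout; NOT BlazingSQL or Arrow code.
# All names below (rows, to_columns, project) are hypothetical.

rows = [
    {"id": 1, "region": "north", "amount": 10.0},
    {"id": 2, "region": "south", "amount": 25.5},
    {"id": 3, "region": "north", "amount": 4.5},
]


def to_columns(records):
    """Decompose row records into one list of values per attribute."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def project(columns, names):
    """Select only the requested columns; the others are never touched."""
    return {name: columns[name] for name in names}


table = to_columns(rows)

# A query that sums amount per region needs two of the three columns,
# so only those two would have to be transferred for processing.
needed = project(table, ["region", "amount"])

totals = {}
for region, amount in zip(needed["region"], needed["amount"]):
    totals[region] = totals.get(region, 0.0) + amount
```

Because each column is stored contiguously, a query that touches two attributes can skip the `id` column entirely, which is the property that makes columnar formats attractive for read-heavy OLAP workloads.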
Virtual Views, Materialized Views
BlazingSQL supports both virtual and materialized views. Materialized views are currently not persistent.
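The difference between the two kinds of view can be sketched with Python's built-in `sqlite3` module. This is not BlazingSQL syntax. SQLite has no materialized views, so the example emulates a non-persistent one with a temporary table, which is an assumption made purely for illustration. The table and view names (`sales`, `v_totals`, `m_totals`) are hypothetical.

```python
import sqlite3

# Illustration with SQLite, NOT BlazingSQL. The in-memory database and
# its temporary table disappear when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 25.5)],
)

query = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# Virtual view: only the query is stored, and it is re-evaluated
# every time the view is read.
conn.execute("CREATE VIEW v_totals AS " + query)

# Emulated non-persistent materialized view: the query result is
# computed once and stored as a snapshot in a temporary table.
conn.execute("CREATE TEMP TABLE m_totals AS " + query)

# Change the base data after both views exist.
conn.execute("INSERT INTO sales VALUES ('north', 4.5)")

virtual = dict(conn.execute("SELECT region, total FROM v_totals"))
materialized = dict(conn.execute("SELECT region, total FROM m_totals"))
```

After the extra insert, `virtual` reflects the new row while `materialized` still holds the earlier snapshot. Reading a stored result avoids recomputation, at the cost of staleness, and a non-persistent result has to be rebuilt in each new session.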
BlazingSQL does not write data. It reads directly from the data lake and loads the data into GPU DataFrames, which can be shared between processes through interprocess communication. BlazingSQL handles concurrency only for the generation of result sets. The user is responsible for ensuring that the underlying data is in a consistent state when it is queried.