DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

BlazingSQL


BlazingSQL is a distributed GPU-accelerated SQL engine with data-lake integration. It is ACID-compliant. BlazingSQL targets ETL workloads and aims to perform efficient read IO and OLAP querying. BlazingDB refers to the company and BlazingSQL refers to the product. It is currently under active development with 15 employees. BlazingDB has offices in San Francisco and Peru.[03][04][05]

Developer: BlazingDB
Country of Origin: PE
Start Year: 2015 [27]
Project Type: Commercial
Written in: C++
Supported Languages: SQL
Operating System: Linux
License: Proprietary

History[06][07][08][09]


BlazingSQL started as a GPU table joiner for multi-terabyte databases. The Aramburu brothers, Rodrigo and Felipe, founded a company in 2013 that provided analytical solutions and needed to speed up joins for pension fraud detection. The system is closed-source with a free community binary. It integrates with RAPIDS, the open-source GPU data science initiative, which relies on NVIDIA GPUs.

Checkpoints


It is unclear if BlazingSQL supports checkpointing.

Compression[10][11][12][13][14]


Historically, BlazingSQL supported compression and decompression on the GPU with bit-packing, delta encoding, dictionary encoding, and run-length encoding. This is currently disabled alongside its custom Simpatico file format. As of November 2018, it operates directly on Apache Parquet, CSV, and ORC. BlazingSQL does not currently write data and instead reads it from the data lake. It is able to operate directly on compressed data.
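
The four schemes named above are standard lightweight column encodings. The following is a minimal CPU-side sketch in Python of what each one does. It is an illustration only: it is not BlazingSQL's GPU implementation, and all function names are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encoding: collapse each run into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    return [value for value, count in pairs for _ in range(count)]


def delta_encode(values):
    """Delta encoding: keep the first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out


def dict_encode(values):
    """Dictionary encoding: replace each value with a small integer code."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in values]


def dict_decode(dictionary, codes):
    return [dictionary[code] for code in codes]


def bit_pack(codes, width):
    """Bit-packing: store non-negative codes in `width` bits each."""
    packed = 0
    for position, code in enumerate(codes):
        packed |= code << (position * width)
    return packed


def bit_unpack(packed, width, count):
    mask = (1 << width) - 1
    return [(packed >> (position * width)) & mask for position in range(count)]
```

These encodings compose: a low-cardinality string column can be dictionary-encoded first, and the resulting integer codes can then be bit-packed or run-length encoded.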

Concurrency Control[04]


BlazingSQL does not write data. It reads directly from the data lake, loading it into GPU data frames that can be shared with interprocess communication. BlazingSQL handles concurrency for the generation of result sets. The user is responsible for ensuring that the data is in a good state when it is queried.

Data Model[15]


BlazingSQL is a relational database. It accepts multiple file formats (e.g. Apache Parquet) and provides a SQL interface for querying the data.

Foreign Keys


It is unclear if foreign keys are supported by BlazingSQL.

Hardware Acceleration[16][17]


GPU

BlazingSQL is hardware-accelerated with NVIDIA GPUs. Relevant columnar data is compressed, cached, and sent to the GPU. The GPUs speed up transformations, predicate evaluation (including running predicates while skipping metadata), and joins.

Indexes


BlazingSQL does not appear to support indexes.

Isolation Levels[04]


BlazingSQL reads directly from immutable files.

Joins[18][19]


BlazingSQL supports transformations and hash joins (left, left-outer, full-outer) on all the column types supported by rapids.ai.
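
A hash join builds a hash table on one input and probes it with the other. The Python sketch below shows the build and probe phases for left-outer and full-outer joins on the CPU. It is an illustration of the general technique under simplifying assumptions (rows are `(key, payload)` tuples, and the function name `hash_join` is hypothetical); it does not reflect BlazingSQL's GPU implementation.

```python
from collections import defaultdict


def hash_join(left, right, how="left-outer"):
    """Join two lists of (key, payload) rows on key.

    Returns (key, left_payload, right_payload) tuples; the missing side
    of an unmatched row is None.
    """
    if how not in ("left-outer", "full-outer"):
        raise ValueError(f"unsupported join type: {how}")

    # Build phase: hash the right input by key.
    build = defaultdict(list)
    for key, payload in right:
        build[key].append(payload)

    # Probe phase: look up every left row in the hash table.
    matched_keys = set()
    out = []
    for key, payload in left:
        if key in build:
            matched_keys.add(key)
            out.extend((key, payload, match) for match in build[key])
        else:
            out.append((key, payload, None))

    # A full-outer join also emits right rows that no left row matched.
    if how == "full-outer":
        for key, payloads in build.items():
            if key not in matched_keys:
                out.extend((key, None, payload) for payload in payloads)
    return out
```

Building on the smaller input keeps the hash table small, which is the usual design choice when one side of the join fits in memory and the other does not.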

Logging[04]


When importing data, BlazingSQL always writes it to disk, compresses it and has it in a query-ready state.

Query Compilation


BlazingSQL does not appear to currently do query compilation.

Query Execution[20]


BlazingSQL operations are vectorized on the GPU (SIMD).

Query Interface[10]


SQL

BlazingSQL exposes a Python connector for executing SQL commands.

Storage Architecture[04]


BlazingSQL loads data to disk, but ultimately operates on the data on the GPU.

Storage Model[14][21]


BlazingSQL does not write data. It reads compressed data directly from the data lake and transmits relevant columns to the GPU. On the GPU, data is represented as a GPU DataFrame (GDF). GDFs are built on top of Apache Arrow, which is a columnar in-memory format.
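
The benefit of a columnar layout is that a query only has to move the columns it references. The sketch below illustrates that idea in plain Python. The helper names `to_columns` and `project` are hypothetical, and plain lists stand in for Arrow buffers; no GPU or Arrow API is used.

```python
def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    if not rows:
        return {}
    return {name: [row[name] for row in rows] for name in rows[0]}


def project(columns, needed):
    """Keep only the columns a query references."""
    return {name: columns[name] for name in needed}


rows = [
    {"id": 1, "city": "Lima", "fare": 10.0},
    {"id": 2, "city": "SF", "fare": 20.0},
    {"id": 3, "city": "Lima", "fare": 5.0},
]
columns = to_columns(rows)

# A query such as SELECT SUM(fare) only needs the "fare" column,
# so only that column would have to be transferred.
to_transfer = project(columns, ["fare"])
total = sum(to_transfer["fare"])
```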

Storage Organization[22]


BlazingSQL appears to be log-structured.

Stored Procedures[18]


As of BlazingSQL 1.3, stored procedures do not appear to be supported.

System Architecture[23][24][25]


BlazingSQL worker nodes push information to each other whenever required. There is a notion of a distributed cache, and nodes can ask each other for cached data-lake data.
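
The peer-cache idea can be sketched as follows: a worker first checks its own cache, then asks its peers, and only reads from the data lake as a last resort. The source does not describe the actual protocol, so the class `Worker`, its methods, and the lookup order are all assumptions made for illustration.

```python
class Worker:
    """Hypothetical worker node with a local cache and a list of peers."""

    def __init__(self, name, load_from_lake):
        self.name = name
        self.cache = {}
        self.peers = []
        self._load_from_lake = load_from_lake

    def get(self, key):
        """Return (data, source), where source says where the data came from."""
        if key in self.cache:
            return self.cache[key], "local"
        # Ask peers for cached data before touching the data lake.
        for peer in self.peers:
            if key in peer.cache:
                self.cache[key] = peer.cache[key]
                return self.cache[key], f"peer:{peer.name}"
        self.cache[key] = self._load_from_lake(key)
        return self.cache[key], "lake"


lake_reads = []


def read_lake(key):
    """Stand-in for an expensive data-lake read."""
    lake_reads.append(key)
    return f"contents of {key}"


a = Worker("a", read_lake)
b = Worker("b", read_lake)
a.peers, b.peers = [b], [a]

first = a.get("part-0.parquet")   # read from the data lake
second = b.get("part-0.parquet")  # served from worker a's cache
third = b.get("part-0.parquet")   # served from worker b's own cache
```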

Views[26]


BlazingSQL supports both virtual and materialized views. Materialized views are currently not persistent.
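
The difference between the two kinds of view can be illustrated with Python's built-in `sqlite3` module. SQLite stands in for BlazingSQL here and the SQL is SQLite's, not BlazingSQL's. SQLite has no materialized views, so a temporary table created from the query is used as an assumed equivalent of a non-persistent materialized view. The table and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Lima", 10.0), ("Lima", 5.0), ("SF", 20.0)],
)

query = "SELECT city, SUM(fare) AS total FROM trips GROUP BY city"

# Virtual view: only the query is stored; it is re-run on every read.
conn.execute(f"CREATE VIEW fares_virtual AS {query}")

# Materialized stand-in: the result is computed once and stored.
# A TEMP table disappears when the connection closes, so it is not persistent.
conn.execute(f"CREATE TEMP TABLE fares_materialized AS {query}")

# New data arrives after both views were created.
conn.execute("INSERT INTO trips VALUES ('Lima', 1.0)")

virtual = dict(conn.execute("SELECT city, total FROM fares_virtual"))
materialized = dict(conn.execute("SELECT city, total FROM fares_materialized"))
conn.close()
```

After the extra insert, the virtual view reflects the new row while the materialized result still holds the totals from when it was computed.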

Citations

27 sources
  1. https://blazingdb.com
  2. https://docs.blazingdb.com
  3. Technology Wallpaper (livewallpapers.com)
  4. In all honesty we get very few questions about ACID compliance from users and cu... | Hacker News (ycombinator.com)
  5. https://blog.blazingdb.com/announcing-blazingsql-a-gpu-sql-engine-for-rapids-open-source-software-from-nvidia-11e115ba7dd7
  6. https://blog.blazingdb.com/blazingdb-origins-oh-and-we-just-raised-2-9m-from-nvidia-and-samsung-99cd581e66c7
  7. https://blog.blazingdb.com/tcdrisupt-the-database-dabf044178ce
  8. https://www.linkedin.com/in/roaramburu/
  9. https://www.linkedin.com/in/felipe-aramburu-707a5b48/
  10. https://blazingdb.atlassian.net/wiki/spaces/BlazPub/overview
  11. https://news.ycombinator.com/item?id=15840900
  12. SolidWorks 2013 Solution Overview (nvidia.com)
  13. https://news.ycombinator.com/item?id=15820091
  14. https://news.ycombinator.com/item?id=12485967
  15. https://blog.blazingdb.com/blazingdb-2-0-gpu-fast-sql-on-apache-parquet-f2e8eff1f77a
  16. https://news.ycombinator.com/item?id=18201604
  17. https://news.ycombinator.com/item?id=13992328
  18. https://docs.blazingdb.com/docs/blazingdb-sql-guide
  19. https://news.ycombinator.com/item?id=12486060
  20. https://news.ycombinator.com/item?id=12488062
  21. GitHub - rapidsai/cudf: cuDF - GPU DataFrame Library (github.com)
  22. https://news.ycombinator.com/item?id=13990901
  23. https://youtu.be/tUIrR_mj9fQ?t=2194
  24. https://news.ycombinator.com/item?id=18201881
  25. https://news.ycombinator.com/item?id=18201738
  26. https://docs.blazingdb.com/docs/database-administration
  27. https://www.crunchbase.com/organization/blazing-db
Revision #9 Last Updated: