BlazingSQL is a distributed, GPU-accelerated SQL engine with data-lake integration. It is ACID-compliant, targets ETL workloads, and aims to provide efficient read I/O and OLAP querying. BlazingDB is the name of the company and BlazingSQL is the name of the product. The product is under active development; the company has 15 employees and offices in San Francisco and Peru.[03][04][05]
- Website: https://blazingdb.com[01]
- Tech Docs: https://docs.blazingdb.com[02]
- Developer
- Country of Origin: PE
- Start Year: 2015[27]
- Project Type: Commercial
- Written in: C++
- Supported Languages: SQL
- Operating System: Linux
- License: Proprietary
History[06][07][08][09]
BlazingSQL started as a GPU table joiner for multi-terabyte databases. The Aramburu brothers, Rodrigo and Felipe, founded a company in 2013 that provided analytical solutions and needed to speed up joins for pension fraud detection. The system is closed-source, with a free community binary. It integrates with RAPIDS, the open-source GPU data science initiative, which relies on NVIDIA GPUs.
Checkpoints
It is unclear if BlazingSQL supports checkpointing.
Compression[10][11][12][13][14]
BlazingSQL supports compressing and decompressing directly on the GPU. It accepts a variety of input formats such as Apache Parquet, BlazingDB Simpatico (GPU-compressed distributed files), and GDF (GPU dataframes built on Apache Arrow). Data is then sent to the GPU compressed. It is able to operate directly on compressed data.
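To illustrate what operating directly on compressed data can mean, here is a minimal CPU-side sketch using run-length encoding. The choice of run-length encoding and the names `rle_encode` and `count_matching` are assumptions made for illustration; the sources do not state which compression schemes BlazingSQL uses or how its GPU kernels are written.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def count_matching(runs, predicate):
    """Count matching rows by testing each run once instead of each row."""
    return sum(length for value, length in runs if predicate(value))


# Hypothetical column of country codes.
column = ["PE", "PE", "PE", "US", "US", "PE", "CL", "CL", "CL", "CL"]
runs = rle_encode(column)

# The predicate is evaluated 4 times (once per run), not 10 times (once per row),
# and the column is never expanded back to its uncompressed form.
matches = count_matching(runs, lambda value: value == "PE")
```

The same idea carries over to other encodings, such as dictionary encoding, where a predicate can be evaluated against the dictionary rather than against every row.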
Concurrency Control[04]
BlazingSQL supports snapshot isolation, which is most likely achieved with MVCC.
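The following is a generic sketch of how multi-version concurrency control can give readers a stable snapshot. It is not BlazingSQL's implementation, which is not documented in the sources; the class `VersionedStore` and its methods are hypothetical. It also covers only the read side: full snapshot isolation additionally needs write-write conflict detection, which is omitted here.

```python
class VersionedStore:
    """Toy multi-version store: each write appends a (commit_ts, value) version."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def write(self, key, value):
        """Commit a new version of key and return its commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return a read timestamp; versions committed later are invisible to it."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
reader_ts = store.snapshot()          # a reader starts here
store.write("balance", 250)           # a concurrent writer commits afterwards

old_view = store.read("balance", reader_ts)          # still sees the old version
new_view = store.read("balance", store.snapshot())   # a fresh snapshot sees the update
```

Because old versions are kept rather than overwritten, the reader never blocks the writer and never observes a value that changed mid-query.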
Data Model[15]
BlazingSQL is a relational database. It accepts multiple input formats (e.g. Apache Parquet) and provides a SQL interface for querying the data.
Foreign Keys
It is unclear if foreign keys are supported by BlazingSQL.
Hardware Acceleration[16][17]
BlazingSQL is hardware-accelerated with NVIDIA GPUs. Relevant columnar data is compressed, cached and sent to the GPU. The GPUs are used to speed up transforms, predicates, running predicates while skipping metadata, and to perform accelerated joins.
Isolation Levels[04]
BlazingSQL supports snapshot isolation; it is unclear if other isolation levels are supported.
Joins[18][19]
BlazingSQL supports hash joins, e.g. on strings. It is not clear what other join types are supported.
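For readers unfamiliar with the technique, a hash join on string keys can be sketched on the CPU as follows. This is the textbook build-and-probe algorithm, not BlazingSQL's GPU implementation; the function `hash_join` and the sample tables are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner equi-join: build a hash table on one input, probe it with the other."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    joined = []
    for row in probe_rows:
        # .get avoids inserting empty buckets for keys that have no match.
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


customers = [
    {"cust": "alice", "country": "PE"},
    {"cust": "bob", "country": "US"},
]
orders = [
    {"buyer": "alice", "total": 30},
    {"buyer": "carol", "total": 12},
    {"buyer": "alice", "total": 5},
]

# Join on the string columns customers.cust = orders.buyer.
result = hash_join(customers, orders, "cust", "buyer")
```

The smaller input is normally chosen as the build side so that the hash table fits in memory, which on a GPU means device memory.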
Logging[04]
When importing data, BlazingSQL always writes it to disk, compresses it, and leaves it in a query-ready state.
Storage Model[14][21]
BlazingSQL is a column-store. To execute a query, it compresses and transmits relevant columns to the GPU. On the GPU, data is represented as a GPU DataFrame (GDF). GDFs are built on top of Apache Arrow, which is a columnar in-memory format.
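A small sketch shows why a columnar layout lets the engine move only the relevant columns. It uses plain Python lists in place of Apache Arrow buffers or GPU DataFrames, and the table, `to_columns`, and `total_amount_for` are hypothetical names chosen for illustration.

```python
rows = [
    {"id": 1, "city": "Lima", "amount": 40.0},
    {"id": 2, "city": "San Francisco", "amount": 15.5},
    {"id": 3, "city": "Lima", "amount": 4.5},
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    return {name: [record[name] for record in records] for name in records[0]}


def total_amount_for(columns, city):
    """Sum amounts for one city, reading only the two columns the query needs."""
    return sum(
        amount
        for row_city, amount in zip(columns["city"], columns["amount"])
        if row_city == city
    )


columns = to_columns(rows)

# Only "city" and "amount" are read; "id" would never need to be sent to the GPU.
lima_total = total_amount_for(columns, "Lima")
```

In a row store the same query would have to read every field of every record, including the unused `id` values.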
System Architecture[23][24][25]
BlazingSQL worker nodes push information to each other whenever required. There is a notion of a distributed cache, and nodes can ask each other for cached data-lake data.
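The peer-lookup side of such a distributed cache might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Worker` class, the lookup order (local cache, then peers, then the data lake), and the in-process dictionaries standing in for network calls and remote storage. The sources do not describe BlazingSQL's actual protocol.

```python
class Worker:
    """Toy worker node with a local cache and a list of peers it can ask."""

    def __init__(self, name, data_lake):
        self.name = name
        self.data_lake = data_lake  # stand-in for remote storage: path -> bytes
        self.cache = {}
        self.peers = []
        self.lake_reads = 0         # counts the expensive data-lake reads

    def fetch(self, path):
        """Return the data for path, preferring local cache, then peers, then the lake."""
        if path in self.cache:
            return self.cache[path]
        for peer in self.peers:
            if path in peer.cache:
                self.cache[path] = peer.cache[path]
                return self.cache[path]
        self.lake_reads += 1
        self.cache[path] = self.data_lake[path]
        return self.cache[path]


lake = {"sales/part-0.parquet": b"column chunk bytes"}
worker_a, worker_b = Worker("a", lake), Worker("b", lake)
worker_a.peers, worker_b.peers = [worker_b], [worker_a]

first = worker_a.fetch("sales/part-0.parquet")   # miss everywhere: read from the lake
second = worker_b.fetch("sales/part-0.parquet")  # served from worker_a's cache
```

Here the second worker never touches the data lake, which is the benefit the description above attributes to nodes asking each other for cached data.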
Views[26]
BlazingSQL 1.3 supported the CREATE VIEW command. It is unclear if the views are virtual or materialized.
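The difference between the two kinds of view can be demonstrated with SQLite, used here only as a convenient stand-in because it ships with Python; this says nothing about which behavior BlazingSQL implements. SQLite has no materialized views, so a `CREATE TABLE ... AS SELECT` snapshot plays that role, and the table and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("Lima", 40.0), ("Lima", 4.5)])

# Virtual view: only the query text is stored; it is re-evaluated on every read.
conn.execute(
    "CREATE VIEW city_totals AS "
    "SELECT city, SUM(amount) AS total FROM sales GROUP BY city"
)

# Materialized stand-in: the result is computed once and stored as a table.
conn.execute(
    "CREATE TABLE city_totals_mat AS "
    "SELECT city, SUM(amount) AS total FROM sales GROUP BY city"
)

# A later insert changes what the virtual view returns but not the stored copy.
conn.execute("INSERT INTO sales VALUES ('Lima', 10.0)")

virtual_total = conn.execute("SELECT total FROM city_totals").fetchone()[0]
materialized_total = conn.execute("SELECT total FROM city_totals_mat").fetchone()[0]
```

After the insert the virtual view reflects the new row while the stored copy keeps its creation-time result, so running a test like this against a system is one way to tell which kind of view it provides.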
Citations
[01] https://blazingdb.com
[02] https://docs.blazingdb.com
[03] Technology Wallpaper, livewallpapers.com
[04] In all honesty we get very few questions about ACID compliance from users and cu... | Hacker News, ycombinator.com
[05] https://blog.blazingdb.com/announcing-blazingsql-a-gpu-sql-engine-for-rapids-open-source-software-from-nvidia-11e115ba7dd7
[06] https://blog.blazingdb.com/blazingdb-origins-oh-and-we-just-raised-2-9m-from-nvidia-and-samsung-99cd581e66c7
[07] https://blog.blazingdb.com/tcdrisupt-the-database-dabf044178ce
[08] https://www.linkedin.com/in/roaramburu/
[09] https://www.linkedin.com/in/felipe-aramburu-707a5b48/
[10] https://blazingdb.atlassian.net/wiki/spaces/BlazPub/overview
[11] https://news.ycombinator.com/item?id=15840900
[12] SolidWorks 2013 Solution Overview, nvidia.com
[13] https://news.ycombinator.com/item?id=15820091
[14] https://news.ycombinator.com/item?id=12485967
[15] https://blog.blazingdb.com/blazingdb-2-0-gpu-fast-sql-on-apache-parquet-f2e8eff1f77a
[16] https://news.ycombinator.com/item?id=18201604
[17] https://news.ycombinator.com/item?id=13992328
[18] https://docs.blazingdb.com/docs/blazingdb-sql-guide
[19] https://news.ycombinator.com/item?id=12486060
[20] https://news.ycombinator.com/item?id=12488062
[21] GitHub - rapidsai/cudf: cuDF - GPU DataFrame Library, github.com
[22] https://news.ycombinator.com/item?id=13990901
[23] https://youtu.be/tUIrR_mj9fQ?t=2194
[24] https://news.ycombinator.com/item?id=18201881
[25] https://news.ycombinator.com/item?id=18201738
[26] https://docs.blazingdb.com/docs/database-administration
[27] https://www.crunchbase.com/organization/blazing-db