HStreamDB

Viewing Revision #4 from 2023-05-07 02:04

HStreamDB is an open source distributed streaming database designed for accessing, storing, and processing real-time streaming data from sources such as IoT devices. Every record added to the database is appended to an immutable object called a stream, and a database can hold multiple streams at once. HStreamDB seeks to provide low-latency access to analyses of the most current data in streams, which it achieves by incrementally updating in-memory materialized views in real time as streaming data is ingested. Multiple client consumers can also read from a stream through stream subscriptions, which deliver data to the client once it is ingested into the DBMS. HStreamDB supports SQL queries with extensions for streams, and it was built from scratch in Haskell.

Website: https://flowmq.io/[01]
Source Code: https://github.com/hstreamdb/hstream[02] Accessed: Jul 27, 2026 Last Commit: Dec 25, 2024
Tech Docs: https://docs.hstream.io[03]
Developer: EMQ Technologies Co.
Country of Origin: CN
Start Year: 2020 [14]
Project Types: Commercial, Open Source
Written in: Haskell
Supported Languages: Go, Java, Python
Embeds / Uses: RocksDB
Operating System: Linux
License: BSD License
Twitter: @HStreamDB[04]

Streaming

History

HStreamDB is built by EMQ, a company providing open source IoT data infrastructure. It was first open sourced in 2021 and is under active development by the Haskell Team from EMQ.

HStreamDB was developed to incorporate a data-driven model to efficiently process stream data in a database. In contrast to the command-driven model of most databases, which analyze data only when a client sends a request, HStreamDB’s goal was to analyze data in real time as it is ingested and deliver the analyses with low latency.

Compression[05]

Naïve (Record-Level)

Compression is used to reduce network bandwidth utilization when transferring data to and from the database. Compression and decompression are performed entirely by the client, and the compressed data is stored natively in the database. HStreamDB supports both the gzip and zstd compression algorithms.
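
As a rough illustration of client-side compression, the Python sketch below gzips a serialized record before it would be sent and restores it on the consuming side. The function names and the sample record are hypothetical and are not part of any HStreamDB client; only the standard-library gzip module is used, and zstd is left out because it requires a third-party package.

```python
import gzip
import json


def compress_payload(record: dict) -> bytes:
    """Serialize a record to JSON and gzip it on the producer side (hypothetical helper)."""
    raw = json.dumps(record, sort_keys=True).encode("utf-8")
    return gzip.compress(raw)


def decompress_payload(blob: bytes) -> dict:
    """Reverse compress_payload on the consumer side (hypothetical helper)."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Illustrative round trip with a made-up sensor reading.
reading = {"device": "sensor-1", "temperature": 21.5}
blob = compress_payload(reading)
restored = decompress_payload(blob)
```

In this sketch the server side never needs to understand the payload: it would store `blob` unchanged, and only clients pay the cost of compressing and decompressing.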

Concurrency Control[05]

Not Supported

Since records can only be appended to streams, and records can be written out of order, concurrency control is not necessary for HStreamDB.

Data Model[05][06]

Relational

HStreamDB models data as records which are written to streams. All records have a unique identifier, and the data in a record can either be an HRecord or a Raw Record. An HRecord can be thought of as a traditional tuple in a database with support for nested maps and arrays. HRecords can be queried using SQL. A Raw Record contains arbitrary binary data which the database does not interpret or query. Raw Records are intended to be consumed from subscriptions.
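
The distinction between the two record kinds can be sketched with a few Python dataclasses. This is only a model of the description above, not HStreamDB's actual types: the class names mirror the terms used here, while `record_id`, `fields`, `payload`, and `is_queryable` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass(frozen=True)
class HRecord:
    """Structured data: a tuple-like mapping that may nest maps and arrays."""
    fields: Dict[str, Any]


@dataclass(frozen=True)
class RawRecord:
    """Opaque binary data that the database would not interpret."""
    payload: bytes


@dataclass(frozen=True)
class Record:
    """A record with a unique identifier and one of the two body kinds."""
    record_id: int
    body: Union[HRecord, RawRecord]


def is_queryable(record: Record) -> bool:
    """Only structured bodies are candidates for SQL queries in this model."""
    return isinstance(record.body, HRecord)


structured = Record(1, HRecord({"device": "sensor-1",
                                "tags": ["indoor", "lab"],
                                "location": {"lat": 1.0, "lon": 2.0}}))
opaque = Record(2, RawRecord(b"\x00\x01\x02"))
```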

Foreign Keys

Not Supported

Indexes

Not Supported

Isolation Levels

Not Supported

Joins[07]

Nested Loop Join

HStreamDB supports nested loop joins between two streams or between two materialized views. Joins between a stream and a materialized view are also supported.
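
For readers unfamiliar with the technique, a nested loop join compares every row of one input with every row of the other. The Python sketch below shows the generic algorithm over two finite in-memory inputs; it is not HStreamDB's implementation, it ignores how unbounded streams are bounded before joining, and the sample rows and field names are made up.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


def nested_loop_join(outer: Iterable[Row],
                     inner: Iterable[Row],
                     predicate: Callable[[Row, Row], bool]) -> Iterator[Row]:
    """Yield the merged row for every (outer, inner) pair that satisfies predicate."""
    inner_rows = list(inner)  # the inner input is rescanned once per outer row
    for left in outer:
        for right in inner_rows:
            if predicate(left, right):
                yield {**left, **right}


# Hypothetical inputs: sensor readings joined with device metadata.
readings = [{"device": "d1", "temp": 20},
            {"device": "d2", "temp": 25},
            {"device": "d3", "temp": 30}]
devices = [{"device": "d1", "site": "north"},
           {"device": "d2", "site": "south"}]

joined = list(nested_loop_join(readings, devices,
                               lambda l, r: l["device"] == r["device"]))
```

The cost is proportional to the product of the two input sizes, which is why the algorithm is simple to implement but expensive on large inputs.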

Query Compilation

Not Supported

Query Execution

Tuple-at-a-Time Model

Query Interface[08][09]

Custom API, SQL

HStreamDB supports interfacing with the database through either SQL or its custom API. It uses a SQL dialect that is a subset of SQL-92 with extensions to support stream operations. Queries can be executed from a command line interface or from the Java, Go, and Python clients. HStreamDB’s custom API is implemented in its clients and can be used to insert and consume data.

Since HStreamDB is a streaming database, it handles queries differently from a typical database. Queries are treated as running tasks that fetch data from streams and produce results continuously as the streams are updated. HStreamDB also supports subscriptions, where multiple consumers can read data in real-time from a single stream as records are added by producers.
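
The push-based behaviour described above can be modelled in a few lines of Python. This is a toy in-memory sketch, not the HStreamDB client API: `InMemoryStream`, `subscribe`, and `append` are hypothetical names, and the delivery semantics (every consumer sees every record) are a simplifying assumption.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


class InMemoryStream:
    """Append-only log that pushes each new record to all registered consumers."""

    def __init__(self) -> None:
        self._records: List[Record] = []
        self._consumers: List[Callable[[Record], None]] = []

    def subscribe(self, consumer: Callable[[Record], None]) -> None:
        self._consumers.append(consumer)

    def append(self, record: Record) -> None:
        self._records.append(record)
        for consumer in self._consumers:
            consumer(record)  # delivered as soon as the record is ingested


stream = InMemoryStream()
all_records: List[Record] = []
hot_records: List[Record] = []


def keep_hot(record: Record) -> None:
    """A stand-in for a continuous query: a filter that runs on every new record."""
    if record["temp"] > 22:
        hot_records.append(record)


stream.subscribe(all_records.append)
stream.subscribe(keep_hot)

stream.append({"temp": 20.0})
stream.append({"temp": 25.0})
```

Here `keep_hot` plays the role of a long-running query whose result grows as producers append records, while `all_records.append` plays the role of a plain subscription consumer.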

Storage Architecture[10]

Disk-oriented

HStreamDB is a disk-oriented database that uses the RocksDB storage engine. This allows HStreamDB to support large scale data streams.

Storage Model[10]

Custom

HStreamDB does not implement its own storage layer, and instead relies on RocksDB as a key-value store. All of its data is eventually processed and stored in the key-value file format implemented in RocksDB.

Stored Procedures

Not Supported

System Architecture[10][11][12]

HStreamDB is based on a shared-nothing architecture. All nodes in a deployment are identical, and each contains two components: the HStream Server and the HStream Storage layer.

The HStream Server is responsible for parsing and executing SQL queries as well as any other computations required by a connected client.

The HStream Storage layer is a distributed storage system that uses RocksDB as a key-value store for all of the streaming data. Streams are stored across multiple nodes in shards, and shards can be replicated using Paxos. This layer is also responsible for pushing data to consumers with stream subscriptions.
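
To make the idea of spreading a stream across shards concrete, the Python sketch below assigns records to shards by hashing a key. This is a generic hash-partitioning scheme shown under stated assumptions: the text above does not say how HStreamDB chooses a shard, and `shard_for_key`, the partition key, and the shard count are all hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (assumed scheme, for illustration)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Group some made-up device keys into four shards.
NUM_SHARDS = 4
placement: Dict[int, List[str]] = defaultdict(list)
for key in ["sensor-1", "sensor-2", "sensor-3", "sensor-1"]:
    placement[shard_for_key(key, NUM_SHARDS)].append(key)
```

Because the hash is stable, records with the same key always land in the same shard, so their relative order can be kept within that shard.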

Views[13]

Materialized Views

HStreamDB supports incrementally updated materialized views. As data is added to streams, views are updated in real-time. This makes querying views fast since they always contain the latest result. Views are different from streams since they are only stored in memory.
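
Incremental maintenance means that each new record adjusts the stored result instead of triggering a full recomputation. The Python sketch below keeps a per-key running average in memory; it illustrates the general idea only, and `IncrementalAverageView` and its field names are hypothetical rather than anything defined by HStreamDB.

```python
from collections import defaultdict
from typing import Dict, Optional


class IncrementalAverageView:
    """In-memory view of the average of value_field grouped by key_field."""

    def __init__(self, key_field: str, value_field: str) -> None:
        self.key_field = key_field
        self.value_field = value_field
        self._count: Dict[str, int] = defaultdict(int)
        self._total: Dict[str, float] = defaultdict(float)

    def apply(self, record: dict) -> None:
        """Fold one newly ingested record into the view in constant time."""
        key = record[self.key_field]
        self._count[key] += 1
        self._total[key] += record[self.value_field]

    def query(self, key: str) -> Optional[float]:
        """Read the current result without rescanning the stream."""
        count = self._count.get(key, 0)
        if count == 0:
            return None
        return self._total[key] / count


view = IncrementalAverageView("device", "temp")
for rec in [{"device": "d1", "temp": 20.0},
            {"device": "d1", "temp": 30.0},
            {"device": "d2", "temp": 10.0}]:
    view.apply(rec)
```

Reads stay cheap because `query` only looks up two counters, which is the property that makes querying such a view fast.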

Citations

[01] https://flowmq.io/ (flowmq.io). Accessed: 2026-07-16
[02] GitHub - hstreamdb/hstream: HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications. (github.com). Accessed: 2026-06-04
[03] Introduction to HStreamDB | HStream Docs (hstream.io). Accessed: 2026-06-05
[04] https://twitter.com/HStreamDB (twitter.com)
[05] https://docs.hstream.io/write/write.html (hstream.io, dead link). Accessed: 2026-06-02
[06] Concepts | HStream Docs (hstream.io). Accessed: 2026-06-02
[07] SELECT (Stream) | HStream Docs (hstream.io). Accessed: 2026-06-02
[08] SQL Overview | HStream Docs (hstream.io). Accessed: 2026-06-02
[09] HStream CLI | HStream Docs (hstream.io). Accessed: 2026-06-02
[10] HStream Storage (HStore) | HStream Docs (hstream.io). Accessed: 2026-06-02
[11] Architecture Overview | HStream Docs (hstream.io). Accessed: 2026-06-02
[12] HStream Server | HStream Docs (hstream.io). Accessed: 2026-06-02
[13] CREATE VIEW | HStream Docs (hstream.io). Accessed: 2026-06-02
[14] Initial commit (github.com). Modified: 2020-08-31. Accessed: 2026-06-01

Revision #4 Last Updated: 2023-05-06 22:04