Heroic

Viewing Revision #11 from 2019-11-24 06:15 View Current

Heroic is an open-source time-series DBMS built at Spotify.

Website: https://spotify.github.io/heroic/[01]
Source Code: https://github.com/spotify/heroic[02] Accessed: Jun 8, 2026 Archived: Mar 26, 2021
Tech Docs: https://spotify.github.io/heroic/#!/docs/overview[03]
Developer: Spotify
Country of Origin: SE
Start Year: 2014 [17]
End Year: 2019 [18]
Project Type: Open Source
Written in: Java
Supported Languages: Java, Python
Derived From: Cassandra, Elasticsearch, ksqlDB
Operating System: All OS with Java VM
License: Apache v2

Data Model[04]

Key/Value

Heroic uses a key/value data model in which each key is comprised of a “unique set of tags and resource identifiers” that corresponds to a single series. Tags are data that is indexed and retained within the database, and each tag is stored alongside its corresponding time series. As the GitHub documentation describes, tags are therefore used in complex queries for both filtering and aggregation. A resource identifier, by contrast, is data that is not indexed, although it is still stored with its corresponding time series. Resource identifiers exist so that data which changes constantly can still be stored and accessed with its time series. The GitHub documentation gives the example of a hostname field that changes often: rather than retaining it as a tag, hostname would be kept as a resource identifier. Resource identifiers are thus used in queries through aggregations.
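The distinction between tags and resource identifiers can be illustrated with a minimal in-memory sketch. It is not Heroic's API: the class `TinySeriesStore` and its methods are hypothetical names chosen for illustration, and the only assumption taken from the text above is that tags are indexed while resource identifiers are stored with the data points without being indexed.

```python
from collections import defaultdict


class TinySeriesStore:
    """Hypothetical in-memory model of tags vs. resource identifiers."""

    def __init__(self):
        # series (frozenset of tag pairs) -> list of (timestamp, value, resource)
        self.points = defaultdict(list)
        # (tag name, tag value) -> set of series that carry that tag
        self.tag_index = defaultdict(set)

    def write(self, tags, resource, timestamp, value):
        series = frozenset(tags.items())
        # Only tags are added to the index.
        for pair in series:
            self.tag_index[pair].add(series)
        # Resource identifiers are stored with the point, but never indexed.
        self.points[series].append((timestamp, value, dict(resource)))

    def filter(self, **tags):
        """Return the series whose tags match every given name/value pair."""
        matches = None
        for pair in tags.items():
            found = self.tag_index.get(pair, set())
            matches = found if matches is None else matches & found
        return matches or set()

    def sum_by_resource(self, series, name):
        """Aggregate the values of one series, grouped by a resource identifier."""
        totals = defaultdict(float)
        for _, value, resource in self.points[series]:
            totals[resource.get(name)] += value
        return dict(totals)
```

In this sketch, a frequently changing hostname passed as a resource identifier does not create a new series or a new index entry, yet it remains available for grouping in `sum_by_resource`.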

Foreign Keys[05]

Not Supported

Indexes[06][07][08]

Inverted Index (Full Text)

Heroic uses Elasticsearch to index all of its data, so its indexing structure mirrors that of Elasticsearch: an inverted index. An inverted index is built by scanning every document for its unique words and storing, for each unique word, all the instances in which that word is used. This enables more contextual searches (i.e. searches that also return the matching documents) and results in faster queries overall.
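As a rough illustration of the idea, the following sketch builds a toy inverted index over whitespace-separated words. It is a simplification and not how Elasticsearch is implemented; the function names `build_inverted_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each unique lowercase word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of the documents that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A lookup touches only the entries for the query words instead of scanning every document, which is where the speed-up comes from.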

Storage Architecture[06][09][10][11]

Disk-oriented

Because Heroic uses Cassandra as its primary storage, we assume that Heroic’s storage architecture is modeled on Cassandra’s. Cassandra is a disk-oriented database: its data is organized into columns, and those columns are stored on disk. Each column on disk corresponds to a different data feature, and the entries within a column represent the individual data points stored for that feature.

Storage Model[12]

N-ary Storage Model (Row/Record)

As noted above, Cassandra is Heroic’s primary storage mechanism, so Heroic also takes on Cassandra’s storage model, which implies that Heroic has an n-ary storage model. In an n-ary storage model, all related data is stored in tables with “n” columns, which defines the n-ary relationship.

Storage Organization[13][14]

Log-structured

Likewise, Heroic’s storage organization follows Cassandra’s and is log-structured, that is, it uses a log-structured merge tree. A log-structured merge (LSM) tree is a key-value data structure that performs well for files that receive large volumes of inserts. An LSM tree can be built from multiple data structures that prioritize different storage media. In a two-level LSM tree, for example, one structure holds data in memory and the other holds data on disk, and data flows from one to the other. The data in an LSM tree is organized into runs, and each run is sorted by key. In Cassandra, one key can map to multiple values that correspond to multiple data rows, so a search of the tree has to retrieve all of the corresponding values.
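The two-level arrangement described above can be sketched as follows. This is a toy model under stated assumptions, not Cassandra's storage engine: the class `TinyLSM` and the `memtable_limit` parameter are hypothetical, the "disk" level is just a list of sorted runs held in memory, and compaction of runs is omitted.

```python
import bisect


class TinyLSM:
    """Toy two-level LSM tree: an in-memory table plus sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}            # level 0: recent writes, in memory
        self.runs = []                # level 1: sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch the in-memory table, which keeps inserts cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Move the memory table to the second level as one run sorted by key.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check memory first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            # (key,) sorts just before (key, value), so this finds the entry.
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Because each run is sorted by key, a lookup inside a run is a binary search, and reading the newest run first ensures the most recent value for a key wins.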

Stored Procedures[15][16][11]

Not Supported

Cassandra, the primary storage for Heroic, does not have stored procedures. Instead, the logic is placed on the application side: a client or application-level program is written through which users can request to "load and store data" held in Cassandra.

Citations

18 sources

[01] Heroic Documentation. github.io. Modified: 2021-03-26. Accessed: 2026-06-04.
[02] GitHub - spotify/heroic: The Heroic Time Series Database. github.com. Accessed: 2026-05-28.
[03] Heroic Documentation. github.io. Modified: 2021-03-26. Accessed: 2026-06-05.
[04] Heroic Documentation. github.io. Modified: 2021-03-26. Accessed: 2026-05-28.
[05] https://db-engines.com/en/system/Heroic;InfluxDB. db-engines.com. Accessed: 2026-05-28.
[06] Monitoring at Spotify: Introducing Heroic | Spotify Engineering. atspotify.com. Accessed: 2026-05-28.
[07] Index fundamentals | Elastic Docs. elastic.co. Modified: 2026-06-03. Accessed: 2026-06-07.
[08] https://medium.com/elasticsearch/what-happens-when-a-document-is-indexed-in-elasticsearch-16b7ae3415bc. medium.com. Dead link (check archive). Accessed: 2026-05-28.
[09] nosql - Why many refer to Cassandra as a Column oriented database? - Stack Overflow. stackoverflow.com. Accessed: 2026-06-07.
[10] nosql - Is Cassandra a column oriented or columnar database - Stack Overflow. stackoverflow.com. Accessed: 2026-06-07.
[11] https://dbdb.io/db/cassandra. dbdb.io. Accessed: 2026-05-28.
[12] https://dbdb.io/db/cassandra/revisions/3. dbdb.io. Accessed: 2026-05-28.
[13] Storage engine. datastax.com. Modified: 2024-03-21. Accessed: 2026-06-07.
[14] Log-structured merge-tree - Wikipedia. wikipedia.org. Modified: 2026-03-26. Accessed: 2026-06-14.
[15] Stored Procedure with Cassandra. google.com. Accessed: 2026-06-14.
[16] Is there any concept of stored procedures in Cassandra? - Stack Overflow. stackoverflow.com. Accessed: 2026-06-07.
[17] initial commit. github.com. Modified: 2015-06-03. Accessed: 2026-05-21.
[18] Fallback to DOCKER_TAG when setting version. (#585). github.com. Modified: 2019-11-20. Accessed: 2026-05-28.

Revision #11 Last Updated: 2019-11-24 01:15