DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

ClickHouse


ClickHouse is an open-source column-oriented OLAP DBMS. It is designed to provide linear scalability of queries.[04][01]

Source Code
https://github.com/ClickHouse/ClickHouse[02]
Country of Origin
RU
Start Year
2016
Project Types
Commercial, Open Source
Written in
C++
Supported Languages
C, C#, C++, Go, Java, Kotlin, Scala
Compatible With
MySQL
Operating System
Linux
License
Apache v2

History[04]


ClickHouse was originally developed at Yandex, a Russian company, to serve multiple projects within the company. Yandex needed a DBMS that could analyze large amounts of data, so it began developing its own column-oriented DBMS. The first prototype of ClickHouse appeared in 2009, and it was released as open source in 2016.

Checkpoints


ClickHouse doesn't support transactions.

Compression[05][06]


In addition to general-purpose compression with LZ4 (default) or ZSTD, ClickHouse supports dictionary encoding via the LowCardinality data type, as well as delta, double-delta, and Gorilla encodings via column codecs.
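
The encodings named above can be illustrated with a minimal Python sketch. It is not ClickHouse's implementation: the function names are hypothetical, values are plain integers rather than bit-packed streams, and Gorilla encoding is left out. The point is only to show why sorted or slowly changing columns shrink to small numbers before a general-purpose compressor sees them.

```python
def delta_encode(values):
    """Keep the first value, then store each value as a difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def double_delta_encode(values):
    """Keep the first value and the first delta, then store deltas of deltas."""
    deltas = delta_encode(values)
    return deltas[:1] + delta_encode(deltas[1:])


def double_delta_decode(encoded):
    """Invert double_delta_encode."""
    return delta_decode(encoded[:1] + delta_decode(encoded[1:]))


def dictionary_encode(values):
    """Replace each value with a small integer code into a list of distinct values."""
    dictionary, codes, position = [], [], {}
    for v in values:
        if v not in position:
            position[v] = len(dictionary)
            dictionary.append(v)
        codes.append(position[v])
    return dictionary, codes


timestamps = [1000, 1010, 1020, 1030, 1045]
print(delta_encode(timestamps))         # [1000, 10, 10, 10, 15]
print(double_delta_encode(timestamps))  # [1000, 10, 0, 0, 5]
print(dictionary_encode(["us", "de", "us", "us", "de"]))  # (['us', 'de'], [0, 1, 0, 0, 1])
```

Regularly spaced timestamps collapse to runs of zeros under double-delta, and a low-cardinality string column becomes a short dictionary plus small integer codes.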

Concurrency Control


ClickHouse does not support multi-statement transactions.

Data Model[07]


ClickHouse uses the relational database model.

Foreign Keys


ClickHouse does not support foreign keys.

Indexes[08]


ClickHouse supports primary key indexes. The indexing mechanism is called a sparse index. In MergeTree tables, the data in each part is sorted lexicographically by primary key. ClickHouse then records a mark for every Nth row, where N is chosen adaptively by default. Together, these marks serve as a sparse index, which allows efficient range queries.
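
A small Python sketch may help make the idea of a sparse index concrete. It assumes a fixed N (here the hypothetical constant `GRANULE_SIZE`) instead of the adaptive choice described above, and it models a part as a sorted list of integer keys. All names are illustrative and are not ClickHouse APIs.

```python
from bisect import bisect_left, bisect_right

GRANULE_SIZE = 4  # hypothetical fixed N; the text above says N is adaptive by default


def build_sparse_index(sorted_keys, granule_size=GRANULE_SIZE):
    """Record one mark (the first key) for every block of granule_size rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_size)]


def granules_for_range(marks, lo, hi):
    """Return the granule numbers that may hold keys in [lo, hi] (assumes lo <= hi)."""
    # The last granule whose mark is below lo can still contain lo.
    first = max(bisect_left(marks, lo) - 1, 0)
    # Granules whose mark is above hi cannot contain any match.
    last = bisect_right(marks, hi) - 1
    return range(first, last + 1)


def range_query(sorted_keys, marks, lo, hi, granule_size=GRANULE_SIZE):
    """Scan only the candidate granules instead of the whole part."""
    hits = []
    for g in granules_for_range(marks, lo, hi):
        start = g * granule_size
        for key in sorted_keys[start:start + granule_size]:
            if lo <= key <= hi:
                hits.append(key)
    return hits


keys = list(range(0, 40, 2))             # 20 sorted primary-key values
marks = build_sparse_index(keys)
print(marks)                             # [0, 8, 16, 24, 32]
print(list(granules_for_range(marks, 10, 18)))  # [1, 2]
print(range_query(keys, marks, 10, 18))  # [10, 12, 14, 16, 18]
```

The index holds one entry per granule rather than one per row, so it stays small enough to keep in memory, at the cost of scanning a few extra rows at the edges of the range.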

Isolation Levels


ClickHouse doesn't support transactions and isolation levels.

Joins[09]


ClickHouse uses a hash join by default, which places the right-hand side of the join in an in-memory hash table. If there is not enough memory for a hash join, it falls back to a merge join.
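
The hash join described above can be sketched in a few lines of Python. This is a generic inner equi-join over lists of dictionaries, written under the assumption that the right-hand side fits in memory. The function name and row representation are hypothetical, and the merge-join fallback is not shown.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, key):
    """Inner equi-join: build a hash table on the right side, probe it with the left."""
    # Build phase: the whole right side is held in memory, grouped by join key.
    table = defaultdict(list)
    for right in right_rows:
        table[right[key]].append(right)

    # Probe phase: stream the left side and emit one row per matching pair.
    joined = []
    for left in left_rows:
        for right in table.get(left[key], ()):
            joined.append({**right, **left})
    return joined


visits = [{"user_id": 1, "page": "/a"}, {"user_id": 2, "page": "/b"}, {"user_id": 3, "page": "/c"}]
users = [{"user_id": 1, "country": "us"}, {"user_id": 3, "country": "de"}]
for row in hash_join(visits, users, "user_id"):
    print(row)
```

Because only the right-hand side is materialized in the hash table, its size is what determines the memory cost of the join in this sketch.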

Logging[10]


ClickHouse replicates its data on multiple nodes and monitors data synchronicity on replicas. It recovers after failures by syncing data from other replica nodes.

Parallel Execution[11]


By default, ClickHouse uses half of the available cores for single-node queries and one replica of each shard for distributed queries. It can be tuned to use only one core, all cores of the whole cluster, or anything in between.

Query Compilation[01][12]


ClickHouse supports runtime code generation. The code is generated for every kind of query on the fly, removing all indirection and dynamic dispatch. Runtime code generation pays off most when it fuses many operations together and keeps the CPU execution units fully utilized.

Query Execution[13]


Query Interface[14][15]


ClickHouse provides two types of parsers: a full SQL parser and a data format parser. It uses the SQL parser for all types of queries and the data format parser only for INSERT queries. Beyond the query language, it provides multiple interfaces, including an HTTP interface, a JDBC driver, a TCP interface, and a command-line client.

Storage Architecture


ClickHouse has multiple types of table engines. The table engine determines where the data is stored, the level of concurrency supported, whether indexes are supported, and some other properties. The key table engine family for production use is MergeTree, which provides resilient storage for large volumes of data and supports replication. There is also the Log family for lightweight storage of temporary data and the Distributed engine for querying a cluster.

Storage Model[01]


ClickHouse is a column-oriented DBMS and it stores data by columns.
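
As a rough illustration of what storing data by columns means, the Python sketch below pivots a few rows into one list per column and then aggregates a single column without touching the others. The table contents and the helper name are made up for the example, and every row is assumed to have the same fields.

```python
def to_columns(rows):
    """Pivot a list of row dictionaries into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"user_id": 1, "country": "us", "duration_ms": 120},
    {"user_id": 2, "country": "de", "duration_ms": 80},
    {"user_id": 3, "country": "us", "duration_ms": 200},
]

# Column-oriented layout: each column is stored contiguously on its own.
columns = to_columns(rows)

# An analytical query such as a sum over duration_ms reads one column only.
total_duration = sum(columns["duration_ms"])
print(columns["country"])  # ['us', 'de', 'us']
print(total_duration)      # 400
```

Values of the same type and meaning also sit next to each other in this layout, which is what makes per-column encodings effective.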

Storage Organization


Stored Procedures[16][17]


Stored procedures and user-defined functions (UDFs) are currently listed as open issues in ClickHouse.

System Architecture[04][12]


In a distributed setup, a ClickHouse system is a cluster of shards. It uses asynchronous multi-master replication, and there is no single point of contention across the system.

Views[18]


ClickHouse supports both virtual views and materialized views. A materialized view stores the data produced by its corresponding SELECT query. The SELECT query can contain DISTINCT, GROUP BY, ORDER BY, LIMIT, etc.

Citations

23 sources
  1. Fast Open-Source OLAP DBMS - ClickHouse clickhouse.com
  2. GitHub - ClickHouse/ClickHouse: ClickHouse® is a real-time analytics database management system · GitHub github.com
  3. ClickHouse Docs | ClickHouse Docs clickhouse.com
  4. ClickHouse - Wikipedia wikipedia.org
  5. https://presentations.clickhouse.com/meetup19/string_optimization.pdf
  6. CREATE Queries | ClickHouse Docs clickhouse.com
  7. https://clickhouse.yandex/docs/en
  8. https://medium.com/@f1yegor/clickhouse-primary-keys-2cf2a45d7324
  9. Скорость и оптимизация JOIN 1 к 1 и Materialized view [Speed and optimization of 1-to-1 JOIN and materialized views] google.com
  10. https://clickhouse.yandex/reference_en.html
  11. Session Settings | ClickHouse Docs clickhouse.com
  12. https://github.com/ClickHouse/ClickHouse/blob/master/doc/developers/architecture.md
  13. Architecture Overview | ClickHouse Docs clickhouse.com
  14. Formats for input and output data | ClickHouse Docs clickhouse.com
  15. HTTP interface | ClickHouse Docs clickhouse.com
  16. https://github.com/yandex/ClickHouse/issues/11
  17. https://github.com/yandex/ClickHouse/issues/32
  18. CREATE Queries | ClickHouse Docs clickhouse.com
  19. https://github.com/ClickHouse/ClickHouse/commit/401bc6c8948aafec9d012dc420984876161f259e github.com
  20. https://github.com/ClickHouse/ClickHouse/commit/5000f02cef8addc8e5a96304416c7aa22dff8646 github.com
  21. https://github.com/ClickHouse/ClickHouse/commit/d4d96a7da49187ef63868b01a411bdcb740b125f github.com
  22. https://github.com/ClickHouse/ClickHouse/commit/d4c2c114d68971cb9c3351f32dc0bac47314ef9a github.com
  23. https://github.com/ClickHouse/ClickHouse/commit/c64cfbcf4fec90c3acbb3e1f0ec27ee3df04130e github.com
Revision #17 Last Updated: