ClickHouse was developed by Yandex, a Russian company, for use across multiple projects within the company. Yandex needed a DBMS to analyze large amounts of data, so it began developing its own column-oriented DBMS. The ClickHouse prototype appeared in 2009, and the system was released as open source in 2016.
ClickHouse does not support multi-statement transactions.
ClickHouse supports primary key indexes through a mechanism called a sparse index. In the MergeTree engine, the data in each part is sorted lexicographically by primary key. ClickHouse then records a mark every index_granularity rows. These marks serve as the sparse index, which allows efficient range queries.
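The sparse index idea can be sketched in a few lines of Python. This is only an illustration of the technique described above, not ClickHouse's actual implementation: the function names `build_sparse_index` and `candidate_row_range` are hypothetical, the key is a single integer column, and the data is assumed to be already sorted.

```python
from bisect import bisect_left, bisect_right


def build_sparse_index(sorted_keys, granularity):
    """Keep the key of every `granularity`-th row as a mark (hypothetical helper)."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granularity)]


def candidate_row_range(marks, granularity, total_rows, lo, hi):
    """Return a [start, end) row range guaranteed to contain every key in [lo, hi].

    Only the marks are consulted; the rows themselves are not read.
    """
    # The granule just before the first mark >= lo may still hold keys >= lo.
    first_granule = max(bisect_left(marks, lo) - 1, 0)
    # Granules whose first key is > hi cannot contain matching rows.
    end_granule = bisect_right(marks, hi)
    start = first_granule * granularity
    end = min(end_granule * granularity, total_rows)
    return start, max(start, end)


# Toy data: 100 sorted keys, one mark every 10 rows (assumed values).
keys = list(range(100))
marks = build_sparse_index(keys, granularity=10)
start, end = candidate_row_range(marks, 10, len(keys), lo=25, hi=47)
matching = [k for k in keys[start:end] if 25 <= k <= 47]
```

The index holds one entry per granule rather than one per row, so it stays small enough to keep in memory, at the cost of scanning a few extra rows at the edges of the range.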
ClickHouse only supports hash join, which is performed by loading the right-hand side of the join into an in-memory hash table. Hash join is fast but requires enough memory to hold that table.
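A minimal sketch of a hash join, assuming rows are plain Python dictionaries and the join is an inner equi-join. The function name `hash_join` and the sample tables are hypothetical and serve only to show the build and probe phases.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Inner equi-join: build a hash table on the right side, probe with the left."""
    # Build phase: the whole right-hand side must fit in memory.
    table = defaultdict(list)
    for row in right_rows:
        table[row[right_key]].append(row)

    # Probe phase: the left side is streamed row by row.
    for left_row in left_rows:
        for right_row in table.get(left_row[left_key], ()):
            yield {**left_row, **right_row}


# Hypothetical sample data.
visits = [
    {"user_id": 1, "url": "/a"},
    {"user_id": 2, "url": "/b"},
    {"user_id": 3, "url": "/c"},
]
users = [
    {"user_id": 1, "name": "ann"},
    {"user_id": 2, "name": "bob"},
]
joined = list(hash_join(visits, users, "user_id", "user_id"))
```

Because only the right side is materialized, placing the smaller table on the right keeps the memory footprint down.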
ClickHouse supports runtime code generation. Code is generated on the fly for each kind of query, removing indirection and dynamic dispatch. Runtime code generation pays off when it fuses many operations together and fully utilizes the CPU's execution units.
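The contrast between interpreting an expression and generating fused code for it can be illustrated with a toy Python example. This is an analogy only and says nothing about how ClickHouse generates code internally; the names `interpret` and `generate`, and the step format, are assumptions made for the sketch.

```python
import operator

# Supported binary operations (an assumed, deliberately tiny set).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(steps, x):
    """Interpreted path: one dictionary lookup and indirect call per step."""
    for op, const in steps:
        x = OPS[op](x, const)
    return x


def generate(steps):
    """Generated path: emit one fused expression and compile it once."""
    expr = "x"
    for op, const in steps:
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        expr = f"({expr} {op} {const!r})"
    source = f"def fused(x):\n    return {expr}\n"
    namespace = {}
    exec(source, namespace)  # compile the generated source at runtime
    return namespace["fused"]


steps = [("*", 2), ("+", 3), ("-", 1)]
fused = generate(steps)
```

Here `fused(5)` and `interpret(steps, 5)` compute the same value, but the generated function evaluates a single expression with no per-step dispatch.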
ClickHouse provides two types of parsers: a full SQL parser and a data format parser. It uses the SQL parser for all types of queries and the data format parser only for INSERT queries. Beyond the query language, it provides multiple interfaces, including an HTTP interface, a JDBC driver, a TCP interface, and a command-line client.
ClickHouse has multiple types of table engines. The table engine determines where the data is stored, the level of concurrency, whether indexes are supported, and other properties. The table engines that store data on disk include TinyLog and Log. The Memory engine stores data in memory and is mainly used for temporary tables with external query data. Data in a Memory engine table disappears when the server is restarted.
ClickHouse is a column-oriented DBMS and it stores data by columns.
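As a rough illustration of what storing data by columns means, the sketch below converts a handful of rows into one list per column. The table contents and the helper `to_columns` are hypothetical; real column stores also compress and persist each column separately, which is not shown here.

```python
def to_columns(rows):
    """Turn a list of row dictionaries into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical row-oriented input.
rows = [
    {"user_id": 1, "url": "/a", "duration_ms": 120},
    {"user_id": 2, "url": "/b", "duration_ms": 80},
    {"user_id": 1, "url": "/c", "duration_ms": 200},
]
columns = to_columns(rows)

# An aggregate over one column touches only that column's values.
total_duration = sum(columns["duration_ms"])
```

An analytical query that aggregates a few columns of a wide table reads only those columns, which is the main reason this layout suits the workload.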
A ClickHouse system is a cluster of shards. It uses asynchronous multi-master replication, and there is no single point of contention across the system.
Commercial, Open Source