Kylin

Viewing Revision #11 from 2019-03-19 00:15

Kylin is an open-source distributed data analytics engine built on top of Hadoop/Spark. It offers a SQL interface for OLAP on large datasets. [04][05][06]

Website: http://www.kylin.io[01]
Source Code: https://github.com/apache/kylin[02] Accessed: Jul 14, 2026 Last Commit: Jul 6, 2026
Tech Docs: https://kylin.apache.org/docs[03]
Developer: eBay Inc.
Country of Origin: CN
Start Year: 2013 [04]
Former Name: KylinOLAP
Project Type: Open Source
Written in: Java
License: Apache v2
Wikipedia: https://en.wikipedia.org/wiki/Apache_Kylin[04]

Unlike massively parallel processing engines such as Hive and Presto, Kylin pre-calculates a set of data cubes, stores them in HBase, and answers queries by looking up results directly in those cubes. If a query cannot be answered from the data cubes, it is executed by the underlying processing engine. For this reason, Kylin is usually used as an accelerator for traditional parallel data processing engines.
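
The lookup-then-fallback behavior described above can be illustrated with a small sketch. It is not Kylin's implementation: the toy fact table, the `build_cuboid` and `query_sum` helpers, and the choice of which cuboids are materialized are all assumptions made for illustration, and the fallback scan merely stands in for the underlying engine.

```python
from collections import defaultdict

# Toy fact table of (year, city, sales); all values are illustrative.
FACTS = [
    (1994, "Beijing", 60),
    (1994, "Beijing", 40),
    (1994, "Shanghai", 70),
    (1995, "Beijing", 30),
]
COLUMNS = ("year", "city")


def build_cuboid(dims):
    """Pre-aggregate sum(sales) grouped by the given dimension columns."""
    idx = [COLUMNS.index(d) for d in dims]
    agg = defaultdict(int)
    for row in FACTS:
        agg[tuple(row[i] for i in idx)] += row[2]
    return dict(agg)


# Assumption: only these two cuboids were built ahead of time.
CUBOIDS = {
    ("year", "city"): build_cuboid(("year", "city")),
    ("year",): build_cuboid(("year",)),
}


def query_sum(dims, key):
    """Return (sum(sales), path), where path says which route answered."""
    dims, key = tuple(dims), tuple(key)
    if dims in CUBOIDS:
        # Fast path: a direct lookup in the pre-computed cuboid.
        return CUBOIDS[dims].get(key, 0), "cube"
    # Slow path: scan the raw rows, standing in for the underlying engine.
    idx = [COLUMNS.index(d) for d in dims]
    total = sum(r[2] for r in FACTS if tuple(r[i] for i in idx) == key)
    return total, "fallback"
```

Here `query_sum(("year", "city"), (1994, "Beijing"))` is served from a pre-computed cuboid, while `query_sum(("city",), ("Beijing",))` has no matching cuboid and falls back to scanning the raw rows.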

History[04]

The Kylin project was started in 2013 by eBay's R&D group in Shanghai, China. It was open sourced on GitHub as "KylinOLAP" in Oct 2014.

In Nov 2014, Kylin joined the Apache Software Foundation Incubator.

In Dec 2015, Apache Kylin became a Top-Level Project.

Compression[07]

Dictionary Encoding

Kylin applies dictionary encoding to all dimension values in data cubes. The dictionary is order-preserving and supports lookups in both directions, from keys to values and from values to keys. It is implemented as a radix tree. Each node in the radix tree also stores the size of its subtree, which is what makes mapping values back to keys possible.
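
To show why per-node subtree sizes allow lookups in both directions, here is a minimal sketch. It is a simplification and not Kylin's code: it uses a plain character trie rather than a compressed radix tree, and the `TrieDictionary` class with its `encode` and `decode` methods is a hypothetical name chosen for this example. The integer ID assigned to each string is its rank in sorted order, which is what makes the encoding order-preserving.

```python
class Node:
    def __init__(self):
        self.children = {}     # next character -> child Node
        self.is_value = False  # True if a dictionary entry ends at this node
        self.size = 0          # number of entries stored in this subtree


class TrieDictionary:
    """Order-preserving dictionary: string <-> rank in sorted order."""

    def __init__(self, values):
        self.root = Node()
        for v in set(values):  # de-duplicate so subtree sizes stay exact
            self._insert(v)

    def _insert(self, value):
        node = self.root
        node.size += 1
        for ch in value:
            node = node.children.setdefault(ch, Node())
            node.size += 1
        node.is_value = True

    def encode(self, value):
        """String -> ID, by counting the entries that sort before it."""
        node, rank = self.root, 0
        for ch in value:
            if node.is_value:
                rank += 1  # a proper prefix sorts before the longer string
            for c, child in node.children.items():
                if c < ch:
                    rank += child.size  # skip whole smaller subtrees
            if ch not in node.children:
                raise KeyError(value)
            node = node.children[ch]
        if not node.is_value:
            raise KeyError(value)
        return rank

    def decode(self, code):
        """ID -> string, by descending into the subtree holding that rank."""
        if not 0 <= code < self.root.size:
            raise IndexError(code)
        node, chars = self.root, []
        while True:
            if node.is_value:
                if code == 0:
                    return "".join(chars)
                code -= 1
            for c in sorted(node.children):
                child = node.children[c]
                if code < child.size:
                    chars.append(c)
                    node = child
                    break
                code -= child.size
```

Because every ID is a rank, comparing two IDs gives the same result as comparing the original strings, so range filters can be evaluated on the encoded values.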

In addition, Kylin can use the native compression algorithms of HBase and Hive.

Data Model[08]

Key/Value

Data cubes are essentially HBase tables. Given a set of dimension columns, Kylin pre-aggregates all possible combinations of their attributes using MapReduce jobs, then encodes the dimension values with dictionary encoding. Finally, Kylin encodes each data cube row as a Rowkey in HBase. The format of a Rowkey is cuboid id + encoded attributes. For example, assume a data cube on year and city with cuboid id 00000001, a row year=1994, city=Beijing, sum(sales)=100, and a dictionary that maps 1994=0 and Beijing=1. The HBase table will then contain an entry with Rowkey=00000001+01 and value=100.
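
The worked example above can be reproduced with a short sketch. It follows the textual notation used in this article (cuboid id followed by one digit per encoded dimension) rather than a real binary Rowkey layout, and the `build_rowkey` and `encode_cuboid` helpers are hypothetical names introduced only for illustration.

```python
def build_rowkey(cuboid_id, dims, row, dictionaries):
    """Concatenate the cuboid id with the dictionary code of each dimension."""
    codes = "".join(str(dictionaries[d][row[d]]) for d in dims)
    return cuboid_id + codes


def encode_cuboid(cuboid_id, dims, aggregated_rows, dictionaries, measure):
    """Turn pre-aggregated rows into a Rowkey -> value mapping."""
    return {
        build_rowkey(cuboid_id, dims, row, dictionaries): row[measure]
        for row in aggregated_rows
    }


# Values taken from the example in the text.
dictionaries = {"year": {1994: 0}, "city": {"Beijing": 1}}
rows = [{"year": 1994, "city": "Beijing", "sum_sales": 100}]
table = encode_cuboid("00000001", ["year", "city"], rows, dictionaries, "sum_sales")
```

With these inputs, `table` holds a single entry whose key is the cuboid id `00000001` followed by the codes `0` and `1`, and whose value is 100.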

Foreign Keys[09]

Supported

Kylin supports star schema and snowflake schema. A user needs to specify fact tables and lookup tables before building cubes. Kylin pre-joins the tables when building data cubes.

Joins

Hash Join, Sort-Merge Join

During the cube-building phase, Kylin uses Hive to pre-join the fact table and lookup tables.

At query time, table joins are handled by the Apache Calcite query engine.
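
As a rough picture of what pre-joining a fact table with a lookup table produces, the following sketch flattens a tiny star schema with an in-memory hash join. It does not show how Hive or Calcite execute joins; the `hash_join` helper, the table contents, and the column names are assumptions made for this example.

```python
def hash_join(fact_rows, lookup_rows, fact_key, lookup_key):
    """Inner-join fact rows to lookup rows and return flattened rows."""
    # Build phase: index the (usually smaller) lookup table by its key.
    index = {}
    for row in lookup_rows:
        index.setdefault(row[lookup_key], []).append(row)
    # Probe phase: look up each fact row's foreign key in the index.
    joined = []
    for fact in fact_rows:
        for match in index.get(fact[fact_key], []):
            flat = dict(match)
            flat.update(fact)  # fact columns win on name clashes
            joined.append(flat)
    return joined


# Hypothetical star schema: one fact table and one lookup table.
sales = [
    {"city_id": 1, "year": 1994, "sales": 100},
    {"city_id": 2, "year": 1994, "sales": 70},
    {"city_id": 9, "year": 1995, "sales": 5},  # no matching lookup row
]
cities = [
    {"id": 1, "city": "Beijing"},
    {"id": 2, "city": "Shanghai"},
]
flat_table = hash_join(sales, cities, "city_id", "id")
```

The flattened rows carry both the fact columns and the lookup attributes, so later aggregation by `city` needs no further join. The fact row with `city_id` 9 is dropped because this sketch performs an inner join.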

Query Compilation[10]

Code Generation

The Apache Calcite query engine does code generation for SQL queries.

Query Execution

The Apache Calcite query engine is used to parse queries and to generate and optimize execution plans.

Query Interface[05]

SQL

Kylin supports a subset of the SQL that Apache Calcite supports. Since Kylin is a pure OLAP engine, it only supports SELECT queries. INSERT, UPDATE, and DELETE are not supported.

Storage Architecture[11]

Disk-oriented

Kylin stores the data cubes in HBase, and stores metadata in HBase or MySQL (the MySQL metastore is still being tested).

Storage Model

Custom

System Architecture[12]

Shared-Disk

Kylin relies on Hive to store raw tables and HBase to store data cubes, both of which store data on HDFS.

Citations

12 sources

[01] http://www.kylin.io. kylin.io. Dead link; check archive. Accessed: 2026-06-04.
[02] GitHub - apache/kylin: Apache Kylin. github.com. Accessed: 2026-06-04.
[03] Index of /docs. apache.org. Accessed: 2026-06-05.
[04] Apache Kylin - Wikipedia. wikipedia.org. Modified: 2026-03-22. Accessed: 2026-06-04.
[05] http://kylin.apache.org/docs/gettingstarted/faq.html. apache.org. Dead link; check archive. Accessed: 2026-05-26.
[06] jianshu.com (page title unavailable: 403 Forbidden). Dead link; check archive. Accessed: 2026-06-07.
[07] http://kylin.apache.org/blog/2015/08/13/kylin-dictionary/. apache.org. Dead link; check archive. Accessed: 2026-05-26.
[08] 编程小梦: How Apache Kylin Cubes Are Built. bcmeng.com. Accessed: 2026-06-02.
[09] http://kylin.apache.org/cn/docs/tutorial/create_cube.html. apache.org. Dead link; check archive. Accessed: 2026-05-26.
[10] Apache Kylin – Cubes on Hadoop (slides). slideshare.net. Accessed: 2026-06-07.
[11] http://kylin.apache.org/docs/tutorial/mysql_metastore.html. apache.org. Dead link; check archive. Accessed: 2026-05-26.
[12] https://dbdb.io/db/hbase. dbdb.io. Accessed: 2026-05-20.

Revision #11 Last Updated: 2019-03-18 20:15