Kdb+

Acquired Company

Kdb+ is a column-based relational time series database developed by Kx Systems. It is designed for the financial sector, where it stores time series data and scales up or out as data volume grows.

History

In 1998, Kx Systems released kdb. In 2003 it released kdb+, the 64-bit version. Kdb+ is programmed in the q language and is built to process large volumes of time-series data in areas including finance and IoT.

Query Execution

Vectorized Model

Kdb+ is programmed in the q language, which is vector-based: each function or operation in the query plan manipulates array (vector) data rather than individual rows.
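As a rough illustration of this vector-at-a-time style, the Python sketch below applies each operation to a whole column instead of looping over rows. It is not q code and says nothing about kdb+ internals; the column names and the `vec_mul` and `vec_sum` helpers are hypothetical.

```python
from typing import List


def vec_mul(xs: List[float], ys: List[float]) -> List[float]:
    """Multiply two equal-length columns element by element."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    return [x * y for x, y in zip(xs, ys)]


def vec_sum(xs: List[float]) -> float:
    """Reduce a whole column to a single value."""
    return sum(xs)


# Hypothetical trade columns: each variable holds one column, not one row.
price = [10.0, 10.5, 11.0]
size = [100.0, 200.0, 300.0]

notional = vec_mul(price, size)  # one operation over the entire column
total = vec_sum(notional)
```

Each step consumes and produces a full vector, which is the property the vectorized model relies on.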

Query Interface

SQL, HTTP / REST

Kdb+ also supports ODBC/JDBC query interfaces.

Compression

Naïve (Record-Level)

Kdb+ supports on-disk compression with the following algorithms:

  • kdb+ algorithm: the default compression algorithm

  • gzip: supports several compression levels; higher levels compress better but need more computation time

  • Google Snappy: faster than the previous two algorithms, but with a lower compression ratio
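The level trade-off described for gzip can be observed with Python's standard `zlib` module, which implements DEFLATE, the algorithm behind gzip. This is only a sketch of the general idea; it does not use kdb+ or its compression settings, and the sample data is made up.

```python
import zlib

# Made-up, highly repetitive sample resembling CSV rows of trades.
data = b"2024.01.02,AAPL,185.64,100\n" * 2000

fast = zlib.compress(data, 1)   # lowest effort, least computation
small = zlib.compress(data, 9)  # highest effort, most computation

# Print the sizes so the effect of the level can be compared directly.
print(len(data), len(fast), len(small))

# Compression is lossless at every level.
restored = zlib.decompress(small)
```

On typical data the higher level yields a smaller output at the cost of extra CPU time, which is the trade-off noted in the list above.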

Concurrency Control

Deterministic Concurrency Control

Kdb+ uses partition-based timestamp ordering. Each transaction is assigned a timestamp when it begins, and on each partition transactions are executed in timestamp order.
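A minimal sketch of partition-based timestamp ordering, assuming a single global counter as the timestamp source and one partition per transaction. The `Txn` class, the `schedule_by_partition` function, and the partition names are hypothetical and do not reflect how kdb+ implements this.

```python
import itertools
from collections import defaultdict

_clock = itertools.count(1)  # assumed global, monotonically increasing timestamp source


class Txn:
    def __init__(self, name, partition):
        self.name = name
        self.partition = partition
        self.ts = next(_clock)  # timestamp is assigned when the transaction begins


def schedule_by_partition(txns):
    """Group transactions by partition and order each group by timestamp."""
    queues = defaultdict(list)
    for txn in txns:
        queues[txn.partition].append(txn)
    return {
        partition: [t.name for t in sorted(queue, key=lambda t: t.ts)]
        for partition, queue in queues.items()
    }


t1 = Txn("t1", "2024.01.02")
t2 = Txn("t2", "2024.01.03")
t3 = Txn("t3", "2024.01.02")

# Arrival order at the scheduler does not matter; timestamps decide.
schedule = schedule_by_partition([t3, t2, t1])
```

Because the execution order on each partition is fixed by the timestamps, the outcome is the same no matter how the transactions arrive.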

Data Model

Relational

Kdb+ uses the relational model. A major challenge in applying the relational model to a time series database is handling very large data sets. Kdb+ addresses this with on-disk compression, which lets a single machine hold more data, and with data partitioning, which distributes data across machines.

Foreign Keys

Supported

Kdb+ supports referential integrity.

Indexes

B+Tree

Kdb+ supports both primary and secondary indexes.

Isolation Levels

Serializable

Kdb+ supports only the SERIALIZABLE isolation level. This is achieved through partition-based deterministic concurrency control: transactions are assigned timestamps and executed in timestamp order on each partition.

Joins

Hash Join, Semi Join

Kdb+ supports SQL-standard joins. It also supports the as-of join and the window join.
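To make the as-of join concrete: for each row on the left, it picks the most recent row on the right whose time is at or before the left row's time. The Python sketch below shows that matching rule with a binary search; the `asof_join` function and the sample trades and quotes are hypothetical, and this is not the q implementation.

```python
from bisect import bisect_right


def asof_join(trades, quotes):
    """Attach to each (time, price) trade the latest (time, bid) quote at or before it.

    `quotes` must be sorted by time. Trades with no earlier quote get None.
    """
    quote_times = [t for t, _ in quotes]
    joined = []
    for trade_time, price in trades:
        # Index of the last quote whose time is <= trade_time, or -1 if none.
        i = bisect_right(quote_times, trade_time) - 1
        bid = quotes[i][1] if i >= 0 else None
        joined.append((trade_time, price, bid))
    return joined


quotes = [(1, 99.0), (5, 99.5), (9, 100.0)]
trades = [(0, 98.0), (5, 99.6), (7, 99.7)]

joined = asof_join(trades, quotes)
```

The trade at time 7 matches the quote at time 5 because no quote exists between 5 and 7, and the trade at time 0 has no match because it precedes every quote.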

Logging

Physical Logging

Kdb+ uses physical logging with a write-ahead log (WAL). The in-memory event engine writes new data to a log file to ensure durability.
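The write-ahead idea can be sketched as follows: every new row is appended to a log file and forced to disk before it is applied in memory, so the in-memory state can be rebuilt by replaying the log. The `LoggedTable` class, the JSON line format, and the file name are assumptions made for illustration, not the kdb+ log format.

```python
import json
import os
import tempfile


class LoggedTable:
    def __init__(self, log_path):
        self.log_path = log_path
        self.rows = []  # in-memory state

    def insert(self, row):
        # 1. Append to the log and force it to disk first.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(row) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only then apply the change in memory.
        self.rows.append(row)

    @classmethod
    def recover(cls, log_path):
        """Rebuild the in-memory state by replaying the log from the start."""
        table = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as f:
                table.rows = [json.loads(line) for line in f if line.strip()]
        return table


log_path = os.path.join(tempfile.mkdtemp(), "trades.log")
live = LoggedTable(log_path)
live.insert({"sym": "AAPL", "price": 185.64})
live.insert({"sym": "MSFT", "price": 370.87})

# Simulate a restart: the in-memory rows are gone, only the log file remains.
recovered = LoggedTable.recover(log_path)
```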

Storage Architecture

Hybrid

Kdb+ has both in-memory and on-disk storage. New data is held in memory and old data is flushed to disk. The flush is controlled by the event engine, which by default flushes in-memory data to disk once a day. The rationale for this design is to keep each day's new data in memory so that it can be queried quickly.

Storage Model

Decomposition Storage Model (Columnar)

Kdb+ uses DSM both for in-memory and on-disk storage.
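The difference between row-oriented storage and the decomposition storage model can be shown in a few lines. The sketch below converts a list of row records into one list per column; the `to_columns` helper and the sample rows are hypothetical.

```python
def to_columns(rows):
    """Turn a list of row dicts into a dict of column lists (DSM layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


rows = [
    {"time": 1, "sym": "AAPL", "price": 10.0},
    {"time": 2, "sym": "MSFT", "price": 20.0},
    {"time": 3, "sym": "AAPL", "price": 30.0},
]

columns = to_columns(rows)

# An aggregate over one attribute touches only that column's values.
avg_price = sum(columns["price"]) / len(columns["price"])
```

With this layout, a query over `price` never reads `time` or `sym`, which is the main reason columnar storage suits analytic scans.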

Stored Procedures

Supported

Kdb+ lets users write and store user-defined functions (UDFs) in the q language, in addition to the built-in functions.

System Architecture

Shared-Nothing

Kdb+ uses a Lambda architecture on each node. It has the following properties:

  • Data currently in use is stored in memory, while historical data is stored on disk.

  • New data comes in from streaming sources.

  • The event engine distributes data to downstream subscribers, including the real-time database engine and the streaming query engine.

  • The real-time database writes its contents down to the on-disk historical database for analytic use on a daily basis, controlled by the event engine.
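The properties above can be sketched as a small in-process model: incoming rows land in an in-memory real-time store, an end-of-day step moves them into a date-partitioned historical store, and queries read from both. The `HybridStore` class and its method names are hypothetical, and a plain dictionary stands in for the on-disk historical database.

```python
from collections import defaultdict


class HybridStore:
    def __init__(self):
        self.realtime = []                   # today's rows, held in memory
        self.historical = defaultdict(list)  # stand-in for on-disk date partitions

    def publish(self, row):
        """Accept a new row from a streaming source."""
        self.realtime.append(row)

    def end_of_day(self, date):
        """Move the day's rows into the historical partition and reset memory."""
        self.historical[date].extend(self.realtime)
        self.realtime = []

    def query(self, sym):
        """Return matching rows from historical partitions, then from memory."""
        old = [r for part in self.historical.values() for r in part if r["sym"] == sym]
        new = [r for r in self.realtime if r["sym"] == sym]
        return old + new


store = HybridStore()
store.publish({"sym": "AAPL", "price": 185.0})
store.publish({"sym": "MSFT", "price": 370.0})
store.end_of_day("2024.01.02")
store.publish({"sym": "AAPL", "price": 186.0})

aapl = store.query("AAPL")
```

After the end-of-day step, only the newest row remains in memory while earlier rows are served from the historical partition.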

Website

https://kx.com/

Tech Docs

https://code.kx.com/

Developer

Kx Systems

Country of Origin

US

Start Year

1998

Former Name

kdb

Acquired By

First Derivatives plc

Project Type

Commercial

Supported languages

C, C#, C++, Go, Java, JavaScript, Lua, Matlab, Perl, PHP, Python, R, Scala

Operating Systems

Linux, OS X, Solaris, Windows

Licenses

Proprietary

Wikipedia

https://en.wikipedia.org/wiki/Kdb%2B