Kinetica

Kinetica is a distributed, GPU-accelerated database with filtering, visualization, and aggregation functionality.

History

In 2009, Amit Vij and Nima Negahban founded GIS Federal and developed database software they called GPUdb. On March 3, 2016, the company was renamed GPUdb to match the name of the software.

Compression

Dictionary Encoding

Kinetica supports data compression on a per-column basis. Dictionary encoding can be applied to individual columns of restricted-length string (charN), int, or long type. During query execution and data modification, Kinetica can temporarily decompress a copy of the column and discard that copy afterward.
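The idea behind dictionary encoding can be illustrated with a minimal sketch. This is a generic illustration of the technique, not Kinetica's internal implementation; the function names and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Return (dictionary, codes) for one column of values."""
    dictionary = []   # code -> distinct value, in order of first appearance
    positions = {}    # distinct value -> code
    codes = []        # one small integer per row
    for value in values:
        code = positions.get(value)
        if code is None:
            code = len(dictionary)
            positions[value] = code
            dictionary.append(value)
        codes.append(code)
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical low-cardinality column, the case where encoding pays off.
column = ["US", "DE", "US", "US", "FR", "DE"]
dictionary, codes = dictionary_encode(column)
decoded = dictionary_decode(dictionary, codes)
```

Each distinct value is stored once, and every row keeps only a small integer code, which is why the scheme works best on columns with few distinct values.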

Data Model

Column Family

Kinetica uses a column-oriented design that provides efficient data storage and fast query performance for analytical datasets.

Hardware Acceleration

GPU

Kinetica uses GPUs to perform equijoins (sort-merge), predicate joins (nested loop), fixed-length string processing, aggregation and window functions, and rendering, because GPUs are well suited to SIMT (single instruction, multiple threads) workloads over simple data structures. To use GPUs more efficiently, Kinetica encourages data locality to minimize data movement from CPU to GPU.

Indexes

B+Tree, Hash Table

Kinetica uses primary key indexes, relational indexes, and column indexes to improve data access performance. A primary key index is created by default when a table is created with a primary key specified. The primary key index is hash-based and optimizes the performance of equality-based filter expressions. A relational index is created as the result of applying a foreign key to a column. A column index can be applied to a column in a table or view to improve the performance of operations applied to that column in an expression. The column index is implemented as a B-tree, which provides performance improvements for both equality-based and range-based filter criteria on individual columns. Column indexes can also be applied to primary key columns.
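The difference between a hash-based index and an ordered index can be sketched as follows. This is an illustration only: a Python dict stands in for the hash index and a sorted list searched with bisect stands in for the tree-structured column index. The table contents and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical table: "id" is the primary key, "price" gets a column index.
rows = [
    {"id": 1, "price": 30},
    {"id": 2, "price": 10},
    {"id": 3, "price": 20},
    {"id": 4, "price": 20},
]

# Hash-based index: key -> row position. Supports equality lookups only.
pk_index = {row["id"]: pos for pos, row in enumerate(rows)}

# Ordered index: (key, row position) pairs kept sorted by key.
# A sorted list is used here as a stand-in for a tree structure.
sorted_entries = sorted((row["price"], pos) for pos, row in enumerate(rows))
sorted_keys = [key for key, _ in sorted_entries]


def lookup_by_id(key):
    """Equality lookup through the hash index."""
    pos = pk_index.get(key)
    return None if pos is None else rows[pos]


def range_scan(low, high):
    """Return ids of rows with low <= price <= high using the ordered index."""
    start = bisect_left(sorted_keys, low)
    stop = bisect_right(sorted_keys, high)
    return [rows[pos]["id"] for _, pos in sorted_entries[start:stop]]
```

The hash index answers `lookup_by_id(3)` in one probe but cannot serve a range predicate, whereas the ordered index serves both `range_scan(20, 20)` and `range_scan(15, 25)` by locating the two boundaries and reading the entries in between.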

Joins

Nested Loop Join, Sort-Merge Join, Index Nested Loop Join

Kinetica supports the SQL concept of joining data sets. Because Kinetica relies on GPU acceleration, tables being joined must either be replicated or be sharded on the columns used in the join, which avoids communication between nodes. Consequently, distributed joins, that is, joins that connect sharded tables on columns other than their shard keys, are not supported. A join in Kinetica creates a join view, which can be refreshed and used in subsequent filtering operations.
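The co-location rule can be expressed as a small check. The sketch below is a simplified reading of the rule stated above, not Kinetica's planner logic: it assumes a join can run locally when at least one side is replicated, or when both sides are sharded exactly on their join columns. The `TableInfo` type, the function name, and the sample tables are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical description of how a table is distributed."""
    name: str
    replicated: bool = False
    shard_key: tuple = ()


def can_join_locally(left, left_cols, right, right_cols):
    """Return True if matching rows are guaranteed to sit on the same node."""
    # A replicated table has a full copy on every node, so any join is local.
    if left.replicated or right.replicated:
        return True
    # Otherwise both tables must be sharded on the columns being joined.
    return (
        bool(left.shard_key)
        and left.shard_key == tuple(left_cols)
        and right.shard_key == tuple(right_cols)
    )


orders = TableInfo("orders", shard_key=("customer_id",))
customers = TableInfo("customers", shard_key=("id",))
products = TableInfo("products", shard_key=("sku",))
countries = TableInfo("countries", replicated=True)
```

Under these assumptions, joining `orders` to `customers` on `customer_id = id` is local, joining `orders` to `products` on a non-shard-key column is not, and any join against the replicated `countries` table is local.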

Logging

Not Supported

Query Compilation

Not Supported

Query Execution

Vectorized Model

Kinetica plans a query before executing it in order to choose an efficient execution strategy. The planner can also use user-supplied query hints and existing column indexes to improve the plan. Kinetica applies the query to each chunk of data, and the chunk results are merged hierarchically to produce the final result.
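Chunk-wise execution with a hierarchical merge can be sketched for a simple average. This is a generic illustration under stated assumptions: the chunk size, the pairwise merge order, and all function names are hypothetical and do not describe Kinetica's actual execution engine.

```python
def chunked(values, size):
    """Split a column into fixed-size chunks (the last one may be shorter)."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_aggregate(chunk):
    """Per-chunk partial result for AVG: (sum, count)."""
    return (sum(chunk), len(chunk))


def merge(a, b):
    """Combine two partial results into one."""
    return (a[0] + b[0], a[1] + b[1])


def merge_hierarchically(partials):
    """Merge partial results pairwise, level by level, like a reduction tree.

    Expects at least one partial result.
    """
    level = list(partials)
    while len(level) > 1:
        merged = [merge(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd one out moves up unchanged
            merged.append(level[-1])
        level = merged
    return level[0]


values = list(range(1, 11))                        # 1..10
partials = [partial_aggregate(c) for c in chunked(values, 3)]
total, count = merge_hierarchically(partials)
average = total / count
```

Because each chunk produces a small, mergeable partial result, the per-chunk work is independent and only the partial results need to be combined.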

Query Interface

SQL, HTTP / REST

Kinetica is an ODBC-compatible database that supports ANSI SQL-92 compliant syntax. In addition, its native API can be accessed through RESTful HTTP endpoints using either JSON or Avro serialization.
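A JSON request to such an endpoint might be assembled as in the sketch below. The endpoint path `/execute/sql`, the host and port, the payload field names, and the table name are all assumptions made for illustration and should be checked against the Kinetica documentation. The code only builds the request object and does not send it.

```python
import json
import urllib.request


def build_sql_request(base_url, statement, limit=100):
    """Build (but do not send) a JSON POST request carrying a SQL statement.

    The endpoint path and payload fields are assumptions, not a verified
    description of the Kinetica REST API.
    """
    payload = {
        "statement": statement,   # assumed field name
        "offset": 0,              # assumed field name
        "limit": limit,           # assumed field name
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/execute/sql",   # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, port, and table name.
request = build_sql_request(
    "http://localhost:9191", "SELECT COUNT(*) FROM example_table"
)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` against a running server, with whatever authentication the deployment requires.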

Storage Architecture

In-Memory

To optimize throughput and deliver fast query processing, Kinetica runs completely in memory. It makes use of both RAM and VRAM (the memory on GPU cards). Hot data is kept in VRAM to optimize data access and avoid data movement between RAM and VRAM. Kinetica needs a warm-up phase to load data from disk into memory.

Stored Procedures

Supported

Kinetica supports user-defined functions (UDFs) via a mechanism similar to stored procedures: a UDF is a user-defined sequence of operations on a specified data set. Kinetica supports both distributed and non-distributed UDFs. A distributed UDF runs as one OS process per processing node in Kinetica; a non-distributed UDF runs as a single OS process.

System Architecture

Shared-Nothing

Kinetica is a distributed database system. Its main node structure is called a rank. Because Kinetica is GPU-accelerated, each rank is paired with a GPU. The first rank is the head rank, which runs an HTTP server that receives REST API requests from clients and keeps metadata. The other ranks are called workers; they keep columnar data in memory and process queries on their own GPUs.
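How records might be spread across worker ranks in a shared-nothing layout can be sketched as follows. The source does not state how Kinetica assigns rows to ranks, so the hash-based placement, the CRC32 hash, the rank numbering (0 for the head, 1 and up for workers), and all names here are assumptions for illustration.

```python
import zlib


def worker_for_key(shard_key, num_workers):
    """Map a shard key to a worker rank in 1..num_workers.

    Rank 0 is reserved for the head rank in this sketch. CRC32 is used only
    because it is deterministic across runs; it is an assumed choice.
    """
    digest = zlib.crc32(str(shard_key).encode("utf-8"))
    return 1 + digest % num_workers


def distribute(records, key_field, num_workers):
    """Group records by the worker rank that would own them."""
    placement = {rank: [] for rank in range(1, num_workers + 1)}
    for record in records:
        rank = worker_for_key(record[key_field], num_workers)
        placement[rank].append(record)
    return placement


# Hypothetical records sharded on "customer_id" across three workers.
records = [{"customer_id": i % 5, "amount": i} for i in range(20)]
placement = distribute(records, "customer_id", num_workers=3)
```

Because the owning rank depends only on the shard key, all records with the same key land on the same worker, which is the property that lets each worker process its share without contacting the others.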

Website

https://www.kinetica.com/

Tech Docs

https://www.kinetica.com/docs/index.html

Former Name

GPUdb

Developer

Kinetica DB Inc.

Country of Origin

US

Start Year

2009

Project Type

Commercial

Written in

C, C++

Supported languages

C#, C++, Java, JavaScript, Python

Operating Systems

Linux

Licenses

Proprietary

Wikipedia

https://en.wikipedia.org/wiki/Kinetica_(software)