Kudu

View Current Viewing Revision #4 from 12/12/2018 10 p.m.

Apache Kudu is an open source storage engine for structured data that is part of the Apache Hadoop ecosystem. The primary intention of Kudu is to allow applications to perform fast big data analytics on rapidly changing data. It was designed for fast performance for OLAP queries. Built for distributed workloads, Apache Kudu allows for various partitioning of data across multiple servers. Also, being a part of the Hadoop ecosystem, Kudu is supports the use of Apache data processing frameworks like Spark, Impala or MapReduce on its tables.

History

Prior to Kudu, most data storage engines were able to store one type of structured data, static or mutable. Storage engines for static data were unable to make changes to individual records while storage engines for mutable data had a low throughput for sequential reads. Because of this developers typically used two different storage engines for first mutating their data and then performing analytics. Apache Kudu was designed to support both data formats and provide both high throughput sequential-access and random-access queries. Kudu was developed as internal project at Cloudera and become open to the public in September 2016.

Query Execution

Vectorized Model Materialized Model

Kudu executes both reads and writes on batches of rows which are actually columnar in-memory (as they are on disk). If there are predicates in the query, lazy materialization is used.

Storage Architecture

Disk-oriented

Although, an experimental version of Kudu does rely on persistent memory in a blocked cache, Kudu is primarily disk-oriented.

Logging

Logical Logging

Kudu employs a logical log of operations (insert/update/delete). The operation logs are replicated for each tablet (partitions of the database). An operation log for each tablet is stored separately from the physical data. Operation logs are not able to be replayed or shipped. For single row operations (on a single tablet) Kudu is ACID. However for multi-row operations (on a single tablet), the Atomic property is not fully preserved since a single failed write will not rollback the entire operation, so per-row errors are possible. Multi-tablet operation are currently no possible with Kudu.

Data Model

Relational

Kudu is a relational database. Unlike traditional relation databases, Kudu also utilizes partitioning data into tablets that are stored on individual servers. All rows within a tablet are ordered by a primary key.

Foreign Keys

Not Supported

As of now, Kudu does not support foreign keys.

Concurrency Control

Multi-version Concurrency Control (MVCC)

Kudu employs MVCC. Kudu uses an optimistic concurrency model in which readers don't block writers and writes don't block readers. As a result less lock acquisitions are needed during large table scans.

Query Interface

Custom API

Kudu has No-SQL client APIs for C++, Java and Python. Kudu can also be used with SQL-based query processing interfaces like Hadoop's Impact, MapReduce and Spark.

Checkpoints

Not Supported

Indexes

Not Supported

Currently Kudu does not support any additional indexes (aside from the primary index). As an alternative Kudu provides the capability to partition hash or range partition the data for quicker access.

System Architecture

Shared-Nothing

Kudu is designed for distributed work tasks as a result the system architecture is a shared-nothing master-slave architecture. The database is horizontally partitioned (so each row is in the same partition) into tablets. Each tablet is replicated (typically 3 or 5 times) and each of these instance is stored on its own tablet server. Each tablet server has multiple unique tablets. For each tablet, one of the servers at which the tablet is stored is the leader and the rest of the servers are the followers. Leaders are elected using Raft consensus. Reads for a specific tablet can be done from any one of the tablet servers that store that tablet. Writes are only sent to the leader tablet server and whether the operation is accepted or not is determined by Raft consensus.

Storage Model

Decomposition Storage Model (Columnar)

Because Kudu is designed primarily for OLAP queries a Decomposition Storage Model is used. Kudu's columnar data storage model allows it to avoid unnecessarily reading entire rows for analytical queries. The strong-type (immutable type) requirement for the columns also allow for compression of a single data type.

Compression

Dictionary Encoding Run-Length Encoding Bit Packing / Mostly Encoding Prefix Compression

Each column in a Kudu table can be encoded in certain ways based on the type of that column. By default, bit packing is used for various int, double and float column types, run-length encoding is used for bool column types and dictionary-encoding for string or binary column types. By default Kudu doesn't compress columns but it supports per-column compression using LZ4, Snappy or zlib compression codecs.

Joins

Not Supported

Joins just with Kudu are not supported. However when used with a query engine like Impala, Kudu tables can be joined with other tables stored in the Hadoop storage system.

Isolation Levels

Read Committed Snapshot Isolation Repeatable Read

Kudu allows for the user to set two different read operation modes through the Kudu API clients. By default, Kudu uses the 'Read Committed' isolation level. For the 'Snapshot Isolation' level the timestamp can either be set explicitly by the user or assigned by a server. The 'Snapshot Isolation' read level allows for consistent and repeatable reads.

Revision #4 | Updated 12/12/2018 10 p.m.