Umbra

OLAP OLTP

Umbra is a relational DBMS designed to deliver high performance for both OLAP and OLTP workloads on flash-based storage. For workloads that fit in main memory, Umbra provides the performance of a pure main-memory DBMS, while retaining the scalability of a disk-based system for larger data sets. Umbra is an evolution of HyPer with several new additions, such as:

  • A LeanStore-based buffer manager with variable-sized pages
  • Low-latency query compilation
  • Integration of Worst-Case Optimal Joins
  • Efficient Statistics Maintenance
  • Continuous Views
  • Efficient String Handling
  • User-Defined Operators (UDOs)
  • ArrayQL Support
  • Support for Apache Parquet Files

History

Umbra is the successor to the HyPer project, built at the Technische Universität München (TUM).

Data Model

Relational

Foreign Keys

Supported

Indexes

B+Tree

Isolation Levels

Serializable

Query Compilation

JIT Compilation

Umbra performs Just-In-Time (JIT) compilation of queries into Umbra IR, a custom intermediate representation (IR) similar to LLVM IR but optimized for use in a database system. After generating Umbra IR, the code is lowered using one of two backends:

  1. LLVM
  2. Flying Start

The LLVM backend emits LLVM IR, which is compiled at optimization level -O3. This backend has the longest compilation time but generates the fastest-executing code, making it suitable for long-running queries.

The Flying Start backend emits x86 machine code using AsmJit, generating the code in a single pass. The backend also implements four optimizations: Stack Space Reuse, Machine Register Allocation, Lazy Address Calculation, and Comparison-Branch Fusion. As a result, the code generated by Flying Start performs on par with code generated by LLVM at -O0 (i.e., with optimizations disabled). Flying Start also outperforms interpretation of Umbra IR, making it suitable for all but the longest-running queries.

Umbra supports adaptive execution, pioneered by HyPer, allowing the DBMS to switch execution strategies while processing a single query. Umbra first generates x86 machine code using the Flying Start backend and then switches to the code generated by the LLVM backend for long-running queries.
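The switch between backends can be illustrated with a small sketch. This is not Umbra's code: the function names, the chunk-based hand-off, and the `optimized_ready_after` parameter are all assumptions made for illustration. Two ordinary C++ functions stand in for the Flying Start and LLVM output of the same pipeline, and the "compilation finished" event is simulated by a chunk counter rather than a real background compiler.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Signature shared by both compiled variants of the same pipeline (hypothetical).
using PipelineFn = std::int64_t (*)(const std::int64_t* data, std::size_t n);

// Stand-in for code produced by a fast, low-optimization backend.
std::int64_t sum_quick(const std::int64_t* data, std::size_t n) {
    std::int64_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += data[i];
    return s;
}

// Stand-in for code produced by a slower, optimizing backend
// (manually unrolled here just to make the two variants differ).
std::int64_t sum_optimized(const std::int64_t* data, std::size_t n) {
    std::int64_t s0 = 0, s1 = 0;
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += data[i];
        s1 += data[i + 1];
    }
    if (i < n) s0 += data[i];
    return s0 + s1;
}

struct AdaptiveResult {
    std::int64_t sum;
    std::size_t quick_chunks;      // chunks processed by the quick variant
    std::size_t optimized_chunks;  // chunks processed by the optimized variant
};

// Processes the input in fixed-size chunks. Each chunk reads the currently
// installed function pointer, so the query can change variants mid-flight.
// `optimized_ready_after` simulates the chunk index at which the optimizing
// compile would have finished; a real system would swap the pointer from a
// background compilation thread instead.
AdaptiveResult run_adaptive(const std::vector<std::int64_t>& input,
                            std::size_t chunk_size,
                            std::size_t optimized_ready_after) {
    assert(chunk_size > 0);
    std::atomic<PipelineFn> current{&sum_quick};
    AdaptiveResult result{0, 0, 0};
    std::size_t chunk = 0;
    for (std::size_t off = 0; off < input.size(); off += chunk_size, ++chunk) {
        if (chunk == optimized_ready_after) current.store(&sum_optimized);
        const std::size_t n = std::min(chunk_size, input.size() - off);
        const PipelineFn fn = current.load();
        result.sum += fn(input.data() + off, n);
        if (fn == &sum_quick) {
            ++result.quick_chunks;
        } else {
            ++result.optimized_chunks;
        }
    }
    return result;
}
```

Calling `run_adaptive(values, 3, 2)` on ten values processes the first two chunks with the quick variant and the remaining chunks with the optimized one; a short query that finishes before the hand-off never pays for the optimizing compile.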

Query Execution

Tuple-at-a-Time Model

Umbra uses data-centric query processing, executing operators tuple-at-a-time.
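As a rough illustration of what data-centric, tuple-at-a-time processing means, the sketch below hand-writes the kind of fused loop such an engine might generate for a simple scan, filter, and aggregate pipeline. The `Order` type, the query, and the function name are hypothetical and not taken from Umbra; the point is only that each tuple flows through all operators of the pipeline before the next tuple is read, with no intermediate results materialized between operators.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical row type for illustration; not part of Umbra.
struct Order {
    std::int64_t customer_id;
    std::int64_t amount_cents;
};

// Roughly: SELECT SUM(amount_cents) FROM orders WHERE customer_id = ?
// The scan, selection, and aggregation are fused into one loop, so a tuple
// that passes the filter is added to the aggregate state immediately.
std::int64_t sum_for_customer(const std::vector<Order>& orders,
                              std::int64_t customer_id) {
    std::int64_t sum = 0;                            // aggregate state
    for (const Order& o : orders) {                  // table scan
        if (o.customer_id != customer_id) continue;  // selection
        sum += o.amount_cents;                       // aggregation
    }
    return sum;
}

// Small usage example with hand-checked results.
void example_usage() {
    const std::vector<Order> orders{{1, 100}, {2, 250}, {1, 50}};
    assert(sum_for_customer(orders, 1) == 150);  // 100 + 50
    assert(sum_for_customer(orders, 3) == 0);    // no matching tuples
}
```

An iterator-style engine would instead route each tuple through virtual `next()` calls on separate scan, filter, and aggregate operators; fusing them into one loop lets the tuple's values stay in registers across operator boundaries.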

Query Interface

SQL

Storage Architecture

Hybrid

Website

https://umbra-db.com/

Developer

Technische Universität München

Country of Origin

DE

Start Year

2018

Project Type

Academic

Written in

C++

Derived From

HyPer

Embeds / Uses

LeanStore

Compatible With

PostgreSQL

Operating Systems

Linux

Licenses

Proprietary