HANA

SAP HANA is a column-oriented, in-memory DBMS developed by SAP SE that is designed to handle both OLTP and OLAP workloads. It also supports several kinds of advanced analytics, such as financial prediction, graph-data processing, and text analytics.

History

The early development of SAP HANA was based on the TREX search engine, P*Time, and MaxDB. It was initially designed mainly for real-time data analytics and aggregation. SAP HANA is now also offered as a Platform as a Service on multiple cloud providers. SAP HANA 2, released in 2016, added support for Earth Observation Analysis and Text Analysis, in addition to database and application management.

Stored Procedures

Supported

SAP HANA supports stored procedures, which let clients describe a sequence of data transformations and define it as a reusable processing block. A procedure can be created either with SAP HANA SQL statements or with the Modeler wizard (in the Modeler and Development perspectives). Stored procedures can be parameterized and reused in other procedures.

Indexes

B+Tree, Hash Table

SAP HANA supports several index types, including the B+ tree, the Compressed Prefix B+ tree, and the inverted index.
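As a rough illustration of the inverted index mentioned above, the following minimal sketch maps each distinct column value to the row positions that contain it. The function name `build_inverted_index` and the sample column are hypothetical and do not reflect SAP HANA's internal implementation.

```python
from collections import defaultdict


def build_inverted_index(column):
    """Map each distinct value to the ascending list of row positions holding it."""
    index = defaultdict(list)
    for row_id, value in enumerate(column):
        index[value].append(row_id)
    return dict(index)


# Hypothetical sample column; row positions start at 0.
city = ["Berlin", "Paris", "Berlin", "Rome", "Paris", "Berlin"]
city_index = build_inverted_index(city)

# An equality lookup becomes a single dictionary access instead of a full scan.
berlin_rows = city_index.get("Berlin", [])
```

The trade-off is extra memory for the index in exchange for avoiding a scan on selective equality predicates.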

Query Execution

Vectorized Model

SAP HANA uses a vectorized framework that evaluates complex predicates and classifies them by complexity to improve performance. It is currently applied mainly to scans over compressed in-memory columns, where it speeds up predicate evaluation, unpacking, and result extraction.
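The general idea of a batch-at-a-time scan over a compressed column can be sketched as follows. This is only an illustration under assumptions: the column is assumed to be dictionary-encoded, the names `matching_codes` and `scan` are hypothetical, and plain Python does not use the SIMD instructions a real vectorized engine would rely on.

```python
def matching_codes(dictionary, predicate):
    """Evaluate the predicate once per distinct value, not once per row."""
    return {code for code, value in enumerate(dictionary) if predicate(value)}


def scan(codes, wanted, batch_size=4):
    """Return matching row positions, processing one batch of codes at a time."""
    selected = []
    for start in range(0, len(codes), batch_size):
        batch = codes[start:start + batch_size]
        # The whole batch is filtered in one pass, producing a selection vector.
        selected.extend(
            start + offset for offset, code in enumerate(batch) if code in wanted
        )
    return selected


# Hypothetical dictionary-encoded column: codes index into the dictionary.
dictionary = ["Berlin", "Paris", "Rome"]
codes = [0, 1, 0, 2, 1, 0]

wanted = matching_codes(dictionary, lambda city: city != "Paris")
rows = scan(codes, wanted)
```

Because the predicate is resolved against the dictionary first, the scan itself only compares small integer codes and never unpacks the original values.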

Storage Architecture

In-Memory

SAP HANA is an in-memory database. It keeps data in main memory in a heavily compressed form, which improves performance by avoiding disk I/O. Persistent storage on disk is still required for fault tolerance; backup operations run asynchronously as a background task so that they do not affect performance.

Logging

Physical Logging

As in other in-memory databases, SAP HANA uses persistent storage for logging. Log entries are written separately for each service and contain only the actual payload, without a header. SAP HANA also supports third-party logging tools.
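How a persistent log lets an in-memory store rebuild its state can be sketched generically as follows. The record layout (one JSON object per line) and the helper names are assumptions made for illustration; they are not SAP HANA's log format.

```python
import io
import json


def append_write(log, txn, key, value):
    """Append a redo record describing one write of a transaction."""
    log.write(json.dumps({"txn": txn, "key": key, "value": value}) + "\n")


def append_commit(log, txn):
    """Append a commit marker for a transaction."""
    log.write(json.dumps({"txn": txn, "commit": True}) + "\n")


def replay(log_text):
    """Rebuild the in-memory state from the log, ignoring uncommitted writes."""
    records = [json.loads(line) for line in log_text.splitlines() if line]
    committed = {r["txn"] for r in records if r.get("commit")}
    state = {}
    for record in records:
        if "key" in record and record["txn"] in committed:
            state[record["key"]] = record["value"]
    return state


# An in-memory buffer stands in for the log volume on disk.
log = io.StringIO()
append_write(log, "t1", "x", 1)
append_commit(log, "t1")
append_write(log, "t2", "x", 2)   # t2 never commits
append_write(log, "t2", "y", 5)

recovered = replay(log.getvalue())
```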

Concurrency Control

Multi-version Concurrency Control (MVCC)

SAP HANA uses Multi-version Concurrency Control as its default mechanism for ensuring data consistency. Each user who connects to the database is given a snapshot, and a writer's changes become visible to others only after its transaction commits. No rollback segment is used for inserts. For synchronization, SAP HANA supports both distributed locking with a global deadlock detection mechanism and distributed snapshot isolation. An optional background garbage collection thread is also available.
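The snapshot visibility rule described above can be sketched with a toy multi-version store. The class `VersionedStore` and its integer commit timestamps are hypothetical simplifications; locking, deadlock detection, and garbage collection are left out.

```python
class VersionedStore:
    """Toy multi-version store: each key keeps a list of (commit_ts, value)."""

    def __init__(self):
        self.versions = {}     # key -> list of (commit_ts, value), ascending
        self.last_commit = 0   # timestamp of the most recent commit

    def snapshot(self):
        """A reader's snapshot is the commit timestamp at which it started."""
        return self.last_commit

    def commit(self, writes):
        """Publish all writes of one transaction under a new timestamp."""
        self.last_commit += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((self.last_commit, value))
        return self.last_commit

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


store = VersionedStore()
store.commit({"balance": 100})
reader_ts = store.snapshot()        # a reader takes its snapshot here
store.commit({"balance": 80})       # a later writer commits a new version

old_view = store.read("balance", reader_ts)          # still sees 100
new_view = store.read("balance", store.snapshot())   # sees 80
```

Readers never block writers in this scheme; the cost is that old versions accumulate until something like a garbage collection pass removes the ones no snapshot can still see.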

Data Model

Column Family / Wide-Column

SAP HANA is built on a columnar structure: data is stored column by column, which reduces the storage footprint because repeated values are stored only once. Converting data from other layouts to the columnar layout is comparatively easy in SAP HANA because of its fast in-memory design.
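Storing repeated values only once is commonly achieved with dictionary encoding. The following minimal sketch assumes that technique; the function names and the sample column are hypothetical.

```python
def dictionary_encode(column):
    """Store each distinct value once and replace rows with small integer codes."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Reconstruct the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical sample column with many repeated values.
country = ["DE", "FR", "DE", "DE", "US", "FR"]
dictionary, codes = dictionary_encode(country)
restored = dictionary_decode(dictionary, codes)
```

The fewer distinct values a column has, the smaller the dictionary and the fewer bits each code needs, which is why this layout suits columns far better than whole rows.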

Views

Virtual Views, Materialized Views

SAP HANA supports both materialized and virtual views. When a materialized view is created, the system validates the definition and precomputes the result set, which is stored on disk to improve performance. A virtual view stores no result set; its contents are computed on the fly each time the view is used.
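The difference between the two kinds of view can be sketched as follows. The classes and the explicit `refresh` method are hypothetical; how and when SAP HANA refreshes a materialized view is not covered here.

```python
def total_per_customer(rows):
    """The view definition: total order amount per customer."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return totals


class VirtualView:
    """Stores only the definition and recomputes the result on every read."""

    def __init__(self, query, base):
        self.query, self.base = query, base

    def read(self):
        return self.query(self.base)


class MaterializedView:
    """Precomputes the result at creation and serves the stored copy."""

    def __init__(self, query, base):
        self.query, self.base = query, base
        self.refresh()

    def refresh(self):
        self.result = self.query(self.base)

    def read(self):
        return self.result


orders = [("alice", 30), ("bob", 20), ("alice", 5)]
virtual = VirtualView(total_per_customer, orders)
materialized = MaterializedView(total_per_customer, orders)

orders.append(("bob", 10))            # the base table changes
current = virtual.read()              # reflects the new order immediately
stale = materialized.read()           # still the precomputed result
materialized.refresh()
fresh = materialized.read()           # up to date after the refresh
```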

System Architecture

Shared-Nothing

To scale out, SAP HANA runs on multiple servers in one cluster and partitions the data across them in a shared-nothing architecture. The system has three components. Name servers keep track of where data is located and store the topology of the entire system. Index servers hold the actual data partitions and process the data as required. Statistics servers collect information about the status, performance, and resource consumption of the system.
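The division of work between name servers and index servers can be sketched as follows. Hash partitioning by key is an assumption made for this example, and the class and server names are hypothetical; statistics servers are omitted.

```python
import zlib


class IndexServer:
    """Holds one data partition; shares no data with the other servers."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def put(self, key, row):
        self.rows[key] = row

    def get(self, key):
        return self.rows.get(key)


class NameServer:
    """Knows the topology: which index server owns which partition."""

    def __init__(self, index_servers):
        self.index_servers = index_servers

    def locate(self, key):
        # Assumed scheme: hash the key and map it to one of the partitions.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.index_servers)
        return self.index_servers[partition]


servers = [IndexServer("index-0"), IndexServer("index-1"), IndexServer("index-2")]
names = NameServer(servers)

keys = ["order-1", "order-2", "order-3", "order-4", "order-5"]
for key in keys:
    names.locate(key).put(key, {"id": key})

# A lookup first asks the name server where the row lives, then reads it there.
row = names.locate("order-3").get("order-3")
```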

Isolation Levels

Read Committed

In SAP HANA, the default isolation level is Read Committed, and the strictest level is Serializable. Snapshot Isolation is not supported by default.

Checkpoints

Fuzzy

SAP HANA supports fuzzy checkpoints: storage snapshots include updates even from transactions that have not yet committed, so extra work is needed to remove those updates. SAP HANA also supports encrypted snapshots, provided that the database itself has been encrypted first.
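The extra work of removing uncommitted updates from a fuzzy checkpoint can be sketched as follows. The entry layout, with a `before` field that keeps the previous value, is an assumption made for illustration and is not SAP HANA's snapshot format.

```python
def take_fuzzy_checkpoint(table):
    """Copy the current state, including writes of still-open transactions."""
    return {key: dict(entry) for key, entry in table.items()}


def recover(checkpoint, committed_txns):
    """Undo every update in the checkpoint whose transaction never committed."""
    state = {}
    for key, entry in checkpoint.items():
        if entry["txn"] in committed_txns:
            state[key] = entry["value"]
        elif entry["before"] is not None:
            state[key] = entry["before"]   # roll back to the previous value
        # An uncommitted insert has no previous value and is dropped.
    return state


# Hypothetical in-memory table: each entry records its writer and old value.
table = {
    "a": {"value": 1, "txn": "t1", "before": None},
    "b": {"value": 99, "txn": "t2", "before": 2},    # uncommitted update
    "c": {"value": 7, "txn": "t2", "before": None},  # uncommitted insert
}

checkpoint = take_fuzzy_checkpoint(table)
recovered = recover(checkpoint, committed_txns={"t1"})
```

Taking the checkpoint without waiting for open transactions keeps it cheap; the price is the undo pass shown in `recover`.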

Storage Model

Decomposition Storage Model (Columnar), N-ary Storage Model (Row/Record)

SAP HANA supports both the N-ary Storage Model and the Decomposition Storage Model. It is optimized for DSM, the default storage model, in order to provide high performance on hybrid workloads that mix analytics and transactions. SAP HANA allows row-oriented tables to be joined with column-oriented tables and also supports altering the storage model of an existing table.
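The two storage models, and the conversion between them, can be sketched as follows. The helper names and the sample table are hypothetical.

```python
def to_columns(rows, names):
    """NSM to DSM: turn a list of records into one list per attribute."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def to_rows(columns, names):
    """DSM to NSM: stitch the per-attribute lists back into records."""
    return list(zip(*(columns[name] for name in names)))


names = ("id", "customer", "amount")

# N-ary Storage Model: all attributes of a record are stored together.
rows = [(1, "alice", 30), (2, "bob", 20), (3, "carol", 50)]

# Decomposition Storage Model: each attribute is stored contiguously.
columns = to_columns(rows, names)

total = sum(columns["amount"])   # an aggregate touches a single column
record = rows[1]                 # a point lookup reads one contiguous record
round_trip = to_rows(columns, names)
```

The aggregate shows why DSM favors analytics, while the point lookup shows why NSM favors transactional access to whole records.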

Query Compilation

JIT Compilation

SAP HANA uses LLVM JIT as the compilation backend for both stored procedures and query plans. An in-house language called "Llang" is used to make generating LLVM IR easier. The key challenges SAP HANA has to handle are SQL semantics, compile time, the degree of parallelism, and the requirement that compiled code never crash.

Query Interface

Custom API, SQL

In addition to SQL, SAP HANA supports SQLScript, an in-house scripting language similar to stored procedure languages, which makes complex queries more convenient to write. SAP HANA also supports MDX (Multidimensional Expressions), which is used to connect analytics applications such as financial planning.

Website

https://www.sap.com/products/hana.html

Tech Docs

https://www.sap.com/developer/topics/sap-hana.resources.html#resources

Developer

SAP SE

Country of Origin

DE

Start Year

2010

Former Name

SAP High-Performance Analytic Appliance

Project Type

Commercial

Written in

C++

Supported languages

C++, JavaScript, PL/SQL, R, SQL

Derived From

MaxDB, P*TIME

Operating Systems

Linux

Licenses

Proprietary