AsterixDB

Revision #10 from 2026-06-02 15:56

AsterixDB is a highly parallel, and consequently highly scalable, DBMS. It has its own semi-structured data model, similar to JSON/XML, and a custom query language (SQL++) that retains much of SQL's original functionality while adding new features. It hash-partitions its data and uses a custom query execution engine, Apache Hyracks, so that query plans can exploit this partitioning and run in parallel as much as possible.[05]

Website: https://asterixdb.apache.org[01]
Source Code: https://github.com/apache/asterixdb[02] Accessed: Jul 26, 2026 Last Commit: Jul 23, 2026
Tech Docs: https://ci2.apache.org/projects/asterixdb/index.html[03]
Developers: University of California, Irvine; University of California, Riverside
Governance: Apache Software Foundation
Country of Origin: US
Start Year: 2009 [20]
Coding Agents: Claude [21], Copilot [22]
Project Types: Academic, Open Source
Written in: Java
Operating System: All OS with Java VM
License: Apache v2
Twitter: @AsterixDB[04]

OLAP

History[05]

AsterixDB began as a collaboration among students, faculty, and staff at UC Riverside and UC Irvine. Observing that no open-source parallel DBMS was publicly available at the time, the team set out to fill that gap. The project was inspired by developments in parallel DBMSs, semi-structured data manipulation, and Hadoop, and aimed to combine the best of all three. After its initial open-source release in 2013, the same team continued development until the project was accepted by the Apache Software Foundation in February 2015, after which development moved to the Foundation.

Compression[06]

Naïve (Record-Level)

AsterixDB has two compression options: leaving data uncompressed, or compressing it with Google's Snappy algorithm, which favors speed while keeping a reasonable compression ratio. At the moment, only primary indexes can be compressed.

Concurrency Control[07][08]

Two-Phase Locking (Deadlock Prevention) Two-Phase Locking (Deadlock Detection)

Locks are required only on primary indexes. Secondary index accesses do not take locks; instead, a follow-up primary index lookup verifies the result of each secondary index lookup. Furthermore, locking is done only for lookups, inserts, and deletes; other operations, such as flushing to disk, do not acquire locks. Notably, AsterixDB supports only single-statement transactions. Some more complex SQL++ statements are broken into multiple single-object statements, each of which runs as its own transaction.
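The lock-free secondary lookup followed by a locked primary check can be illustrated with a minimal sketch. Everything here (the `Dataset` class, its method names, and the per-key `threading.Lock`) is a hypothetical simplification for illustration and not AsterixDB's actual lock manager.

```python
import threading
from collections import defaultdict


class Dataset:
    """Toy dataset: one locked primary index, one unlocked secondary index."""

    def __init__(self):
        self.primary = {}                         # primary key -> record
        self.secondary = defaultdict(set)         # secondary key -> primary keys
        self.locks = defaultdict(threading.Lock)  # one lock per primary key

    def insert(self, pk, record, field):
        with self.locks[pk]:                      # lock is taken on the primary key only
            self.primary[pk] = record
            self.secondary[record[field]].add(pk)

    def delete(self, pk):
        with self.locks[pk]:
            # Only the primary entry is removed in this sketch, so the
            # secondary index can still hold a stale entry for this key.
            self.primary.pop(pk, None)

    def lookup_by_secondary(self, field, value):
        results = []
        for pk in list(self.secondary.get(value, ())):  # no lock on this read
            with self.locks[pk]:                  # lock, then re-check in the primary index
                record = self.primary.get(pk)
                if record is not None and record[field] == value:
                    results.append(record)
        return results


ds = Dataset()
ds.insert(1, {"id": 1, "city": "Irvine"}, "city")
ds.insert(2, {"id": 2, "city": "Irvine"}, "city")
ds.delete(2)
print(ds.lookup_by_secondary("city", "Irvine"))
```

The stale secondary entry for key 2 is filtered out by the primary index check, which is the role the primary lookup plays in the description above.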

Data Model[09]

Document / XML

AsterixDB uses its own data representation, the Asterix Data Model (ADM), which is laid out similarly to a JSON object. It currently supports common primitive types (booleans, strings, integers of different sizes, floats/doubles, binary), geometric data types (point, line, rectangle, circle, polygon), and time-based data types (date, time, timestamp, interval, duration). It also supports null/missing values and derived types (objects, arrays, and multisets).

Indexes[03][10]

B+Tree R-Tree Inverted Index (Full Text) Log-Structured Merge Tree

For primary indexes, AsterixDB uses Log-Structured Merge trees (LSM trees), and for secondary indexes, it allows B+trees, R-trees, and inverted keyword indexes. However, these secondary indexes are "LSM-ified" so that LSM operations (inserting, searching, and merging components of the index) can be applied to them. This "LSM-ification" entails coupling the original in-memory secondary index with an in-memory B+tree called the deleted-key B+tree: new updates and inserts go to the original in-memory index, while deletions are recorded by storing the deleted keys in the deleted-key B+tree.
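The pairing of an in-memory index with a deleted-key structure can be sketched as follows. This is a minimal illustration under stated assumptions: a Python `dict` stands in for the in-memory secondary index, a `set` stands in for the deleted-key B+tree, and the class and method names are hypothetical rather than taken from AsterixDB.

```python
class LSMifiedIndex:
    """Toy LSM-ified index: each component pairs entries with deleted keys."""

    def __init__(self):
        self.mem_entries = {}      # stand-in for the in-memory secondary index
        self.mem_deleted = set()   # stand-in for the deleted-key B+tree
        self.disk_components = []  # newest first; each is (entries, deleted_keys)

    def insert(self, key, value):
        self.mem_deleted.discard(key)
        self.mem_entries[key] = value

    def delete(self, key):
        # A delete is recorded as a key in the deleted-key structure.
        self.mem_entries.pop(key, None)
        self.mem_deleted.add(key)

    def flush(self):
        # Freeze the in-memory pair as the newest disk component.
        self.disk_components.insert(
            0, (dict(self.mem_entries), frozenset(self.mem_deleted)))
        self.mem_entries, self.mem_deleted = {}, set()

    def search(self, key):
        # Check components from newest to oldest; a deleted key hides
        # any older entry for the same key.
        components = [(self.mem_entries, self.mem_deleted)] + self.disk_components
        for entries, deleted in components:
            if key in entries:
                return entries[key]
            if key in deleted:
                return None
        return None

    def merge(self):
        # Merge all disk components, keeping only the newest live entry per key.
        merged, seen = {}, set()
        for entries, deleted in self.disk_components:
            for k, v in entries.items():
                if k not in seen:
                    merged[k] = v
            seen |= entries.keys() | deleted
        self.disk_components = [(merged, frozenset())]


idx = LSMifiedIndex()
idx.insert("a", 1)
idx.insert("b", 2)
idx.flush()
idx.delete("a")
print(idx.search("a"), idx.search("b"))
```

Because every disk component is merged at once in this sketch, the deleted keys can be dropped from the merged result; a partial merge would have to carry them forward.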

Joins[11]

Nested Loop Join Hash Join Broadcast Join Index Nested Loop Join

Currently, AsterixDB supports multiple join types (hash, nested loop, and broadcast), but it does not yet have effective statistics or selectivity estimates. Therefore, the default join algorithm is a hash join. For non-equality predicates, such as inequalities or "like" predicates on strings, the default is a nested-loop join, since hashing can only be used for exact matches.
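The default choice described above can be sketched in a few lines. This is an illustrative toy, not AsterixDB's optimizer: the function names are hypothetical, records are plain dictionaries, and the equality join assumes both inputs use the same field name for the join key.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Equality join: build a hash table on one input, probe with the other."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [(l, r) for l in left for r in table.get(l[key], [])]


def nested_loop_join(left, right, predicate):
    """General join: test every pair, so any predicate is allowed."""
    return [(l, r) for l in left for r in right if predicate(l, r)]


def join(left, right, key=None, predicate=None):
    # Hashing only finds exact matches, so it is used only for equality keys;
    # anything else falls back to the nested-loop join.
    if key is not None:
        return hash_join(left, right, key)
    return nested_loop_join(left, right, predicate)


users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
messages = [{"id": 1, "text": "x"}, {"id": 1, "text": "y"}, {"id": 3, "text": "z"}]
print(len(join(users, messages, key="id")))
print(len(join(users, messages, predicate=lambda u, m: u["id"] < m["id"])))
```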

Logging[08]

Logical Logging

AsterixDB uses logical logging at index granularity, meaning each insert, delete, or update to an index generates a single log record. Log records carry log sequence numbers (LSNs) for use during recovery. Unlike the ARIES protocol, where each page has a page LSN that indicates the most recent update applied to that page, AsterixDB's indexes have an index LSN, which indicates the most recent update applied to that particular index.
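A minimal sketch of how an index LSN could be used during redo is shown below. The redo rule (skip any log record whose LSN is not greater than the index LSN) is an assumption made here by analogy with ARIES page LSNs, and the `LogRecord`, `Index`, and `redo` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int
    index_name: str
    op: str            # "insert" or "delete"
    key: str
    value: object = None


@dataclass
class Index:
    name: str
    index_lsn: int = 0                 # LSN of the latest update applied to this index
    entries: dict = field(default_factory=dict)

    def apply(self, rec):
        if rec.op == "insert":
            self.entries[rec.key] = rec.value
        elif rec.op == "delete":
            self.entries.pop(rec.key, None)
        self.index_lsn = rec.lsn


def redo(log, indexes):
    """Replay logical log records that the index has not yet seen."""
    for rec in sorted(log, key=lambda r: r.lsn):
        index = indexes[rec.index_name]
        if rec.lsn > index.index_lsn:  # assumed rule: older records are already reflected
            index.apply(rec)


primary = Index("primary", index_lsn=1, entries={"k1": "v1"})
log = [
    LogRecord(1, "primary", "insert", "k1", "v1"),
    LogRecord(2, "primary", "insert", "k2", "v2"),
    LogRecord(3, "primary", "delete", "k1"),
]
redo(log, {"primary": primary})
print(primary.entries, primary.index_lsn)
```

Each log record describes one logical operation on one index, so a single comparison against the index LSN decides whether it needs to be replayed.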

Query Interface[12][13][14]

Custom API

Initially, AsterixDB used one custom query language, the Asterix Query Language (AQL). This is laid out like JSON but has more flexibility, making it a superset of JSON. However, the most recent documentation lists AQL as deprecated. The currently supported query interface for AsterixDB is SQL++, which is a superset of both SQL and JSON, making it more flexible than AQL.

Storage Architecture[15][16]

Disk-oriented

AsterixDB's primary LSM tree indexes have an in-memory component and a disk-based component, and data is flushed to disk when the in-memory component becomes too full.
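The flush behavior can be sketched as below. This is an illustrative toy, assuming a hypothetical `memory_budget` measured in number of entries; the source does not specify how AsterixDB decides that the in-memory component is too full, and the sorted Python lists only stand in for on-disk components.

```python
import bisect


class LSMTree:
    """Toy LSM tree: one mutable in-memory component plus sorted disk components."""

    def __init__(self, memory_budget=2):
        self.memory_budget = memory_budget   # hypothetical limit on in-memory entries
        self.memory_component = {}
        self.disk_components = []            # newest first; each is (keys, values)

    def put(self, key, value):
        self.memory_component[key] = value
        if len(self.memory_component) >= self.memory_budget:
            self.flush()

    def flush(self):
        # Write the in-memory entries out as one sorted, immutable component.
        if not self.memory_component:
            return
        items = sorted(self.memory_component.items())
        keys = [k for k, _ in items]
        values = [v for _, v in items]
        self.disk_components.insert(0, (keys, values))
        self.memory_component = {}

    def get(self, key):
        # Newest data wins: check memory first, then disk components newest to oldest.
        if key in self.memory_component:
            return self.memory_component[key]
        for keys, values in self.disk_components:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


tree = LSMTree(memory_budget=2)
tree.put("b", 1)
tree.put("a", 2)      # reaches the budget and triggers a flush
tree.put("a", 3)      # newer value stays in memory
print(tree.get("a"), tree.get("b"), len(tree.disk_components))
```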

Storage Model[10]

N-ary Storage Model (Row/Record)

AsterixDB stores data in an NSM layout, hash-partitioning records to different nodes based on their primary keys.
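Hash partitioning on the primary key can be sketched as follows. The hash function (SHA-256 over the key's `repr`) and the function names are assumptions chosen for a deterministic illustration; the source does not say which hash function AsterixDB uses.

```python
import hashlib


def partition_for(primary_key, num_partitions):
    """Map a primary key to a partition number in [0, num_partitions)."""
    digest = hashlib.sha256(repr(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_records(records, key_field, num_partitions):
    """Assign whole records (row layout) to partitions by their primary key."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        partitions[partition_for(record[key_field], num_partitions)].append(record)
    return partitions


records = [{"id": i, "name": f"user{i}"} for i in range(10)]
for number, partition in enumerate(partition_records(records, "id", 3)):
    print(number, [r["id"] for r in partition])
```

Because the partition depends only on the primary key, a lookup by primary key can be sent directly to the single partition that can hold the record.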

Storage Organization[10]

Log-structured

AsterixDB uses a Log-Structured Merge tree (LSM tree) as its primary index. Secondary indexes are also "LSM-ified" to better fit with the system's overall storage organization.

Stored Procedures[17][18]

Supported

Users can define their own functions for AsterixDB in either SQL++ or Java.

System Architecture[19][10]

Shared-Nothing

AsterixDB is a shared-nothing parallel DBMS that uses hash-based partitioning to split data among its nodes. Queries are routed to a single cluster controller, which connects to node controllers and metadata node controllers; these in turn connect to individual nodes.

Views[14]

Not Supported

Neither SQL++ nor AQL has commands to create, delete, or access views. Because AsterixDB supports only those two languages, it has no support for views.

Citations

22 sources

[01] Apache AsterixDB. apache.org. Modified: 2026-02-25. Accessed: 2026-07-15.
[02] GitHub - apache/asterixdb: Mirror of Apache AsterixDB. github.com. Accessed: 2026-06-19.
[03] https://ci2.apache.org/projects/asterixdb/index.html. apache.org. Dead (check archive). Accessed: 2026-06-19.
[04] https://twitter.com/AsterixDB. twitter.com.
[05] About Apache AsterixDB. apache.org. Modified: 2026-02-25. Accessed: 2026-06-07.
[06] Compression in AsterixDB - Apache AsterixDB - Apache Software Foundation. apache.org. Accessed: 2026-06-07.
[07] https://ci.apache.org/projects/asterixdb/sqlpp/primer-sqlpp.html#Transaction_Support. apache.org. Dead (check archive). Accessed: 2026-05-26.
[08] https://asterix.ics.uci.edu/pub/vldb14-storage.pdf#page=5. uci.edu. Modified: 2014-05-05. Accessed: 2026-06-07.
[09] https://ci.apache.org/projects/asterixdb/datamodel.html. apache.org. Dead (check archive). Accessed: 2026-05-26.
[10] https://asterix.ics.uci.edu/pub/vldb14-storage.pdf. uci.edu. Modified: 2014-05-05. Accessed: 2026-06-07.
[11] https://ci.apache.org/projects/asterixdb/sqlpp/primer-sqlpp.html#Query_2-B_-_Index_join. apache.org. Dead (check archive). Accessed: 2026-05-26.
[12] https://ci.apache.org/projects/asterixdb/aql/manual.html. apache.org. Dead (check archive). Accessed: 2026-05-26.
[13] http://forward.ucsd.edu/sqlpp.html. ucsd.edu. Dead (check archive). Accessed: 2026-05-26.
[14] https://ci.apache.org/projects/asterixdb/sqlpp/manual.html. apache.org. Dead (check archive). Accessed: 2026-05-26.
[15] https://asterix.ics.uci.edu/pub/vldb14-storage.pdf#page=2. uci.edu. Modified: 2014-05-05. Accessed: 2026-06-07.
[16] https://ci.apache.org/projects/asterixdb/. apache.org. Dead (check archive). Accessed: 2026-05-26.
[17] https://ci.apache.org/projects/asterixdb/udf.html. apache.org. Dead (check archive). Accessed: 2026-05-26.
[18] https://ci.apache.org/projects/asterixdb/sqlpp/manual.html#Functions. apache.org. Dead (check archive). Accessed: 2026-05-26.
[19] http://www.vldb.org/pvldb/vol7/p1905-alsubaiee.pdf#page=2. vldb.org. Modified: 2014-08-27. Accessed: 2026-06-07.
[20] ASTERIX. uci.edu. Modified: 2022-04-08. Accessed: 2026-06-07.
[21] https://github.com/apache/asterixdb/commit/b248800041ec4174af5358c36803b21bb901d394. github.com. Modified: 2026-05-28. Accessed: 2026-06-25.
[22] https://github.com/apache/asterixdb/commit/25e84088375ddabc746bb8e9f54c3eb154767fb6. github.com. Modified: 2026-03-26. Accessed: 2026-06-25.