- Source Code
- https://github.com/apache/asterixdb [02]
- Developer
- Country of Origin
- US
- Start Year
- 2009 [18]
- Project Types
- Academic, Open Source
- Written in
- Java
- Operating System
- All OS with Java VM
- License
- Apache v2
Compression[03]
AsterixDB has two options for compression: leaving data uncompressed, or using the Google Snappy naive compression algorithm, which focuses on maximizing speed while keeping a reasonable compression ratio. At the moment, only primary indexes can be compressed.
Concurrency Control[04][05]
Locks are required only on primary indexes. Secondary index accesses do not acquire locks; instead, primary index lookups verify the integrity of secondary index lookups. Furthermore, locks are acquired only for lookups, inserts, and deletes; other operations, such as flushing to disk, do not acquire locks. Notably, AsterixDB supports only single-statement transactions, and some more complicated SQL++ statements are broken into multiple single-object statements and are therefore executed as multiple transactions.
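The lock-free secondary lookup followed by a locked primary check can be sketched roughly as follows. This is only an illustration of the idea, not AsterixDB's code: the `Dataset` class, the `by_city` secondary index, and the single coarse lock (standing in for per-key primary index locks) are all assumptions made for the example.

```python
import threading

class Dataset:
    """Toy dataset with one primary index and one secondary index (hypothetical)."""

    def __init__(self):
        self.primary = {}             # primary key -> record
        self.by_city = {}             # secondary index: city -> set of primary keys
        self.lock = threading.Lock()  # coarse stand-in for primary index locks

    def upsert(self, pk, record):
        # Inserts lock the primary index and maintain the secondary index.
        with self.lock:
            old = self.primary.get(pk)
            if old is not None:
                self.by_city[old["city"]].discard(pk)
            self.primary[pk] = record
            self.by_city.setdefault(record["city"], set()).add(pk)

    def find_by_city(self, city):
        # Step 1: read the secondary index without taking any lock.
        candidates = sorted(self.by_city.get(city, ()))
        results = []
        for pk in candidates:
            # Step 2: locked primary lookup validates each candidate.
            with self.lock:
                rec = self.primary.get(pk)
            if rec is not None and rec["city"] == city:
                results.append(rec)
        return results

ds = Dataset()
ds.upsert(1, {"id": 1, "city": "Irvine"})
ds.upsert(2, {"id": 2, "city": "Austin"})
# Simulate a stale secondary entry that the primary lookup must filter out.
ds.by_city["Irvine"].add(2)
found = ds.find_by_city("Irvine")
```

Because the secondary read is unprotected, it may return candidates that no longer match; the locked primary lookup is what makes the final result trustworthy.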
Data Model[06]
AsterixDB uses its own data representation, the Asterix Data Model (ADM). It is laid out similarly to a JSON object but has more flexibility, making it a superset of JSON. It currently supports common primitive types (booleans, strings, ints of different sizes, floats/doubles, binary) as well as geometric data types (point, line, rectangle, circle, polygon) and time-based data types (date, time, timestamp, interval, duration). It supports null/missing values, as well as derived types (objects, arrays, and multisets).
Indexes[07][08]
For primary indexes, AsterixDB uses Log-Structured Merge trees (LSM trees), and for secondary indexes, it allows B+trees, R trees, and inverted keyword indexes. However, these secondary indexes are “LSM-ified” to make their properties more like those of an LSM tree.
Joins[09]
Currently, AsterixDB supports multiple join types (hash, nested loop, and broadcast), but it does not yet have effective statistics or selectivity estimates. Therefore, the default join algorithm is a hash join. For non-equality predicates, such as inequalities or “like” predicates on strings, the default is a nested-loop join, since hashing can only be used for exact matches.
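The difference between the two defaults can be illustrated with a minimal sketch. The function names and the sample `users`/`messages` records are hypothetical and chosen only for this example; they do not reflect AsterixDB's internal operators.

```python
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    """Equi-join: build a hash table on one input, then probe it with the other."""
    table = defaultdict(list)
    for l in left:
        table[l[left_key]].append(l)
    # Probing only finds exact key matches, which is why hashing needs equality.
    return [(l, r) for r in right for l in table.get(r[right_key], [])]

def nested_loop_join(left, right, predicate):
    """General join: test every pair, so any predicate can be evaluated."""
    return [(l, r) for l in left for r in right if predicate(l, r)]

users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
messages = [
    {"author": 1, "text": "hi"},
    {"author": 1, "text": "yo"},
    {"author": 3, "text": "?"},
]

# Equality predicate: a hash join applies.
equi = hash_join(users, messages, "id", "author")
# Inequality predicate: fall back to comparing every pair.
theta = nested_loop_join(users, messages, lambda u, m: u["id"] < m["author"])
```

The hash join touches each input once, while the nested-loop join does work proportional to the product of the input sizes, which is why it is reserved for predicates that hashing cannot handle.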
Logging[05]
AsterixDB uses logical logging at an index granularity, meaning each insert, delete, or update to an index generates a single log record. Log records carry log sequence numbers (LSNs) for use during the recovery phase. Unlike the ARIES protocol, where pages have page LSNs that indicate the most recent update applied to each page, AsterixDB’s indexes have an index LSN, which indicates the most recent update applied to that particular index.
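One way an index LSN could be used during recovery is sketched below. The `LogRecord` and `Index` classes and the `redo` function are hypothetical, and the rule of skipping records whose LSN is not greater than the index LSN is an assumption borrowed from how page LSNs are used in ARIES-style redo, not a description of AsterixDB's actual recovery code.

```python
from dataclasses import dataclass, field

@dataclass
class LogRecord:
    lsn: int
    index_name: str
    op: str              # "upsert" or "delete"
    key: str
    value: object = None

@dataclass
class Index:
    name: str
    data: dict = field(default_factory=dict)
    index_lsn: int = 0   # LSN of the most recent update applied to this index

    def apply(self, rec):
        if rec.op == "upsert":
            self.data[rec.key] = rec.value
        elif rec.op == "delete":
            self.data.pop(rec.key, None)
        self.index_lsn = rec.lsn

def redo(log, indexes):
    """Replay the log in LSN order, skipping updates an index already reflects."""
    replayed = []
    for rec in sorted(log, key=lambda r: r.lsn):
        idx = indexes[rec.index_name]
        if rec.lsn > idx.index_lsn:   # assumed skip rule, see lead-in
            idx.apply(rec)
            replayed.append(rec.lsn)
    return replayed

log = [
    LogRecord(1, "primary", "upsert", "k1", "a"),
    LogRecord(2, "primary", "upsert", "k2", "b"),
    LogRecord(3, "primary", "delete", "k1"),
]
# Hypothetical crash state: the persisted index reflects the log up to LSN 2.
primary = Index("primary", {"k1": "a", "k2": "b"}, index_lsn=2)
replayed = redo(log, {"primary": primary})
```

In this sketch only the record with LSN 3 is replayed, because a single comparison against the index LSN shows that the first two updates are already reflected.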
Query Interface[10][11][12]
Initially, AsterixDB used one custom query language, the Asterix Query Language (AQL). This is laid out like JSON but has more flexibility, making it a superset of JSON. However, the most recent documentation lists AQL as deprecated. The currently supported query interface for AsterixDB is SQL++, which is a superset of both SQL and JSON, making it more flexible than AQL.
Storage Architecture[13][14]
Since AsterixDB is a Big Data Management System, its primary LSM tree indexes have an in-memory component and a disk-based component, and data is flushed to disk when the in-memory component becomes too full.
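The split between an in-memory component and flushed disk components can be sketched as a toy structure. The `TinyLSM` class, its entry-count `memory_budget`, and the use of in-process tuples to stand in for on-disk files are all assumptions for illustration; merging of disk components is omitted.

```python
class TinyLSM:
    """Toy LSM index: one mutable in-memory component plus immutable flushed ones."""

    def __init__(self, memory_budget=2):
        self.memory_budget = memory_budget  # hypothetical threshold, in entries
        self.memory = {}                    # mutable in-memory component
        self.disk = []                      # flushed components, newest first

    def put(self, key, value):
        self.memory[key] = value
        if len(self.memory) >= self.memory_budget:
            self.flush()

    def flush(self):
        # Write the in-memory component out as a sorted, immutable component.
        if self.memory:
            self.disk.insert(0, tuple(sorted(self.memory.items())))
            self.memory = {}

    def get(self, key):
        # Check the newest data first: memory, then disk components in order.
        if key in self.memory:
            return self.memory[key]
        for component in self.disk:
            for k, v in component:
                if k == key:
                    return v
        return None

lsm = TinyLSM(memory_budget=2)
lsm.put("a", 1)
lsm.put("b", 2)  # budget reached, so the memory component is flushed
lsm.put("a", 3)  # a newer version of "a" now lives in memory
```

Reads consult components from newest to oldest, so the in-memory value of `"a"` shadows the older flushed one without the flushed component ever being modified.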
Storage Model[08]
AsterixDB stores data in an NSM layout, hash-partitioning records to different nodes based on their primary keys.
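Hash partitioning of whole records by primary key can be sketched as follows. The helper names are hypothetical, and SHA-256 is used here only because it gives a stable hash across Python runs; it is an assumption for the example and not the hash function AsterixDB uses.

```python
import hashlib

def partition_for(primary_key, num_partitions):
    """Map a primary key to a partition number using a stable hash."""
    digest = hashlib.sha256(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def partition_records(records, key_field, num_partitions):
    """Place each whole record (row-oriented, NSM style) into one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for rec in records:
        partitions[partition_for(rec[key_field], num_partitions)].append(rec)
    return partitions

records = [{"id": i, "name": f"user{i}"} for i in range(10)]
parts = partition_records(records, "id", 3)
```

Because the partition depends only on the primary key, a lookup by key can be sent directly to the single partition that may hold the record.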
Storage Organization[08]
AsterixDB uses a Log-Structured Merge tree (LSM tree) as its primary index. Secondary indexes are also "LSM-ified" to better fit with the system's overall storage organization.
Stored Procedures[15][16]
Users can define their own functions for AsterixDB in either SQL++ or Java.
System Architecture[17][08]
AsterixDB is a shared-nothing parallel DBMS that uses hash-based partitioning to split data among various nodes. Queries are routed to a single cluster controller. This then connects to node controllers and metadata node controllers, which in turn connect to individual nodes.
Views[12]
Neither SQL++ nor AQL has commands to create, delete, or access views, and as AsterixDB only supports those two languages, it has no support for views.
Citations
- Apache AsterixDB apache.org
- GitHub - apache/asterixdb: Mirror of Apache AsterixDB · GitHub github.com
- Compression in AsterixDB - Apache AsterixDB - Apache Software Foundation apache.org
- https://ci.apache.org/projects/asterixdb/sqlpp/primer-sqlpp.html#Transaction_Support apache.org
- https://asterix.ics.uci.edu/pub/vldb14-storage.pdf#page=5 uci.edu
- https://ci.apache.org/projects/asterixdb/datamodel.html apache.org
- https://ci2.apache.org/projects/asterixdb/index.html apache.org
- https://asterix.ics.uci.edu/pub/vldb14-storage.pdf uci.edu
- https://ci.apache.org/projects/asterixdb/sqlpp/primer-sqlpp.html#Query_2-B_-_Index_join apache.org
- https://ci.apache.org/projects/asterixdb/aql/manual.html apache.org
- http://forward.ucsd.edu/sqlpp.html ucsd.edu
- https://ci.apache.org/projects/asterixdb/sqlpp/manual.html apache.org
- https://asterix.ics.uci.edu/pub/vldb14-storage.pdf#page=2 uci.edu
- https://ci.apache.org/projects/asterixdb/ apache.org
- https://ci.apache.org/projects/asterixdb/udf.html apache.org
- https://ci.apache.org/projects/asterixdb/sqlpp/manual.html#Functions apache.org
- http://www.vldb.org/pvldb/vol7/p1905-alsubaiee.pdf#page=2 vldb.org
- ASTERIX uci.edu