DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

Aerospike


Aerospike is a distributed key-value DBMS. It is mainly targeted at OLTP workloads with a large number of transactions. It is developed by a company of the same name.[05][06]

Source Code: https://github.com/aerospike/aerospike-server[02]
Country of Origin: US
Start Year: 2009 [05]
Former Name: Citrusleaf
Project Types: Commercial, Open Source
Written in: C
Supported Languages: C, C#, C++, Go, PHP, Python, Ruby, Rust
Operating System: Linux
License: AGPL v3

History[05][06][07][08]


Aerospike, originally known as Citrusleaf, was released in 2010 by a company called Aerospike. In 2012, the database was rebranded to Aerospike, to match the company name. In 2014, the database was open-sourced.

Checkpoints[09][10][11][08]


Checkpoints serve to back up nodes and can be invoked from a command-line API, at either the namespace or the set level. A backup is formed by scanning the entire namespace or set and writing the result to disk, and it is restored during recovery. If a node's content cannot be restored from disk, it can be restored from a replica node.
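The scan-and-write backup cycle can be illustrated with a minimal Python sketch. It does not use Aerospike's backup tools or file format: the in-memory dictionary, the JSON-lines layout, and the names backup_set and restore_set are assumptions made for illustration only.

```python
import json
import os
import tempfile


def backup_set(records, path):
    """Scan every record and write one JSON line per record to disk."""
    with open(path, "w", encoding="utf-8") as f:
        for key, bins in records.items():
            f.write(json.dumps({"key": key, "bins": bins}) + "\n")


def restore_set(path):
    """Rebuild the key -> bins mapping from a backup file."""
    restored = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            restored[entry["key"]] = entry["bins"]
    return restored


# Hypothetical in-memory stand-in for one set inside a namespace.
users = {"u1": {"name": "Ada", "age": 36}, "u2": {"name": "Lin", "age": 41}}

backup_path = os.path.join(tempfile.mkdtemp(), "users.backup")
backup_set(users, backup_path)
recovered = restore_set(backup_path)
```

A real backup would also need to carry record metadata and cope with writes that arrive while the scan is running; the sketch ignores both.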

Compression[12]


Aerospike supports LZ4, Snappy, and Zstandard compression. Compression is applied when data is written to disk or SSD and is used only to reduce the size of data kept on persistent storage. In-memory data is not compressed.
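The split between uncompressed in-memory data and compressed persistent data can be sketched as follows. This is only an illustration: zlib from the Python standard library stands in for LZ4, Snappy, or Zstandard, and the TieredStore class with its two dictionaries is a hypothetical model, not Aerospike's storage engine.

```python
import zlib


class TieredStore:
    """Keeps raw bytes in memory and compressed bytes on a simulated device."""

    def __init__(self):
        self.memory = {}  # in-memory copy, never compressed
        self.device = {}  # stands in for disk or SSD

    def write(self, key, value):
        self.memory[key] = value
        # Compression happens only on the path to persistent storage.
        self.device[key] = zlib.compress(value)

    def read_from_device(self, key):
        return zlib.decompress(self.device[key])


store = TieredStore()
payload = b"aerospike " * 100
store.write("k1", payload)
```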

Concurrency Control[13][14][15][08][16][12]


Aerospike avoids deadlock by supporting transactions only at the record level. Because of this, in most cases a transaction can read a record while another transaction reads or writes that record. A transaction can be blocked if the record it is operating on may be updated from another node as part of recovery. When running at higher consistency levels, reads may block while a write to the record is being replicated. In these cases, transactions are placed in a fixed-length queue and are aborted if the queue overflows.
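The queue-and-abort behaviour described above can be sketched in Python with a bounded queue. The class names, the queue length, and the exception are hypothetical; the sketch shows only the overflow rule, not Aerospike's internal scheduling.

```python
import queue


class TransactionAborted(Exception):
    """Raised when a blocked transaction cannot be queued."""


class RecordWaitQueue:
    """Fixed-length queue of transactions waiting on a blocked record."""

    def __init__(self, max_pending):
        self._pending = queue.Queue(maxsize=max_pending)

    def enqueue(self, txn_id):
        try:
            self._pending.put_nowait(txn_id)
        except queue.Full:
            # Overflow: abort the transaction instead of waiting.
            raise TransactionAborted(f"queue full, aborting {txn_id}") from None

    def drain(self):
        """Return the waiting transactions in arrival order."""
        drained = []
        while not self._pending.empty():
            drained.append(self._pending.get_nowait())
        return drained


waiting = RecordWaitQueue(max_pending=2)
waiting.enqueue("t1")
waiting.enqueue("t2")
try:
    waiting.enqueue("t3")
    aborted = False
except TransactionAborted:
    aborted = True
```

Rejecting work once the queue is full keeps the wait time of queued transactions bounded, at the cost of pushing retries back to the client.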

Consistency guarantees are set at the client level and can be further refined by policies. Transactions are limited to reading records, inserting records (including multiple records in a single transaction), performing blind writes to records, deleting records, and performing read-modify-write operations on records. Policies can be configured at the transaction level for each of these operations. This removes the need for a higher-level concurrency control mechanism.

Data Model[16][08][17]


Aerospike uses a key-value data model. Keys are mapped onto records. Each record comprises its key, its bins, and its metadata. Bins are analogous to fields in a relational database. Records are organized into sets, and sets are organized into namespaces. Namespaces have configurable storage policies, which dictate whether records are stored on disk or in memory and set the replication factor, among other parameters.
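The namespace, set, record, and bin hierarchy can be modelled with a few Python dataclasses. This is a sketch of the logical model only: the class names, the default values, and the "generation" metadata entry are assumptions for illustration and do not reflect Aerospike's client API.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    bins: dict = field(default_factory=dict)      # analogous to fields
    metadata: dict = field(default_factory=dict)  # hypothetical, e.g. generation


@dataclass
class RecordSet:
    name: str
    records: dict = field(default_factory=dict)

    def put(self, record):
        self.records[record.key] = record


@dataclass
class Namespace:
    name: str
    storage: str = "memory"      # namespace-level storage policy
    replication_factor: int = 2  # namespace-level replication policy
    sets: dict = field(default_factory=dict)

    def get_set(self, set_name):
        """Return the named set, creating it on first use."""
        return self.sets.setdefault(set_name, RecordSet(set_name))


ns = Namespace("test", storage="disk", replication_factor=2)
ns.get_set("users").put(
    Record("u1", bins={"name": "Ada"}, metadata={"generation": 1})
)
found = ns.get_set("users").records["u1"]
```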

Aerospike added support for document data types in 2022 and vector data storage in 2023.

Foreign Keys[18]


Aerospike does not support foreign keys but instead allows users to embed sub-records within records.

Indexes[06][16]


Aerospike implements primary and secondary indexes differently. Primary indexes are an in-memory mix of red-black trees and traditional hash indexes. Consistent hashing maps each record to a particular node. Within a single node, records are indexed by red-black trees that Aerospike calls sprigs. Secondary indexes are built using in-memory B-trees.
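The two-step primary lookup (hash to a sprig, then search an ordered structure) can be sketched as below. The sketch makes several assumptions: SHA-256 stands in for whatever digest Aerospike uses, the sprig count is arbitrary, and a sorted list searched with bisect stands in for a red-black tree, since the Python standard library has none.

```python
import bisect
import hashlib

NUM_SPRIGS = 8  # hypothetical; chosen only for the example


def digest(key):
    """Fixed-size digest of the user key (SHA-256 as a stand-in)."""
    return hashlib.sha256(key.encode("utf-8")).digest()


class Sprig:
    """Ordered digest -> location map; a sorted list replaces the tree."""

    def __init__(self):
        self._digests = []
        self._locations = []

    def insert(self, d, location):
        i = bisect.bisect_left(self._digests, d)
        if i < len(self._digests) and self._digests[i] == d:
            self._locations[i] = location  # overwrite existing entry
        else:
            self._digests.insert(i, d)
            self._locations.insert(i, location)

    def lookup(self, d):
        i = bisect.bisect_left(self._digests, d)
        if i < len(self._digests) and self._digests[i] == d:
            return self._locations[i]
        return None


class PrimaryIndex:
    def __init__(self):
        self._sprigs = [Sprig() for _ in range(NUM_SPRIGS)]

    def _sprig_for(self, d):
        # Hash step: the first digest byte selects the sprig.
        return self._sprigs[d[0] % NUM_SPRIGS]

    def put(self, key, location):
        d = digest(key)
        self._sprig_for(d).insert(d, location)

    def get(self, key):
        d = digest(key)
        return self._sprig_for(d).lookup(d)


index = PrimaryIndex()
index.put("user:1", ("device0", 4096))
index.put("user:2", ("device0", 8192))
index.put("user:1", ("device1", 0))  # update moves the record
```

Splitting the index into many small ordered structures keeps each search short and lets independent lookups touch different sprigs.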

Isolation Levels[08][16]


Aerospike supports multiple isolation levels: strong consistency, linearizability (which is serializable at a record level), and session consistency. In strong consistency, the database ensures that all writes occur in a specific order across all nodes. Strong consistency is only supported on a single record level. Under linearizable consistency, reads and writes appear to be atomic system-wide. However, the additional synchronization costs associated with this can impact performance.

Joins[19]


Logging[10][20][08]


Logging is not supported for recovery. Instead, after a crash, data is restored from local storage if present, from replicas if available, or otherwise from the most recent backup. A configuration option allows users to force restoration from local storage.
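The fallback order described above can be written as a short Python sketch. The function name and the use of None to mark an unavailable source are assumptions for illustration; the sketch does not model the configuration option that forces restoration from local storage.

```python
def choose_recovery_source(local_storage, replicas, latest_backup):
    """Return (source_name, data) following the fallback order in the text."""
    if local_storage is not None:
        return "local", local_storage
    for replica in replicas:
        if replica is not None:
            return "replica", replica
    if latest_backup is not None:
        return "backup", latest_backup
    raise LookupError("no recovery source available")


# Local storage is gone and the first replica is unreachable.
source, data = choose_recovery_source(
    None,
    [None, {"u1": {"name": "Ada"}}],
    {"u1": {"name": "stale"}},
)
```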

Parallel Execution[21][08]


Aerospike allows users to allocate threads to different types of tasks. For scans and secondary index lookups, users can configure a desired number of tuples per second, and the system tries to prioritize operations accordingly. Aerospike also supports using multiple threads for large scans.
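Both ideas, splitting a large scan across threads and pacing it to a target rate, can be sketched in Python. The partitioning of the data, the function names, and the pacing formula are assumptions made for illustration and are not taken from Aerospike's configuration or scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(partition, predicate):
    """Scan one slice of the data and keep the matching records."""
    return [record for record in partition if predicate(record)]


def parallel_scan(partitions, predicate, threads):
    """Scan all partitions with a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in partition order, not completion order.
        parts = pool.map(scan_partition, partitions,
                         [predicate] * len(partitions))
        return [record for part in parts for record in part]


def pacing_delay(records_done, elapsed_seconds, target_rps):
    """Seconds to pause so the average rate stays at or below target_rps."""
    earliest_allowed = records_done / target_rps
    return max(0.0, earliest_allowed - elapsed_seconds)


partitions = [[1, 2, 3], [4, 5, 6], [7, 8]]
evens = parallel_scan(partitions, lambda r: r % 2 == 0, threads=3)
```

A worker that has scanned 1000 records in 0.5 seconds against a target of 1000 records per second would pause for pacing_delay(1000, 0.5, 1000) seconds before continuing.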

Query Execution


Query Interface[08]


Aerospike is accessed through client-side drivers. Every driver follows a standardized set of operations, but each exposes its API in the language of that driver.

Storage Architecture[08]


Policies can be configured to store data and indexes in memory, in shared memory, or on disk.

Storage Model[16][08]


Aerospike uses a hierarchical storage model. The highest level of the storage hierarchy is the namespace. Data storage is configured at the namespace level, including where data is stored (memory or disk) and the replication factor. These controls are set through user-definable namespace-level policies. Below namespaces are sets, which are akin to tables in a relational database and consist of a number of records. Policies can be defined for sets to override the policy of their namespace. Records are stored contiguously in memory or on disk.
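The override relationship between namespace policies and set policies can be sketched as a simple merge. The policy keys and values below are hypothetical and are not Aerospike configuration parameters; the sketch shows only that a set-level value, when present, takes precedence over the namespace-level value.

```python
def effective_policy(namespace_policy, set_policy=None):
    """Start from the namespace policy, then apply set-level overrides."""
    merged = dict(namespace_policy)
    merged.update(set_policy or {})
    return merged


# Hypothetical policy keys for illustration.
namespace_policy = {"storage": "disk", "replication_factor": 2}
sessions_policy = {"storage": "memory"}

sessions_effective = effective_policy(namespace_policy, sessions_policy)
orders_effective = effective_policy(namespace_policy)
```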

Storage Organization[16]


Aerospike uses a copy-on-write storage organization. During a transaction, only the pages that contain modified data are duplicated, which reduces the amount of duplicated data held in memory. Because storage policies are configurable at the namespace level, different namespaces can be stored on different storage media (memory, flash, disk, etc.).
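Copy-on-write at page granularity can be illustrated with a small Python sketch. The CowStore class, the tuple-based pages, and the page size are assumptions for illustration; the sketch shows only that a write duplicates the touched page while untouched pages remain shared.

```python
class CowStore:
    """Pages are immutable tuples; a write replaces only the touched page."""

    def __init__(self, pages):
        self._pages = list(pages)

    def snapshot(self):
        # The new store has its own page table but shares every page object.
        return CowStore(self._pages)

    def write(self, page_no, slot, value):
        page = list(self._pages[page_no])  # duplicate only this page
        page[slot] = value
        self._pages[page_no] = tuple(page)

    def read(self, page_no, slot):
        return self._pages[page_no][slot]

    def shares_page_with(self, other, page_no):
        return self._pages[page_no] is other._pages[page_no]


base = CowStore([("a", "b"), ("c", "d")])
txn = base.snapshot()
txn.write(0, 1, "B")
```

After the write, the transaction's view sees the new value in page 0, the base view still sees the old one, and page 1 is still a single shared object.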

Stored Procedures[22]


Aerospike supports User-Defined Functions (UDFs), which are limited versions of stored procedures. A UDF can be invoked on a single record when called from a client, or on a stream of records. UDFs are written in Lua and have access to different APIs depending on whether they are invoked on a single record or on a stream of records. UDFs are deployed to all nodes from the primary (which Aerospike calls a principal) so that the same version of each UDF runs on every node.

System Architecture[16]


Aerospike uses consistent hashing to distribute data across nodes. Distribution is done at the namespace level (see Storage Model), with the records within sets being distributed across nodes. Replicas are placed using the same consistent hashing method and are never stored on the same node as the original copy.
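A generic consistent-hashing ring that places replicas on distinct nodes can be sketched as follows. This is a textbook hash ring, not Aerospike's actual placement algorithm: the SHA-256 hash, the number of virtual nodes, and the HashRing class are all assumptions for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit position on the ring."""
    raw = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(raw[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node appears at several ring positions to even out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def owners(self, key, replication_factor):
        """Walk clockwise from the key, collecting distinct nodes."""
        wanted = min(replication_factor, self._node_count)
        i = bisect.bisect_right(self._points, _hash(key))
        owners = []
        while len(owners) < wanted:
            node = self._ring[i % len(self._ring)][1]
            if node not in owners:  # never place two copies on one node
                owners.append(node)
            i += 1
        return owners


ring = HashRing(["node-a", "node-b", "node-c"])
placement = ring.owners("user:42", replication_factor=2)
```

The first node in the returned list holds the original copy and the remaining nodes hold replicas. Because placement depends only on the key and the node list, every client computes the same owners without coordination.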

Views


Citations

23 sources
  1. Aerospike | Aerospike aerospike.com
  2. GitHub - aerospike/aerospike-server: Aerospike Database Server – flash-optimized, in-memory, nosql database · GitHub github.com
  3. Aerospike Documentation | Aerospike Documentation aerospike.com
  4. Aerospike (database) - Wikipedia wikipedia.org
  5. Aerospike, the former Citrusleaf | DBMS 2 : DataBase Management System Services dbms2.com Dead — Check Archive
  6. Introduction to Citrusleaf | DBMS 2 : DataBase Management System Services dbms2.com
  7. Aerospike: Thanks for that $20m, VCs ... next we'll OPEN SOURCE our NoSQL database theregister.com
  8. https://discuss.aerospike.com/t/re-general-questions-about-aerospike/6923/3 aerospike.com Dead — Check Archive
  9. Legacy Aerospike backup tool (asbackup) | Aerospike Documentation aerospike.com
  10. Backup and restore overview | Aerospike Documentation aerospike.com
  11. Legacy Aerospike restore tool (asrestore) | Aerospike Documentation aerospike.com
  12. Planning | Aerospike Documentation aerospike.com
  13. https://aerospike.com/docs/client/java/usage/kvs/write.html aerospike.com Dead — Check Archive
  14. https://discuss.aerospike.com/t/faq-what-are-the-theories-for-tsvc-timeout/5265 aerospike.com Dead — Check Archive
  15. https://discuss.aerospike.com/t/hot-key-error-code-14/986 aerospike.com Dead — Check Archive
  16. Architecture overview | Aerospike Documentation aerospike.com
  17. Distributed NoSQL database Aerospike adds support for JSON theregister.com
  18. https://aerospike.com/blog/embedding-linking-denormalization aerospike.com Dead — Check Archive
  19. Simulate a join - How Developers Are Using Aerospike / Aerospike and other Databases - Aerospike Community Forum aerospike.com Dead — Check Archive
  20. Backup and Recovery in AWS | Aerospike Documentation aerospike.com
  21. Configuration reference | Aerospike Documentation aerospike.com
  22. Architecture overview | Aerospike Documentation aerospike.com Dead — Check Archive
  23. https://github.com/aerospike/aerospike-server/commit/62ed0354f3b44f288050333024780d3aa3e277b4 github.com
Revision #21 Last Updated: