OrientDB is a multi-model NoSQL DBMS that supports graph, document, key-value, and object-oriented storage. Rather than adding a separate API layer for each model, OrientDB integrates the models in its core engine. It supports both disk-oriented and in-memory storage. Its query language is SQL with a few differences from the standard, extended to support graph concepts. OrientDB is ACID compliant, can handle transactional workloads, and supports a multi-master distributed architecture.
OrientDB was originally developed by Luca Garulli in 2010. Luca rewrote the fast persistence layer of Orient ODBMS in Java as OrientDB. Since 2012, OrientDB has been sponsored by OrientDB LTD, a for-profit company formerly called Orient Technologies LTD, whose founder and CEO is Luca. Andrey Lomakin redeveloped OrientDB's storage engine, called plocal, from 2012 to 2014. In 2013, Andrey joined OrientDB LTD as co-owner and head of its R&D department. On September 19, 2017, Callidus Software Inc. (NASDAQ:CALD), doing business as CallidusCloud, acquired OrientDB LTD.
N-ary Storage Model (Row/Record)
OrientDB uses the page as its basic unit for storing records, which is essentially an N-ary storage model. Records are usually stored across two kinds of pages. The first kind stores metadata about records, including RIDs and pointers to the actual content; each entry has a fixed size. The second kind stores the actual content of the records. Each record is stored as key/value pairs.
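The two kinds of pages can be pictured with a small Python sketch. It is a toy model under stated assumptions, not OrientDB's on-disk format: the `Cluster` and `PositionEntry` names, the `page_capacity` parameter, and the flattening of the metadata pages into one list are all hypothetical, and the RID is written as `#<cluster id>:<position>`.

```python
from dataclasses import dataclass, field


@dataclass
class PositionEntry:
    """Fixed-size metadata entry: points at the record's content."""
    page_index: int
    slot: int


@dataclass
class Cluster:
    """Toy cluster: a position map (metadata) plus data pages (content)."""
    cluster_id: int
    page_capacity: int = 4  # records per data page (hypothetical)
    position_map: list = field(default_factory=list)
    data_pages: list = field(default_factory=lambda: [[]])

    def insert(self, record: dict) -> str:
        # Start a new data page when the last one is full.
        if len(self.data_pages[-1]) >= self.page_capacity:
            self.data_pages.append([])
        page = self.data_pages[-1]
        page.append(dict(record))  # content kept as key/value pairs
        entry = PositionEntry(len(self.data_pages) - 1, len(page) - 1)
        self.position_map.append(entry)
        # The RID is the cluster id plus the position-map index.
        return f"#{self.cluster_id}:{len(self.position_map) - 1}"

    def read(self, rid: str) -> dict:
        cluster_id, position = (int(p) for p in rid.lstrip("#").split(":"))
        if cluster_id != self.cluster_id:
            raise KeyError(rid)
        # First hop: metadata entry. Second hop: the actual content.
        entry = self.position_map[position]
        return self.data_pages[entry.page_index][entry.slot]
```

Because every metadata entry has the same size, a RID can be turned into a position-map offset directly, and only the second hop touches the variable-size record content.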
Key/Value Document / XML Graph Object-Oriented
OrientDB is a multi-model DBMS. It supports the graph, document, key-value, and object-oriented models, and combines the features of all four in its core engine rather than implementing an additional layer of APIs for each model. The graph model represents a network structure in which vertices represent entities and edges show the connections among vertices. Besides the mandatory properties that define vertices and edges, OrientDB allows user-defined properties on both, which makes them resemble documents. For the document model, OrientDB introduces the concept of a "LINK" as the relationship between documents. When users refer to a document, OrientDB resolves all of its "LINK"s automatically, whereas in common document DBMSs this work is left to developers. The key-value model is the simplest of the four. OrientDB organizes key-value pairs much as other key-value stores do, but it supports richer value types: graph elements and documents can be stored as values. The object-oriented model is derived from object-oriented programming. OrientDB uses object-oriented concepts directly to define records, and it supports inheritance and polymorphism.
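To illustrate what automatic "LINK" resolution means for the document model, here is a minimal Python sketch. It assumes documents live in a dictionary keyed by RID; the `Link` class, the `resolve` function, and the `depth` parameter are hypothetical names for illustration and are not OrientDB's API.

```python
class Link:
    """Reference from one document to another, stored as a RID."""

    def __init__(self, rid):
        self.rid = rid


def resolve(store, rid, depth=1):
    """Load a document and replace its links with the linked documents.

    `depth` bounds how many levels of links are followed; links beyond
    that depth are returned as bare RIDs.
    """
    resolved = {}
    for key, value in store[rid].items():
        if isinstance(value, Link):
            if depth > 0:
                resolved[key] = resolve(store, value.rid, depth - 1)
            else:
                resolved[key] = value.rid
        else:
            resolved[key] = value
    return resolved


store = {
    "#10:0": {"name": "Alice", "city": Link("#11:0")},
    "#11:0": {"name": "Rome"},
}
person = resolve(store, "#10:0")
```

In a document store without links, the application would have to issue the second lookup for `#11:0` itself; here the loader follows the reference on the caller's behalf.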
Tuple-at-a-Time Model Vectorized Model
OrientDB was originally designed around the iterator model. However, some of its fetching strategies follow the vectorized model: certain components in an execution plan pre-fetch a set of records in a single call and then process them as a batch. This pattern can be considered a vectorized model.
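The difference between the two styles can be sketched with Python generators. This is an illustration of the general pattern only; the function names and the `batch_size` parameter are hypothetical and do not correspond to OrientDB's execution steps.

```python
from itertools import islice


def scan(records):
    """Tuple-at-a-time: each next() call yields a single record."""
    for record in records:
        yield record


def prefetch(source, batch_size):
    """Pull up to `batch_size` records from upstream in one call."""
    iterator = iter(source)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def filter_batches(batches, predicate):
    """Batch processing: apply the predicate to a whole batch at once."""
    for batch in batches:
        kept = [record for record in batch if predicate(record)]
        if kept:
            yield kept


batches = list(prefetch(scan(range(7)), batch_size=3))
even = list(filter_batches(prefetch(scan(range(7)), 3), lambda r: r % 2 == 0))
```

The downstream step makes one call per batch instead of one call per record, which is the property that makes the pre-fetching steps resemble a vectorized executor.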
OrientDB supports a multi-master, shared-nothing distributed architecture built on the open-source Hazelcast project. It integrates Hazelcast to manage the lifecycle of every node in the distributed system, and it also uses a Hazelcast plugin for distributed configuration.
Read Committed Repeatable Read
OrientDB supports two isolation levels: Read Committed and Repeatable Read. The default is Read Committed, which is also the only level available when transactions run against remote databases. Repeatable Read is allowed only when transactions run against local databases, and it consumes more memory than Read Committed. Users can change the isolation level through the Java API.
OrientDB supports record-level compression with two built-in algorithms: gzip and snappy. By default, no compression is applied. Users can choose the compression method using SQL syntax or in the storage engine configuration, and they can also define custom compression algorithms. Records are decompressed when they are loaded from the storage engine.
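The idea of compressing each record individually and decompressing it on load can be sketched in Python with the standard `gzip` module. The `RecordStore` class, its option names (`"none"`, `"gzip"`), and the JSON serialization are assumptions made for illustration; they are not OrientDB's configuration keys or record format.

```python
import gzip
import json


class RecordStore:
    """Toy store that compresses each record on its own (record level)."""

    def __init__(self, compression="none"):
        if compression not in ("none", "gzip"):
            raise ValueError(f"unsupported compression: {compression}")
        self.compression = compression
        self._stored = {}  # rid -> bytes as they would sit in storage

    def save(self, rid, record):
        raw = json.dumps(record, sort_keys=True).encode("utf-8")
        if self.compression == "gzip":
            raw = gzip.compress(raw)
        self._stored[rid] = raw

    def load(self, rid):
        raw = self._stored[rid]
        # Decompression happens when the record is loaded.
        if self.compression == "gzip":
            raw = gzip.decompress(raw)
        return json.loads(raw)

    def stored_size(self, rid):
        return len(self._stored[rid])
```

Because each record is compressed separately, a single record can be read without decompressing its neighbours, at the cost of a lower compression ratio than compressing whole pages.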
OrientDB supports materialized views in the latest version. Views are created and dropped using SQL syntax. A materialized view can be configured as read-only or updatable; the default is read-only. Users can define an update interval so that a view is refreshed periodically. Users can also modify an updatable view manually, and the modification is reflected in the corresponding records. Updatable views cannot be created from aggregations.
OrientDB supports full checkpointing. A full checkpoint is a simple disk cache flush: all the content of the disk cache is written to disk. It can be invoked when a cluster is added to the storage, when a cluster changes, or when the storage closes. Users can set timestamps for performing full checkpoints in those scenarios when configuring the storage engine.
B+Tree Hash Table Inverted Index (Full Text)
OrientDB supports five index algorithms, which belong to three categories. It also allows users to define custom index engines by implementing specific classes.

SB-Tree index: The SB-tree index is a variant of the B-tree index with optimizations for data insertion and long range queries. It is the default index type in OrientDB.

Hash index: OrientDB supports two hash index algorithms: the regular hash index and the auto-sharding index, which is an implementation of a distributed hash table based on the Murmur3 hash function. Both apply the extendible hashing algorithm and do not support range queries.

Lucene engine: Apache Lucene Core is an implementation of the inverted index. OrientDB provides full-text and spatial indexes through the Lucene engine.

OrientDB manages indexes through SQL syntax, using a specific prefix to refer to indexes. It has two methods for updating indexes, automatic and manual; the default is manual. When creating an index, users specify the index type and the relevant classes. To have an index updated automatically, users must state that explicitly when creating it.
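Extendible hashing, which the hash indexes are described as applying, can be sketched in a few dozen lines of Python. This is a generic textbook version under stated assumptions, not OrientDB's implementation: the class names and `bucket_capacity` are hypothetical, and Python's built-in `hash` stands in for Murmur3.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.items = {}


class ExtendibleHash:
    """Directory of 2**global_depth slots, each pointing at a bucket."""

    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        # Use the low `global_depth` bits of the hash as the slot number.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)  # bucket is full: split and retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; slot i and slot i + old_size share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        new_bit = 1 << (bucket.local_depth - 1)
        # Slots whose newly significant bit is set move to the sibling.
        for i, current in enumerate(self.directory):
            if current is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old entries between the bucket and its sibling.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v
```

Only the overflowing bucket is split, so growth is incremental. Keys are placed by hash bits rather than by key order, which is why an index of this kind cannot answer range queries.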
Multi-version Concurrency Control (MVCC)
OrientDB applies Multi-version Concurrency Control (MVCC) and checks integrity at commit time. The approach is optimistic; OrientDB does not support pessimistic transactions. When a transaction conflicts with another, OrientDB throws an exception and the application decides whether to abort the transaction. For graphs, OrientDB provides three consistency modes. The first mode, which is the default, maintains consistency using transactions. The other two do not use transactions and instead rely on a database repair operation: one runs the repair synchronously with respect to the application, and the other runs it asynchronously.
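The optimistic check at commit time can be sketched as a version comparison. The following Python example is a simplified, single-record illustration; `VersionedStore`, `ConcurrentModificationError`, and `update_with_retry` are hypothetical names, and the retry loop is one possible application-side policy rather than something the database does.

```python
class ConcurrentModificationError(Exception):
    """Raised when a record changed since the transaction read it."""


class VersionedStore:
    def __init__(self):
        self._records = {}  # rid -> (version, value)

    def create(self, rid, value):
        self._records[rid] = (0, value)

    def read(self, rid):
        return self._records[rid]

    def commit(self, rid, expected_version, new_value):
        current_version, _ = self._records[rid]
        # Optimistic check: fail if someone else committed in between.
        if current_version != expected_version:
            raise ConcurrentModificationError(
                f"{rid}: expected v{expected_version}, found v{current_version}"
            )
        self._records[rid] = (current_version + 1, new_value)


def update_with_retry(store, rid, change, max_retries=3):
    """Application-side policy: re-read and retry on conflict."""
    for _ in range(max_retries):
        version, value = store.read(rid)
        try:
            store.commit(rid, version, change(value))
            return True
        except ConcurrentModificationError:
            continue
    return False
```

No locks are taken while the transaction works on its copy; the cost of a conflict is paid only at commit, when the stale writer receives the exception.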
The query execution planner in OrientDB generates an execution plan made up of pre-defined query steps, which are components written in Java. OrientDB therefore relies on the ordinary JIT compilation of the JVM rather than generating code per query. In addition, query execution plans are cached so that the plan for a repeated query does not have to be recomputed.
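Plan caching can be illustrated with a small Python sketch keyed on the query text. Everything here is an assumption for illustration: the `plan_query` planner, its step names, the `capacity` parameter, and the least-recently-used eviction policy are hypothetical, since the text above does not describe how OrientDB's cache is organized.

```python
from collections import OrderedDict


def plan_query(query):
    """Hypothetical planner: builds a plan from pre-defined step names."""
    steps = ["fetch"]
    if " where " in query.lower():
        steps.append("filter")
    steps.append("project")
    return steps


class PlanCache:
    """Caches plans by query text, evicting the least recently used."""

    def __init__(self, planner, capacity=2):
        self.planner = planner
        self.capacity = capacity
        self.hits = 0
        self.misses = 0
        self._plans = OrderedDict()

    def get(self, query):
        if query in self._plans:
            self.hits += 1
            self._plans.move_to_end(query)  # mark as recently used
            return self._plans[query]
        self.misses += 1
        plan = self.planner(query)
        self._plans[query] = plan
        if len(self._plans) > self.capacity:
            self._plans.popitem(last=False)  # drop the oldest entry
        return plan
```

The second request for the same query text returns the stored plan without invoking the planner again, which is the saving the cache is meant to provide.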
SQL Stored Procedures GraphQL Gremlin HTTP / REST
OrientDB uses SQL as its query language, with extensions to support graph functionality. Its syntax differs from standard SQL in some respects; for example, it does not support joins or the HAVING keyword. OrientDB also has its own concept similar to the stored procedures of an RDBMS.
https://github.com/orientechnologies/orientdb
OrientDB LTD
2010
CallidusCloud
C, C#, C++, Clojure, Elixir, Go, Groovy, Java, JavaScript, Perl, PHP, Python, Ruby, Scala