RavenDB

RavenDB is a fully transactional, ACID-compliant NoSQL database developed by Hibernating Rhinos. It is designed to be deployed as a distributed cluster that can combine on-premise and cloud deployments. RavenDB provides auto-indexing alongside full-text search to speed up queries. It also provides a range of clients out of the box, storage-layer encryption, HTTP network support, and the Raven Query Language (RQL, which resembles SQL) for database queries.

History

RavenDB started as a largely independent project by Ayende@Rahien and was originally named Rhino Divan DB before eventually being renamed RavenDB. Its design originated from Ayende's reading of the CouchDB source code and focused on providing a REST-based document store for the .NET ecosystem.

System Architecture

Shared-Nothing

RavenDB does not explicitly state which distributed architecture the system is built for. However, the documentation implies shared-nothing through its notion of distinct nodes that replicate to each other over the network. In a clustered environment, each RavenDB node is a full-fledged node that uses Rachis (based on Raft) for consensus, with master-to-master replication and no support for sharding. As such, RavenDB was likely intended to be used primarily as a shared-nothing architecture, as indicated by the primary developer.

Storage Model

Custom

Indexes and raw data are stored in separate .voron files, which are individually mapped into main memory. Raw data blocks are managed as 8 KB pages that support random I/O, with optimizations such as prefetching employed for performance. RavenDB stores documents larger than 8 KB in consecutive 8 KB pages that the rest of the system treats logically as a single page. RavenDB also ensures that attachments, which are binary additions to the normal JSON data, carry the same ACID semantics as regular transactional documents and are guaranteed to be stored on the same storage as their containing documents.
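The page arithmetic described above can be sketched as follows. This is a minimal illustration, not Voron's implementation: the function name `pages_needed` is hypothetical, and the sketch assumes the full 8 KB of each page is available for document data, ignoring any per-page header overhead a real engine would have.

```python
import math

PAGE_SIZE = 8 * 1024  # 8 KB page size, as stated in the text


def pages_needed(document_bytes: int) -> int:
    """Return how many consecutive 8 KB pages a document of this size would span.

    Assumption: every byte of a page is usable for document data.
    """
    if document_bytes <= 0:
        raise ValueError("document size must be positive")
    return math.ceil(document_bytes / PAGE_SIZE)
```

Under these assumptions, a document of exactly 8192 bytes fits in one page, while one of 8193 bytes spills into a second consecutive page.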

Query Interface

Custom API, HTTP / REST

RavenDB’s query language is RQL, which resembles SQL but is designed specifically for RavenDB. Because RavenDB uses REST over HTTP for communication between clients and server nodes, queries are transmitted as HTTP calls, which can exploit HTTP cache semantics for document loading. The query interface lets the user either name a specific static index (an index created explicitly by the user) or let the query optimizer search for the best index, optionally creating and persisting a new index and dropping old indexes that the new one covers.
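The index selection behavior described above can be modeled with a small toy. This is a sketch under stated assumptions, not RavenDB's optimizer: the class `ToyQueryOptimizer`, its method names, and the `auto/by_...` naming scheme are all hypothetical, and an index is reduced to the set of fields it covers.

```python
from typing import Dict, FrozenSet, Iterable, Optional


class ToyQueryOptimizer:
    """Toy model of index selection; an index is just a set of covered fields."""

    def __init__(self) -> None:
        self.static_indexes: Dict[str, FrozenSet[str]] = {}  # created by the user
        self.auto_indexes: Dict[str, FrozenSet[str]] = {}    # created on demand

    def choose_index(self, fields: Iterable[str],
                     static_index: Optional[str] = None) -> str:
        wanted = frozenset(fields)
        # Case 1: the query names a static index explicitly.
        if static_index is not None:
            if static_index not in self.static_indexes:
                raise KeyError(f"unknown static index: {static_index}")
            return static_index
        # Case 2: reuse an existing auto index that covers every queried field.
        for name, indexed in self.auto_indexes.items():
            if wanted <= indexed:
                return name
        # Case 3: create a new auto index and drop the old ones it covers.
        covered = [n for n, f in self.auto_indexes.items() if f <= wanted]
        for name in covered:
            del self.auto_indexes[name]
        new_name = "auto/by_" + "_and_".join(sorted(wanted))  # hypothetical naming
        self.auto_indexes[new_name] = wanted
        return new_name
```

In this toy, querying on `name` and then on `name` plus `age` leaves a single auto index covering both fields, and a later query on `age` alone reuses it.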

Foreign Keys

Not Supported

RavenDB does not have foreign keys or constraints resembling foreign keys. Although RavenDB’s Includes feature resembles the concept of foreign keys, RavenDB states explicitly that an Include is not a foreign key. RavenDB is built on the design principle that internal consistency should be maintained within the document itself and that external references should not be constrained in lifetime.

Joins

Not Supported

RavenDB does not support joins out of the box. It does support the Includes operation, but an Include is not a join; it simply fetches the related document. Joins can nevertheless be emulated, either by grouping related data together during the indexing process or by using the projection support added in RavenDB 4.x.
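The difference between an Include and a join can be illustrated with a toy in-memory store. This is only a sketch: `load_with_include`, the sample documents, and the `company` field are hypothetical and do not reflect the RavenDB client API.

```python
from typing import Dict, Tuple

# Hypothetical sample data: an order that references a company by id.
DOCUMENTS: Dict[str, dict] = {
    "orders/1": {"company": "companies/1", "total": 42},
    "companies/1": {"name": "Acme"},
}


def load_with_include(store: Dict[str, dict], doc_id: str,
                      include_path: str) -> Tuple[dict, Dict[str, dict]]:
    """Return the requested document plus the document it references.

    The two documents come back side by side in one call; they are not
    merged into a single row, which is what a join would produce.
    """
    doc = store[doc_id]
    included: Dict[str, dict] = {}
    referenced_id = doc.get(include_path)
    if referenced_id is not None and referenced_id in store:
        included[referenced_id] = store[referenced_id]
    return doc, included
```

The order document is returned unchanged, still holding only the id `companies/1`, and the company arrives as a separate document keyed by that id.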

Indexes

B+Tree, Inverted Index (Full Text)

RavenDB indexes are B+trees of two kinds: regular trees with variable-size keys and values, and fixed-size trees in which keys are Int64 and the value size is fixed at creation time. These index structures are provided by Voron, RavenDB’s custom storage engine. According to the documentation, a table is composed of raw data sections and indexes, which under the hood are regular or fixed-size B+trees. To handle full-text search, RavenDB couples Lucene’s indexing engine (which provides the inverted index for full-text search) with an additional Voron layer to provide ACID guarantees.
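For readers unfamiliar with the term, the idea of an inverted index can be sketched in a few lines. This is a generic illustration, not Lucene's implementation: the tokenizer is a simple lowercase split on letters and digits, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str) -> list:
    """Split text into lowercase alphanumeric terms (a deliberately naive analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each term to the set of document ids whose text contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A query is answered by intersecting the posting sets of its terms, so no document text is scanned at query time.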

Isolation Levels

Serializable Snapshot Isolation

RavenDB supports two isolation levels, depending on whether or not the transaction is cluster-wide. A single-node transaction executes all of its operations at the snapshot isolation level, with all document state retrieved as it was at the beginning of the HTTP request containing the requested operations. Concurrent cluster-wide transactions execute at the serializable isolation level, so that they appear to have run in a sequential order. However, if a cluster-wide transaction and a non-cluster transaction run concurrently, the cluster-wide transaction takes precedence over the modifications of the non-cluster transaction.

Query Compilation

Not Supported

RavenDB does not use code generation of any form. Inspection of the query execution pipeline in RavenDB’s source code shows that RavenDB executes queries iteratively: it picks the correct index, creates an index if necessary, and then performs a lookup in that index.

Storage Architecture

Disk-oriented

RavenDB’s storage engine, Voron, stores all data files on disk, including all raw data blocks, data block indexes, and extra indexes. Voron uses memory-mapped files to bring disk-resident data into main memory.

Concurrency Control

Timestamp Ordering

By default, RavenDB uses a “Last Writer Wins” concurrency control scheme: all conflicting write operations are executed, and the last one wins. In addition, RavenDB provides optimistic concurrency support at three levels: per DocumentStore, per session, and per operation. RavenDB's optimistic concurrency scheme does not check for conflicts on data that is only read by the transaction. Instead, the scheme only checks for conflicting writes and ensures ordering of transactions with regard to writes.
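The contrast between the default behavior and an optimistic write check can be sketched with a toy store. This is an illustration under stated assumptions: `ToyStore`, `ConcurrencyError`, and the integer version counter are hypothetical stand-ins for whatever version token the real server compares.

```python
from typing import Dict, Optional, Tuple


class ConcurrencyError(Exception):
    """Raised when a write was based on a stale version of the document."""


class ToyStore:
    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[int, dict]] = {}  # id -> (version, data)

    def load(self, doc_id: str) -> Tuple[int, dict]:
        return self._docs[doc_id]

    def put(self, doc_id: str, data: dict,
            expected_version: Optional[int] = None) -> int:
        """Write a document and return its new version.

        With expected_version=None the write always succeeds (last writer wins).
        Otherwise the write is rejected if another write happened in between.
        """
        current_version = self._docs.get(doc_id, (0, {}))[0]
        if expected_version is not None and expected_version != current_version:
            raise ConcurrencyError(
                f"{doc_id}: expected version {expected_version}, "
                f"found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, data)
        return new_version
```

Note that only writes are checked: `load` never records what was read, which mirrors the statement above that read-only data is not validated.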

RavenDB does not handle distributed transactions. Instead, transactions are executed against a single node and then replicated to other nodes as a separate batch transaction, with relevant conflict-resolution policies applied where needed.

Stored Procedures

Not Supported

RavenDB does not support stored procedures as most relational DBMSs define them (e.g., SQL Server, Oracle). However, since RavenDB queries rely solely on indexes rather than sequential scans, it is possible to emulate query-type stored procedures with a two-step process: (1) create RavenDB indexes that leverage projections, which can evaluate expressions, retrieve other data, and execute raw JavaScript/C# code, among other things; (2) execute a RavenDB query that relies on the newly created index.

Views

Virtual Views, Materialized Views

RavenDB supports materialized views. A common technique would be creating a RavenDB index that can transform various documents into another document (the view) which can then be quickly queried at a later time. The technique presented by RavenDB effectively produces a materialized view since the indexing process will generate and persist the new document at indexing time rather than generate and transmit to the client on-demand without disk persistence.

RavenDB also supports the idea of virtual views. A virtual view is created through a server-side projection, a function that the client must be aware of when making the initial query request. Since server-side projection transformers create these new documents (the view) on demand, the result can be thought of as a virtual view.

As a single view object can essentially be treated as a document, RavenDB supports writing and persisting view objects. In addition, RavenDB provides the Changes API (similar to triggers) to assist with updating stale view documents automatically when their underlying documents change.

Storage Organization

Indexed Sequential Access Method (ISAM)

JSON documents are stored in raw data blocks on disk by RavenDB’s Voron storage manager. Each raw data block on disk is associated with an identifier. Voron maintains internal “Voron indexes”, which are not controlled by the user, to map quickly from a given identifier to the actual raw data block. The Voron indexes allow O(1) access to specific raw data blocks rather than requiring scans or relying on sorted properties.

Logging

Physical Logging

RavenDB logs transaction data to a Write Ahead Journal (equivalent to a WAL). RavenDB employs physical logging, in which all modifications made by a transaction are written to the journal file using unbuffered I/O with write-through. RavenDB also provides optional encryption of the logged data for security purposes.
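Physical logging, as opposed to logging logical operations, records the resulting page contents themselves. The toy below sketches that idea entirely in memory. It is not Voron's journal format: `ToyJournaledFile` and its methods are hypothetical, a Python list stands in for the journal file, and no real unbuffered or write-through I/O is performed.

```python
from typing import Dict, List

PAGE_SIZE = 8 * 1024  # same 8 KB page size the article mentions


class ToyJournaledFile:
    def __init__(self, page_count: int) -> None:
        self.data = bytearray(PAGE_SIZE * page_count)  # stands in for the data file
        self.journal: List[Dict[int, bytes]] = []       # stands in for the journal

    def commit(self, page_writes: Dict[int, bytes]) -> None:
        """Record full page images in the journal; the data file is not touched."""
        for page_no, image in page_writes.items():
            if len(image) != PAGE_SIZE:
                raise ValueError(f"page {page_no}: image must be {PAGE_SIZE} bytes")
        self.journal.append(dict(page_writes))

    def apply_journal(self) -> None:
        """Replay journaled page images into the data file, oldest first."""
        for transaction in self.journal:
            for page_no, image in transaction.items():
                start = page_no * PAGE_SIZE
                self.data[start:start + PAGE_SIZE] = image
        self.journal.clear()
```

Because the journal holds complete page images, replaying it after a crash needs no knowledge of what the transaction was doing; it only copies bytes into place.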

Query Execution

Materialized Model

RavenDB’s query execution is purely index-based: the execution engine either finds an index that can answer the query or creates one. As such, all relevant documents are pulled into memory when the relevant information is retrieved from the index. This also allows server-side projections (transformers) to operate on all the visible document data, mutating it as necessary.

As a performance optimization, RavenDB also supports streaming, whereby each result is written to the network stream as soon as it is found during the index lookup. In an extended client-server interaction, this resembles a document-at-a-time processing model in which individual documents are pushed from the server to the client as they become available, rather than the client waiting for the entire collection.
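The difference between the materialized model and streaming can be sketched with a list versus a generator. This is a generic illustration with hypothetical function names; the list of index hits and the dictionary store stand in for the index lookup and the document storage.

```python
from typing import Dict, Iterable, Iterator, List


def query_materialized(index_hits: Iterable[str],
                       store: Dict[str, dict]) -> List[dict]:
    """Collect every matching document in memory before returning anything."""
    return [store[doc_id] for doc_id in index_hits]


def query_streaming(index_hits: Iterable[str],
                    store: Dict[str, dict]) -> Iterator[dict]:
    """Yield each matching document as soon as the index lookup produces it."""
    for doc_id in index_hits:
        yield store[doc_id]
```

Both functions return the same documents in the same order; the streaming version simply lets the consumer start processing the first document before the last one has been looked up.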

Compression

Naïve (Page-Level)

Documents are stored compressed; however, the specific compression algorithm used is not discussed. RavenDB 3.x used naive compression, in which the entire document is stored in compressed form. In RavenDB 4.0 and onwards, individual fields within each document are selectively compressed, which allows the engine to work with certain parts of a document without first decompressing the whole document.
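The two strategies can be contrasted with a short sketch. Since the source does not name the algorithm, `zlib` is used here purely as a stand-in, and the function names and the 64-byte threshold for deciding which fields to compress are assumptions made for illustration.

```python
import json
import zlib
from typing import Any, Dict, Tuple


def compress_whole(doc: dict) -> bytes:
    """Whole-document compression: any read must decompress everything."""
    return zlib.compress(json.dumps(doc).encode("utf-8"))


def compress_fields(doc: dict, min_size: int = 64) -> Dict[str, Tuple[bool, bytes]]:
    """Per-field compression: only fields of at least min_size bytes are compressed.

    Each field is stored as (is_compressed, payload).
    """
    stored: Dict[str, Tuple[bool, bytes]] = {}
    for name, value in doc.items():
        raw = json.dumps(value).encode("utf-8")
        if len(raw) >= min_size:
            stored[name] = (True, zlib.compress(raw))
        else:
            stored[name] = (False, raw)
    return stored


def read_field(stored: Dict[str, Tuple[bool, bytes]], name: str) -> Any:
    """Read one field, decompressing only that field if needed."""
    is_compressed, payload = stored[name]
    raw = zlib.decompress(payload) if is_compressed else payload
    return json.loads(raw.decode("utf-8"))
```

With the per-field layout, reading a short field such as an id touches no compressed data at all, whereas the whole-document layout pays the decompression cost on every access.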

Data Model

Key/Value, Document / XML

RavenDB is primarily a document-based database, with the strict requirement that documents be formatted as JSON. However, RavenDB can also be used as a key/value store through the Compare Exchange interface, which performs cluster-wide interlocked distributed operations for updating key/value pairs.
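The interlocked update that a compare-exchange interface offers can be sketched on a single process. This toy is not the RavenDB API and performs no cluster consensus: `ToyCompareExchange` is hypothetical, and the convention that an expected index of 0 means "create only if the key does not exist" is an assumption of this sketch.

```python
from typing import Any, Dict, Optional, Tuple


class ToyCompareExchange:
    def __init__(self) -> None:
        self._items: Dict[str, Tuple[int, Any]] = {}  # key -> (index, value)
        self._next_index = 1

    def get(self, key: str) -> Optional[Tuple[int, Any]]:
        return self._items.get(key)

    def put(self, key: str, value: Any, expected_index: int) -> Tuple[bool, int]:
        """Store value only if the key's current index equals expected_index.

        Returns (succeeded, index): the new index on success, or the current
        index on failure so the caller can reload and retry.
        """
        current_index = self._items.get(key, (0, None))[0]
        if expected_index != current_index:
            return False, current_index
        index = self._next_index
        self._next_index += 1
        self._items[key] = (index, value)
        return True, index
```

Two clients that both try to create the same key with an expected index of 0 cannot both succeed, which is what makes the primitive usable for tasks such as reserving a unique name.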

Website

http://ravendb.net/

Source Code

https://github.com/ravendb/ravendb

Tech Docs

https://ravendb.net/docs/article-page/4.0/csharp

Developer

Hibernating Rhinos

Country of Origin

IL

Project Type

Commercial, Open Source

Written in

C#

Operating Systems

Linux, OS X, Windows