CouchDB

CouchDB ("cluster of unreliable commodity hardware") is a document-oriented NoSQL DBMS.

Storage Organization

Copy-on-Write / Shadow Paging

CouchDB implements an append-only B+Tree and uses a copy-on-write method to update both the database file and the index.
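The append-only, copy-on-write idea can be illustrated with a toy in-memory model. This is a minimal sketch, not CouchDB's actual B+Tree code: the class `AppendOnlyStore`, its methods, and the single-level "root" record are hypothetical simplifications. Every update appends a new leaf record and a new root record instead of overwriting anything, so older roots remain readable.

```python
class AppendOnlyStore:
    """Toy append-only store: records are appended, never overwritten."""

    def __init__(self):
        self.log = []      # stands in for the database file
        self.root = None   # offset of the most recent root record

    def _append(self, record):
        self.log.append(record)
        return len(self.log) - 1  # "file offset" of the new record

    def put(self, key, value):
        # Append the new leaf first, then a fresh root that points at it.
        leaf_offset = self._append(("leaf", key, value))
        old_index = {} if self.root is None else self.log[self.root][1]
        new_index = dict(old_index)  # copy: the old root is left untouched
        new_index[key] = leaf_offset
        self.root = self._append(("root", new_index))
        return self.root

    def get(self, key, root=None):
        # Read through the given root, or through the latest one by default.
        root = self.root if root is None else root
        if root is None:
            raise KeyError(key)
        index = self.log[root][1]
        return self.log[index[key]][2]


store = AppendOnlyStore()
first_root = store.put("doc", "v1")
store.put("doc", "v2")
latest = store.get("doc")                    # read via the newest root
older = store.get("doc", root=first_root)    # read via the earlier root
```

Because the first root still points at the first leaf, a reader holding `first_root` keeps seeing the earlier value even after the second update.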

Concurrency Control

Multi-version Concurrency Control (MVCC)

CouchDB uses MVCC as its concurrency control policy for read operations: each client sees a consistent snapshot of the database for the duration of its read.
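A minimal sketch of snapshot reads under MVCC, assuming a simplified model in which every write receives an increasing sequence number and a reader remembers the sequence number at which it started. The names `VersionedStore`, `snapshot`, and `read` are hypothetical and are not part of CouchDB.

```python
class VersionedStore:
    """Toy multi-version store keyed by document id."""

    def __init__(self):
        self.seq = 0
        self.versions = {}  # doc_id -> list of (seq, value), oldest first

    def write(self, doc_id, value):
        # Writers never modify old versions; they add a new one.
        self.seq += 1
        self.versions.setdefault(doc_id, []).append((self.seq, value))
        return self.seq

    def snapshot(self):
        # A reader records the current sequence number when it starts.
        return self.seq

    def read(self, doc_id, snapshot):
        # Return the newest version that existed when the snapshot was taken.
        for seq, value in reversed(self.versions.get(doc_id, [])):
            if seq <= snapshot:
                return value
        return None


mvcc = VersionedStore()
mvcc.write("a", "old")
reader_snapshot = mvcc.snapshot()
mvcc.write("a", "new")  # a concurrent writer commits after the reader began
seen_by_reader = mvcc.read("a", reader_snapshot)
seen_by_new_reader = mvcc.read("a", mvcc.snapshot())
```

The reader that started before the second write is never blocked and never sees a half-applied update; it simply reads the version that was current when it began.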

System Architecture

Shared-Nothing

CouchDB is a peer-based distributed database system in which each peer can serve the same data to users. Peers do not share any resources with one another.

Stored Procedures

Not Supported

Indexes

B+Tree

Documents in CouchDB are indexed by their name (document ID) and by their sequence ID. Both indexes are organized as B+Trees.

Storage Model

Custom

Because the CouchDB database file is append-only, the critical file header sits at the tail of the file, where it is read and re-appended with each append operation.

The values in the body of a file header are:

8 bits -- File format version (Currently 10)
48 bits -- Update sequence number counter. This is the sequence number that will appear in the by-sequence index for the next update.
48 bits -- Purge sequence number.
48 bits -- Purged documents pointer
16 bits -- Size of by-sequence B-tree root
16 bits -- Size of by-ID B-tree root
16 bits -- Size of local documents B-tree root

The B-tree roots, in the order of the sizes, are B-tree node pointers as described in the "Node Pointers" section.
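The fixed-size fields listed above add up to 200 bits (25 bytes). A minimal sketch of unpacking them, assuming the fields are packed back to back in big-endian order (the byte order is an assumption here, not stated in this entry); the function name `parse_header` and the dictionary keys are hypothetical.

```python
# (field name, width in bits), in the order given in the list above
HEADER_FIELDS = [
    ("version", 8),
    ("update_seq", 48),
    ("purge_seq", 48),
    ("purge_ptr", 48),
    ("seq_root_size", 16),
    ("id_root_size", 16),
    ("local_root_size", 16),
]
HEADER_BITS = sum(bits for _, bits in HEADER_FIELDS)  # 200 bits = 25 bytes


def parse_header(body: bytes) -> dict:
    """Unpack the fixed-size header fields; the B-tree roots that follow are ignored."""
    fixed = HEADER_BITS // 8
    if len(body) < fixed:
        raise ValueError("header body is shorter than the fixed fields")
    value = int.from_bytes(body[:fixed], "big")  # assumed big-endian
    remaining = HEADER_BITS
    fields = {}
    for name, bits in HEADER_FIELDS:
        remaining -= bits
        fields[name] = (value >> remaining) & ((1 << bits) - 1)
    return fields


# Build a sample header body by hand and parse it back.
sample = (
    bytes([10])
    + (42).to_bytes(6, "big")   # update sequence number
    + (0).to_bytes(6, "big")    # purge sequence number
    + (0).to_bytes(6, "big")    # purged documents pointer
    + (12).to_bytes(2, "big")   # size of by-sequence root
    + (34).to_bytes(2, "big")   # size of by-ID root
    + (0).to_bytes(2, "big")    # size of local documents root
)
header = parse_header(sample)
```

The three root sizes tell a reader how many bytes of node-pointer data to consume after the fixed fields, in the same order.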

To make the file header easy to locate, the database file is organized into 4096-byte file blocks.
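Block alignment lets a reader find the most recent header by walking backwards from the end of the file, one block at a time. The sketch below assumes that the first byte of every 4096-byte block is a flag, with 1 marking a block that begins a header and 0 marking a data block. That flag layout is an assumption not stated in this entry, and the function name `find_last_header_block` is hypothetical.

```python
BLOCK_SIZE = 4096


def find_last_header_block(data: bytes):
    """Return the offset of the last block flagged as a header, or None."""
    if not data:
        return None
    # Start at the boundary of the block that contains the final byte.
    offset = ((len(data) - 1) // BLOCK_SIZE) * BLOCK_SIZE
    while offset >= 0:
        if data[offset] == 1:  # assumed flag value for a header block
            return offset
        offset -= BLOCK_SIZE
    return None


# Three blocks: data, header, then a partially filled data block.
image = (
    bytes([0]) + b"x" * (BLOCK_SIZE - 1)
    + bytes([1]) + b"h" * (BLOCK_SIZE - 1)
    + bytes([0]) + b"d" * 10
)
header_offset = find_last_header_block(image)
```

Scanning from the tail means the reader finds the newest complete header first, which fits a file that only ever grows by appending.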

The data in the file are organized as variable-length chunks.

Query Interface

HTTP / REST

CouchDB provides a RESTful HTTP API for reading and updating database documents.
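A minimal sketch of what document requests look like, using only Python's standard library. It builds the requests without sending them. The base URL assumes a local server on CouchDB's default port 5984, and the database name `books`, the document ID, and the helper names are hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumed address of a local CouchDB node


def put_document(db, doc_id, doc):
    """Build a PUT request that creates or updates a document."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_document(db, doc_id):
    """Build a GET request that reads a document."""
    return Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


write_req = put_document("books", "moby-dick", {"title": "Moby-Dick"})
read_req = get_document("books", "moby-dick")
```

Passing either request to `urllib.request.urlopen` would send it to a running server. Updating an existing document would also require including its current `_rev` value in the body.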

Storage Architecture

Disk-oriented

CouchDB stores data on disk, and all updates are synchronously flushed to disk.

Joins

Not Supported

Data in CouchDB is stored as self-contained documents, so join operations are not provided. The usual replacement for a join is denormalization: related data is stored together with the document that needs it.
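A small illustration of the denormalization approach, using plain Python dictionaries in place of stored documents. The field names (`customer_id`, `customer`, `total`) and the helper `denormalize` are hypothetical.

```python
# Normalized form: the order only references the customer by ID.
customers = {"c1": {"name": "Ada", "city": "London"}}
order = {"_id": "o1", "customer_id": "c1", "total": 30}


def denormalize(order, customers):
    """Return a copy of the order with the customer data embedded in it."""
    doc = dict(order)
    doc["customer"] = dict(customers[order["customer_id"]])
    return doc


order_doc = denormalize(order, customers)
```

Reading `order_doc` now yields the customer's name and city in a single document fetch. The trade-off is that a change to the customer must be copied into every document that embeds it.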

Checkpoints

Non-Blocking

In CouchDB, any change to a document simply appends a new record to the database file. As a result, taking a file-system snapshot to capture the latest version of the database never blocks other operations.

Isolation Levels

Snapshot Isolation

Because of MVCC, a read request in CouchDB always sees the most recent snapshot of the database as of the moment the request began.

Compression

Naïve (Record-Level)

CouchDB performs compaction to reduce disk usage, similar to VACUUM in SQLite. The number of stored revisions (and their tombstones) can be configured through the _revs_limit URL endpoint. Compaction can be triggered either manually or automatically.
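A minimal sketch of the two requests involved, built with Python's standard library and not sent. The base URL assumes a local server on the default port 5984, the database name `books` and the helper names are hypothetical, and a real server would normally also require admin credentials for both calls.

```python
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumed address of a local CouchDB node


def compact_request(db):
    """Build a POST request that manually triggers compaction of a database."""
    return Request(
        f"{BASE_URL}/{db}/_compact",
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def revs_limit_request(db, limit):
    """Build a PUT request that sets how many revisions are tracked per document."""
    return Request(
        f"{BASE_URL}/{db}/_revs_limit",
        data=str(int(limit)).encode("ascii"),  # body is a bare integer
        method="PUT",
    )


compact_req = compact_request("books")
limit_req = revs_limit_request("books", 100)
```

Lowering the revision limit only reduces disk usage once compaction has run, so the two calls are typically used together.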

Logging

Shadow Paging

CouchDB uses shadow paging as its logging method. It only ever appends to the current database file, which is also what makes its MVCC features possible.

Website

http://couchdb.apache.org/

Source Code

https://github.com/apache/couchdb

Tech Docs

http://docs.couchdb.org/en/stable/index.html

Developer

Damien Katz

Country of Origin

US

Start Year

2005

Acquired By

Apache Software Foundation

Project Type

Open Source

Written in

Erlang

Supported languages

C, C#, Erlang, Haskell, Java, JavaScript, Lisp, Lua, Objective-C, Ocaml, Perl, PHP, PL/SQL, Python, Ruby, Smalltalk

Operating Systems

Android, BSD, iOS, Linux, OS X, Solaris, Windows

Licenses

Apache v2

Wikipedia

https://en.wikipedia.org/wiki/Apache_CouchDB