Cosmos DB

CosmosDB is a globally distributed, consistent, schema-less, multi-model document database that provides high-throughput and availability across various geographical regions. It is used to solve data storage problems of large-scale distributed Internet-scale applications. Most of the Microsoft internal services such as Bing, Office 365, Ads, etc. and many other external services use Cosmos DB for their storage needs. It provides 99.99% availability regardless of a number of regions associated with data.

History

It started as an internal project called "Florence" at Microsoft and was later named as Document DB in 2014. It was released as an Azure service called Azure Cosmos DB in 2017.

Storage Model

Hybrid

Indexes

Bw-Tree

Cosmos DB is a NoSQL document database which performs Indexing directly on document's contents. The index is a union of all documents words and can be queried on any word of any document present in the database. It is represented as a schema-agnostic tree, where the tree nodes are all possible words of document set and values are the associated documents in which the word is present. To represent this schema-less index, Bw-Tree is used. To support fast random writes on SSDs and Disks, Cosmos DB also employs Log-Structured merge trees, to store Bw-Tree modifications. It uses delta-record updates instead of in-place updates in the tree to avoid cache invalidation and write amplification on SSDs. Cosmos DB supports blind-incremental updates to its Bw-Tree, so as to perform partial writes to any record without reading it to the memory.\

As the database is distributed, Index modifications have to be replicated to all of the replicas of a data shard. Cosmos DB performs asynchronous replication of delta-records to make secondaries consistent with the primary replica. When a new document is created on the primary, it is completely analyzed to extract all of the words and these words are inserted into the Index, while also transferring the word stream to the secondaries.

Checkpoints

Non-Blocking

It performs checkpoints on its document Index (Bw-Tree) periodically to reduce the recovery time if a node fails.

System Architecture

Shared-Nothing

Cosmos DB service is deployed on several replicated shared-nothing nodes across geographical regions for high-availability, low-latency, and high throughput. Some or all of these distributed nodes form a replica set for serving requests on a data shard. Among the nodes, one of them is elected as a master to perform total-ordered writes on the data shard. Writes are done on the write-quorum (W), a subset of the replica nodes, to ensure that the data is durable. Reads are performed on read-quorum (R), a subset of replica nodes, to get the desired consistency levels (Strong, Bounded-staleness, Session, Consistent Prefix, Eventual) as configured by users.

Concurrency Control

Multi-version Concurrency Control (MVCC)

Cosmos DB is a service that runs on a massive scale distributed system on a large cluster of servers over multiple distributed geographic regions. To make sure that meta-data is consistent across the regions, Cosmos DB uses a Paxos like protocol.

Stored Procedures

Not Supported

Storage Architecture

Disk-oriented

Cosmos DB Logo
Website

https://azure.microsoft.com/en-us/services/cosmos-db/

Tech Docs

https://docs.microsoft.com/en-us/azure/cosmos-db/

Developer

Microsoft

Country of Origin

US

Start Year

2015

Former Name

DocumentDB

Project Type

Commercial

Written in

C++

Supported languages

C#, Go, Java, JavaScript, Python

Operating Systems

Windows

Licenses

Proprietary