C-Store

C-Store is a column-oriented DBMS designed for read-optimized OLAP workloads. It adopts a column-store architecture, explores various compression schemes for the decomposition storage model (DSM) along with corresponding query optimization strategies, stores data in overlapping collections of projections for both performance and availability, and employs other optimizations specific to column stores.

History

C-Store is an academic project led by Michael Stonebraker and Daniel Abadi, involving people from Brown University, Brandeis University, MIT and the University of Massachusetts Boston. It was later commercialized as Vertica.

Concurrency Control

Two-Phase Locking (Deadlock Detection)

The system maintains a distributed lock table. Deadlocks are resolved via timeouts: when a lock wait exceeds the timeout, one of the deadlocked transactions is aborted. The timeout therefore serves as the deadlock detection mechanism.
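The timeout-based approach can be illustrated with a minimal, single-process sketch. It is not C-Store's implementation: the `LockTable` class, its `timeout_s` parameter, and the use of exclusive locks only are all assumptions made for illustration.

```python
import threading
import time


class LockTable:
    """Toy exclusive-lock table; a waiter that times out should be aborted."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s      # hypothetical timeout parameter
        self.owners = {}                # item -> id of the transaction holding it
        self.cond = threading.Condition()

    def acquire(self, txn, item):
        """Return True once txn holds item, or False if the wait timed out."""
        deadline = time.monotonic() + self.timeout_s
        with self.cond:
            # Wait while some other transaction owns the item.
            while self.owners.get(item, txn) != txn:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False        # caller aborts txn and releases its locks
                self.cond.wait(remaining)
            self.owners[item] = txn
            return True

    def release_all(self, txn):
        """Release every lock held by txn (at commit or abort)."""
        with self.cond:
            for item in [i for i, t in self.owners.items() if t == txn]:
                del self.owners[item]
            self.cond.notify_all()


table = LockTable(timeout_s=0.05)
table.acquire("T1", "A")
table.acquire("T2", "B")
if not table.acquire("T1", "B"):   # T1 waits on T2 and times out
    table.release_all("T1")        # abort T1, which lets T2 proceed
```

In this sketch the timed-out transaction is the one aborted, which is the simplest policy; the text above only says that one of the deadlocked transactions is aborted.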

Data Model

Relational

Logically, C-Store supports the standard relational data model, where a database contains a collection of tables and a table contains a collection of attributes.

Indexes

B+Tree, BitMap

Regardless of how a column is encoded (e.g. RLE, bit-map encoding, or block-oriented delta encoding), every column uses B-tree indexes. The system also stores join indices, which stitch together the records of a table from its columns stored in different projections. A column that is ordered by another column in the same projection and contains few distinct values is encoded using bit-map encoding plus RLE, so the paper also describes extensive use of bitmap indexes.
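As a rough illustration of two of the encodings named above, the following sketch run-length encodes a sorted column and bitmap-encodes an unsorted column with few distinct values. The function names, the 0-based positions, and the exact triple layout are assumptions for illustration, not C-Store's on-disk format.

```python
def rle_encode(values):
    """Encode a column as (value, start_position, run_length) triples."""
    runs = []
    for pos, v in enumerate(values):
        if runs and runs[-1][0] == v:
            value, start, length = runs[-1]
            runs[-1] = (value, start, length + 1)   # extend the current run
        else:
            runs.append((v, pos, 1))                # start a new run
    return runs


def bitmap_encode(values):
    """Encode a low-cardinality column as one bit list per distinct value."""
    bitmaps = {}
    for pos, v in enumerate(values):
        bitmaps.setdefault(v, [0] * len(values))[pos] = 1
    return bitmaps


print(rle_encode(["a", "a", "b", "b", "b"]))   # [('a', 0, 2), ('b', 2, 3)]
print(bitmap_encode([0, 1, 0]))                # {0: [1, 0, 1], 1: [0, 1, 0]}
```

Each bit list produced by `bitmap_encode` could in turn be passed to `rle_encode`, which mirrors the "bit-map encoding plus RLE" combination in spirit.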

Isolation Levels

Snapshot Isolation

The paper describes its support for snapshot isolation in detail but does not mention whether other isolation levels are supported.

Joins

Nested Loop Join, Hash Join, Sort-Merge Join

These join algorithms are found in the C-Store source code.

Logging

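Logical Logging

The paper states: "We use logical logging (as in ARIES), since physical logging would result in many log records, due to the nature of the data structures in WS." ARIES itself is usually described as using physiological logging that also permits logical undo, so the comparison presumably refers to that logical component.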

Query Compilation

Not Supported

Query Interface

SQL

Logically, users interact with C-Store in SQL, with standard SQL semantics.

Storage Architecture

Disk-oriented

Each column is stored as a separate file containing a list of 64 KB blocks, each packing as many values as possible.
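The block layout can be sketched as follows, assuming a block size of 64 KiB and uncompressed 4-byte integers. Both the value format and the helper names `pack_column` and `read_value` are hypothetical; real C-Store blocks hold compressed data whose layout depends on the encoding.

```python
import struct

BLOCK_SIZE = 64 * 1024                       # assumed block size in bytes
VALUE_FMT = "<i"                             # assumed: 4-byte little-endian ints
VALUE_SIZE = struct.calcsize(VALUE_FMT)
VALUES_PER_BLOCK = BLOCK_SIZE // VALUE_SIZE  # as many values as fit in a block


def pack_column(values):
    """Split a column of integers into blocks, each filled as far as possible."""
    blocks = []
    for i in range(0, len(values), VALUES_PER_BLOCK):
        chunk = values[i:i + VALUES_PER_BLOCK]
        blocks.append(struct.pack(f"<{len(chunk)}i", *chunk))
    return blocks


def read_value(blocks, position):
    """Fetch the value at a 0-based column position from the block holding it."""
    block_no, offset = divmod(position, VALUES_PER_BLOCK)
    return struct.unpack_from(VALUE_FMT, blocks[block_no], offset * VALUE_SIZE)[0]


blocks = pack_column(list(range(40000)))
print(len(blocks), read_value(blocks, 20000))   # 3 20000
```

With fixed-width values, a position maps directly to a block number and an offset, so a single value can be read without decoding the rest of the column.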

Storage Model

Decomposition Storage Model (Columnar)

As the name suggests, C-Store is a column store. Notably, both the read-optimized store (RS) and the update/insert-oriented writable store (WS) adopt the column-store architecture.

Stored Procedures

Not Supported

The paper does not mention stored procedures, so they are presumed to be unsupported.

System Architecture

Shared-Nothing

The architecture was designed for an environment of grid computers containing a large number of nodes, each with private disk and memory. The data is horizontally partitioned across the disks of the nodes.
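Horizontal partitioning can be sketched as below. The text above does not specify how rows are assigned to nodes, so range partitioning on a key is assumed here purely for illustration; `partition_rows` and the `boundaries` parameter are hypothetical names.

```python
from bisect import bisect_right


def partition_rows(rows, key, boundaries):
    """Assign each row to a node by comparing row[key] with sorted boundaries.

    Node 0 receives keys below boundaries[0], node i receives keys in
    [boundaries[i-1], boundaries[i]), and the last node receives the rest,
    for len(boundaries) + 1 nodes in total.
    """
    nodes = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        nodes[bisect_right(boundaries, row[key])].append(row)
    return nodes


rows = [{"id": 50}, {"id": 100}, {"id": 250}]
nodes = partition_rows(rows, "id", [100, 200])
print([len(n) for n in nodes])   # [1, 1, 1]
```

Because each node stores only its own rows on its private disk, a scan restricted to a key range needs to touch only the nodes whose ranges overlap it.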

Views

Materialized Views

The C-Store source code refers to a 'write store materialized view'.

Website

http://db.csail.mit.edu/projects/cstore/

Developer

Massachusetts Institute of Technology, Brown University

Country of Origin

US

Start Year

2005

End Year

2008

Project Type

Academic, Open Source

Written in

C++

Supported languages

C++

Operating Systems

Linux

Licenses

BSD

Wikipedia

https://en.wikipedia.org/wiki/C-Store