SciDB

History

SciDB emerged from the [Extremely Large Data Base (XLDB) Conference](https://www.xldb.org/about/), first hosted in 2007. The conference was organized by the Scalable Data Systems team at the [SLAC National Accelerator Laboratory](https://www6.slac.stanford.edu/) to address the gap between the database systems available at the time and the needs of data-intensive scientific projects such as the [Large Synoptic Survey Telescope (LSST) astronomical survey](https://lsst.slac.stanford.edu/). Dave DeWitt and Mike Stonebraker agreed to lead the development of a database that would fulfill the needs of these projects. A SciDB workshop was hosted at the second XLDB conference in 2008, and code development began the same year. In 2009, Mike Stonebraker and Marilyn Matz co-founded [Paradigm4](https://www.paradigm4.com/). Paradigm4's team developed SciDB into a commercial software product and continues to develop two versions of it: an open-source Community Edition and a proprietary Enterprise Edition that offers additional functionality and customer-specific solutions.

Storage Architecture

Disk-oriented

Query Interface

Custom API

Storage Model

Custom

Compression

Run-Length Encoding, Null Suppression

SciDB allows users to specify, at array creation time, how each attribute of the array will be compressed. The default is no compression; the other options are zlib, bzlib, and null filter (null suppression). SciDB stores data by attribute, vertically partitioning each logical chunk of an array into single-attribute physical chunks, so the specified compression is applied chunk by chunk. If some parts of a chunk are accessed more often than others, repeatedly decompressing and recompressing the whole chunk adds overhead; in that case SciDB can partition the chunk into tiles and compress tile by tile. Run-length encoding is used to compress runs of repeated values. In addition, the compression engine in SciDB's storage manager can split or group logical chunks to optimize memory usage while staying within the buffer pool's fixed-size slots.
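The ideas in this paragraph can be illustrated with a small Python sketch. It is not SciDB code and does not reflect SciDB's on-disk format: the function names, the `CODECS` table, and the use of `repr` as a serialization step are all assumptions made for illustration. It shows a logical chunk being split into single-attribute columns, each column compressed with its own codec, and run-length encoding collapsing repeated values.

```python
import bz2
import zlib
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal consecutive values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical codec table keyed by the option names used in the text.
# Python's bz2 module stands in for the "bzlib" option here.
CODECS = {
    "none": lambda raw: raw,
    "zlib": zlib.compress,
    "bzlib": bz2.compress,
}


def split_by_attribute(cells, attribute_names):
    """Vertically partition a logical chunk (a list of cell tuples) into one column per attribute."""
    return {name: [cell[i] for cell in cells] for i, name in enumerate(attribute_names)}


def compress_chunk(columns, codec_per_attribute):
    """Compress each single-attribute column with the codec chosen for that attribute."""
    packed = {}
    for name, values in columns.items():
        raw = repr(values).encode("utf-8")  # toy serialization, for illustration only
        codec = CODECS[codec_per_attribute.get(name, "none")]  # default: no compression
        packed[name] = codec(raw)
    return packed


# Example: a four-cell chunk with two attributes.
cells = [(20.5, 0), (20.5, 0), (20.5, 1), (21.0, 1)]
columns = split_by_attribute(cells, ["temperature", "flag"])
packed = compress_chunk(columns, {"temperature": "zlib"})  # "flag" stays uncompressed
runs = rle_encode(columns["temperature"])  # [(20.5, 3), (21.0, 1)]
```

Because each attribute lives in its own physical chunk, a query that reads only `temperature` in this sketch would never need to decompress `flag`.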

Data Model

Array / Matrix

System Architecture

Shared-Nothing

Stored Procedures

Supported

Website

http://scidb.org/

Source Code

https://forum.paradigm4.com/t/index-of-scidb-releases/773

Developer

Paradigm4

Country of Origin

US

Start Year

2008

Project Type

Academic, Commercial, Open Source

Written in

C++

Supported languages

C++, Python, R

Embeds / Uses

PostgreSQL, RocksDB

Operating Systems

Linux

Licenses

AGPL v3, Proprietary

Wikipedia

https://en.wikipedia.org/wiki/SciDB