TileDB is a storage engine for storing and accessing both dense and sparse multi-dimensional arrays. Its key idea is to store array elements in collections called fragments, which can be either dense or sparse. Each fragment stores its data in data tiles. In dense fragments, each data tile covers a fixed-size chunk of the index space; in sparse fragments, each data tile holds a fixed number of elements. TileDB also supports parallel I/O and is fully multi-threaded.[04][05]
- Website: https://www.tiledb.com/[01]
- Source Code: https://github.com/TileDB-Inc/TileDB[02]
- Tech Docs: https://docs.tiledb.com/main[03]
- Twitter: @tiledb
- Developers: TileDB, Inc.
- Country of Origin: US
- Project Types: Commercial, Open Source
- Written in: C++
- Inspired By: SciDB
- License: MIT License
TileDB is designed to store many different types of data, such as genomic data, machine learning model parameters, imaging data, and LiDAR data.[04][05]
History[06]
TileDB originated at the Intel Science and Technology Center for Big Data, a collaboration between Intel Labs and MIT. The research project was published in a VLDB 2017 paper. TileDB, Inc. was founded in February 2017 to further develop and maintain the DBMS.
Compression[05]
TileDB supports the following compressors: GZIP, Zstandard, LZ4, RLE, Bzip2, and Double-delta. Double-delta is a compressor created for TileDB; it is similar to the delta-of-delta encoding used in Facebook's Gorilla system.
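The idea behind double-delta encoding can be sketched in a few lines of Python. This is a minimal illustration of the general technique (storing the differences between consecutive differences), not TileDB's actual bit-level format; the function names `double_delta_encode` and `double_delta_decode` and the sample values are hypothetical.

```python
from typing import List


def double_delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, the first delta, then the deltas of deltas."""
    if len(values) < 2:
        return list(values)
    out = [values[0], values[1] - values[0]]
    prev_delta = out[1]
    for prev, cur in zip(values[1:], values[2:]):
        delta = cur - prev
        out.append(delta - prev_delta)
        prev_delta = delta
    return out


def double_delta_decode(encoded: List[int]) -> List[int]:
    """Invert double_delta_encode by re-accumulating the deltas."""
    if len(encoded) < 2:
        return list(encoded)
    values = [encoded[0], encoded[0] + encoded[1]]
    delta = encoded[1]
    for dd in encoded[2:]:
        delta += dd
        values.append(values[-1] + delta)
    return values


# Hypothetical, almost regularly spaced integers (e.g. timestamps).
timestamps = [1000, 1010, 1020, 1030, 1041, 1051]
encoded = double_delta_encode(timestamps)
decoded = double_delta_decode(encoded)
print(encoded)
```

For nearly regular sequences most encoded entries are zero or close to it, so a real compressor can store them in very few bits.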
Concurrency Control[07]
TileDB provides no transactional support in the current version; it guarantees only that individual reads and writes are atomic. Users who need concurrency control can build a transaction manager on top of TileDB.
Data Model[08][05]
TileDB uses a multi-dimensional array format that handles both sparse and dense data. An array is composed of fragments, where each fragment is a snapshot containing the cells written by a single write operation. Fragments are categorised as sparse or dense. Sparse fragments store their elements in a global order. Dense fragments store their elements in regular chunks of the index space.
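As a rough illustration of the two layouts, the following Python sketch groups cells the way the description above suggests: dense cells by fixed-size chunks of the index space, sparse cells by position in a global order. It is a toy model, not TileDB's implementation; the names `dense_tile_id` and `sparse_tiles`, the 2D row-major ordering, and the chunk size and capacity values are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def dense_tile_id(coord: Coord, tile_extent: Coord) -> Coord:
    """Map a cell to the fixed-size chunk of the index space that contains it."""
    return (coord[0] // tile_extent[0], coord[1] // tile_extent[1])


def sparse_tiles(cells: Dict[Coord, float], capacity: int) -> List[List[Coord]]:
    """Sort non-empty cells into one global order (row-major here, as an
    assumption) and cut the sorted run into groups of `capacity` cells."""
    ordered = sorted(cells)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


# Dense case: a 4x4 array with 2x2 chunks always yields 4 chunks of 4 cells.
dense_groups: Dict[Coord, List[Coord]] = {}
for r in range(4):
    for c in range(4):
        dense_groups.setdefault(dense_tile_id((r, c), (2, 2)), []).append((r, c))

# Sparse case: only non-empty cells are stored, two per group.
sparse_cells = {(0, 1): 1.0, (3, 3): 2.0, (1, 0): 3.0, (2, 2): 4.0, (0, 3): 5.0}
sparse_groups = sparse_tiles(sparse_cells, capacity=2)

print(dense_groups[(0, 0)])
print(sparse_groups)
```

In the dense case the grouping depends only on coordinates, while in the sparse case it depends on which cells are actually populated.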
Isolation Levels[07]
Because TileDB provides no transaction support in the current version, it guarantees no isolation level.
Query Interface[09][10]
TileDB provides APIs for the following languages: SQL, C, C++, Python, Java, R, and Go. The Python API is under further development and subject to change.
Storage Architecture[11]
TileDB uses a disk-oriented storage format that can store dense and sparse array data and support fast updates.
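One way to picture how fragment-based storage can keep updates cheap is the toy model below. It assumes, following the Data Model section, that each write creates a new fragment, and additionally assumes that a read resolves overlapping cells in favor of the most recent fragment. The `Fragment` class, the `write` and `read` helpers, and the integer timestamps are hypothetical names for illustration; they are not TileDB's API or on-disk format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


@dataclass(frozen=True)
class Fragment:
    """A snapshot of the cells written by one write operation."""
    timestamp: int
    cells: Dict[Coord, float]


def write(fragments: List[Fragment], cells: Dict[Coord, float]) -> None:
    """Append a new fragment; existing fragments are never modified."""
    fragments.append(Fragment(timestamp=len(fragments) + 1, cells=dict(cells)))


def read(fragments: List[Fragment], coord: Coord) -> Optional[float]:
    """Return the value from the newest fragment that contains `coord`."""
    for fragment in sorted(fragments, key=lambda f: f.timestamp, reverse=True):
        if coord in fragment.cells:
            return fragment.cells[coord]
    return None


array: List[Fragment] = []
write(array, {(0, 0): 1.0, (0, 1): 2.0})  # initial load
write(array, {(0, 1): 9.0})               # update touches only one cell

print(read(array, (0, 0)), read(array, (0, 1)), read(array, (5, 5)))
```

In this model an update only appends data, so its cost depends on the size of the update rather than on the size of the array; the price is that reads must consult several fragments.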
GenomicsDB
Citations
13 sources
- [01] TileDB • Designed for Discovery tiledb.com
- [02] https://github.com/TileDB-Inc/TileDB github.com
- [03] https://docs.tiledb.com/main tiledb.com
- [04] https://docs.tiledb.io/en/stable/index.html tiledb.io
- [05] Academy • TileDB tiledb.com
- [06] Academy • TileDB tiledb.io
- [07] Questions about tileDB features - TileDB Forum tiledb.com
- [08] https://people.csail.mit.edu/stavrosp/papers/vldb2017/VLDB17_TileDB.pdf mit.edu
- [09] https://tiledb.io/press/tiledb-presto tiledb.io
- [10] Academy • TileDB tiledb.com
- [11] https://docs.tiledb.io/en/stable tiledb.io
- [12] https://docs.tiledb.io/en/stable/introduction.html?highlight=distributed tiledb.io
- [13] https://tiledb.com/blog/tiledb-a-refresher-on-what-and-why tiledb.com