TileDB

Revision #37 from 05/01/2023 5:35 p.m.

NoSQL

TileDB is an embedded storage engine designed to store and access both dense and sparse multi-dimensional arrays. Its key idea is to store array elements in collections called fragments, each of which is either dense or sparse. Each fragment stores its data in data tiles. In dense fragments, the capacity of a data tile is bounded by a fixed chunk size; in sparse fragments, it is bounded by a fixed number of elements. TileDB also supports parallel I/O and is fully multi-threaded. It is designed to store many kinds of data, such as genomic data, machine learning model parameters, imaging data, and LiDAR data.

TileDB, Inc. also offers TileDB Cloud, a closed-source SaaS product that adds features such as serverless UDFs and task graphs for building custom workflows of TileDB tasks. The architecture of TileDB Cloud is centered on a REST API service and uses the company's embedded, open-source storage engine.

History

TileDB was invented at the Intel Science and Technology Center for Big Data in collaboration with Intel Labs and MIT. The research project was published in a VLDB 2017 paper. TileDB, Inc. was founded in February 2017 to further develop and maintain the DBMS.

Storage Model

Decomposition Storage Model (Columnar)

TileDB uses a decomposition storage model (DSM) to store attribute data, so the values of each attribute are stored separately. Attribute data is written to disk in global order, which is determined first by tile order and then by cell order. A TileDB array has multiple (say, n) dimensions, and each dimension has a tile extent. Together, the tile extents determine the shape of a tile and so group the data into smaller blocks. Tile order specifies how these blocks are ordered relative to one another, and cell order specifies how the cells within a tile are ordered; each can be either row-major or column-major.
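To make the tile-order-then-cell-order rule concrete, here is a minimal pure-Python sketch for a 2D dense array. It does not use the TileDB library; the function name `global_order` and its parameters are hypothetical and chosen only for illustration.

```python
def global_order(shape, tile_extents, tile_order="row-major", cell_order="row-major"):
    """List the cells of a 2D array in global order: tile order, then cell order.

    Illustrative only; names and signature are assumptions, not TileDB's API.
    """
    rows, cols = shape
    t_rows, t_cols = tile_extents
    tile_rows = range(0, rows, t_rows)
    tile_cols = range(0, cols, t_cols)

    # Step 1: order the tiles themselves.
    if tile_order == "row-major":
        tiles = [(r, c) for r in tile_rows for c in tile_cols]
    else:
        tiles = [(r, c) for c in tile_cols for r in tile_rows]

    # Step 2: order the cells inside each tile.
    ordered = []
    for r0, c0 in tiles:
        cell_rows = range(r0, min(r0 + t_rows, rows))
        cell_cols = range(c0, min(c0 + t_cols, cols))
        if cell_order == "row-major":
            ordered.extend((r, c) for r in cell_rows for c in cell_cols)
        else:
            ordered.extend((r, c) for c in cell_cols for r in cell_rows)
    return ordered


# A 4x4 array with 2x2 tiles: all four cells of the top-left tile come first.
print(global_order((4, 4), (2, 2))[:8])
```

With row-major tile and cell order, the first eight cells cover the top-left tile and then the top-right tile, rather than the first two full rows of the array.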

Storage Organization

Sorted Files

TileDB's storage manager places data according to the coordinates of the cell being inserted, which makes the sorted files model the closest match to its implementation.

Stored Procedures

Not Supported

System Architecture

Embedded

TileDB is an embeddable storage library.

Views

Not Supported

Checkpoints

Not Supported

TileDB does not support checkpoints.

Compression

Dictionary Encoding, Delta Encoding, Run-Length Encoding

TileDB supports compressors, which operate on data tiles. The types of compressors it supports include bzip2, dictionary, double-delta, gzip, LZ4, RLE, and Zstandard. It also supports a few data filters that reduce data size, such as the bit width reduction filter, float scaling filter, positive delta filter, and WebP filter.

The custom compressors and filters are described below:

• The double delta compressor uses the timestamp data compression scheme first mentioned in the VLDB paper on the Gorilla time series DBMS. However, TileDB's compressor uses a fixed bit-size instead of a variable bit-size.

• The dictionary encoding filter is a lossless compressor that builds a dictionary of all the unique strings in the input data and stores the dictionary indexes, rather than the strings themselves, in memory.

• The bit width reduction filter takes input data of an unsigned integer type and compresses it to a smaller bit width if possible.

• The float scaling filter is a lossy compressor that takes input data of a floating point type. Given a scale factor, an offset, and a byte width, the filter computes round((input_data[i] - offset) / scale), casts the result to an integer type with the specified byte width, and stores that in main memory.

• The positive delta filter is a delta encoding filter that ensures that it only stores positive deltas.

• The WebP filter takes raw colorspace values and converts them to WebP image format. This filter supports lossy compression of imaging data.
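Two of the filters above are simple enough to sketch in a few lines. The following pure-Python example illustrates the float scaling formula and dictionary encoding as described in the text; it is not TileDB's implementation, all function names are hypothetical, and the choice of a signed integer range for the float scaling output is an assumption.

```python
def float_scale_encode(values, scale, offset, byte_width):
    """Lossy encoding: round((x - offset) / scale), checked against byte_width.

    Assumption: the target is a signed integer of byte_width bytes.
    Note that Python's round() rounds exact halves to the nearest even integer.
    """
    lo = -(1 << (8 * byte_width - 1))
    hi = (1 << (8 * byte_width - 1)) - 1
    encoded = []
    for x in values:
        q = round((x - offset) / scale)
        if not lo <= q <= hi:
            raise OverflowError(f"{x} does not fit in {byte_width} byte(s)")
        encoded.append(q)
    return encoded


def float_scale_decode(encoded, scale, offset):
    """Approximate inverse: the precision lost to rounding is not recovered."""
    return [q * scale + offset for q in encoded]


def dictionary_encode(strings):
    """Lossless encoding: unique strings go in a dictionary, data becomes indexes."""
    dictionary, index_of, indexes = [], {}, []
    for s in strings:
        if s not in index_of:
            index_of[s] = len(dictionary)
            dictionary.append(s)
        indexes.append(index_of[s])
    return dictionary, indexes


def dictionary_decode(dictionary, indexes):
    return [dictionary[i] for i in indexes]


codes = float_scale_encode([10.0, 10.26, 11.0], scale=0.5, offset=10.0, byte_width=1)
print(codes, float_scale_decode(codes, scale=0.5, offset=10.0))
print(dictionary_encode(["chr1", "chr2", "chr1", "chrX"]))
```

The float scaling round trip shows why the filter is lossy: 10.26 is stored as the integer 1 and decodes to 10.5, whereas the dictionary round trip returns the input strings unchanged.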

Concurrency Control

Not Supported

TileDB does not provide transactional support, as it is a storage engine. It only guarantees atomic reads and writes. TileDB also supports data versioning, which is not MVCC, but can provide some of the functionality of MVCC. Support for data versioning within TileDB is built into the file format. The TileDB file format stores an array write as a separate fragment, which includes timestamp information. With this information, it is possible to read an array that has writes only within a specified time interval.
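The fragment-based versioning described above can be sketched as follows. This is a simplified in-memory model, not TileDB's file format or API: each fragment is represented as a hypothetical `(timestamp, cells)` pair, and a read merges only the fragments whose timestamps fall inside the requested interval, with later writes overriding earlier ones.

```python
def read_array(fragments, start=0, end=float("inf")):
    """Merge fragments written within [start, end]; later timestamps win.

    Each fragment is a hypothetical (timestamp, {coords: value}) pair.
    """
    result = {}
    for timestamp, cells in sorted(fragments, key=lambda f: f[0]):
        if start <= timestamp <= end:
            result.update(cells)
    return result


# Three separate writes, each stored as its own timestamped fragment.
fragments = [
    (1, {(0, 0): "a", (0, 1): "b"}),
    (2, {(0, 1): "B"}),  # overwrites cell (0, 1)
    (3, {(1, 1): "c"}),
]

print(read_array(fragments, end=1))  # the array as of timestamp 1
print(read_array(fragments))         # the latest state
```

Because no fragment is modified in place, reading an older state only requires ignoring the fragments written after the chosen timestamp.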

Data Model

Array / Matrix

TileDB's data model supports the storage of both dense and sparse arrays.

A TileDB array can have any number of dimensions; the set of dimensions for an array is called its domain. For dense arrays, the dimension types must be uniform, and they must all be integer, datetime, or time types, which are all stored internally as integers. TileDB restricts dense arrays to integer-typed dimensions so that coordinates can be implicitly defined. For sparse arrays, the dimension types in a domain can be heterogeneous (e.g. float or string), and coordinates are explicitly stored in memory.

An array element ("cell") is defined by a unique set of dimension values or coordinates. In dense arrays, all cells must store exactly one value. In sparse arrays, cells can be empty, store one value, or store multiple values. Each logical cell contains the data from the defined attributes in the array schema. Attributes can have heterogeneous types for both sparse and dense arrays.

Foreign Keys

Not Supported

TileDB does not support foreign keys.

Indexes

R-Tree

TileDB uses an R-tree as an index to implement sparse array slicing. On write, TileDB builds an R-tree index on the non-empty cells of the sparse array. To do this, it groups the coordinates of the non-empty cells into minimum bounding rectangles, then recursively groups these rectangles into a tree structure. On read, TileDB determines which minimum bounding rectangles overlap the query coordinates. Then, it uses parallel processing to collect these rectangles, decompress them, individually check the coordinates of the data collected, and retrieve the attribute data that matches the query.
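The read path above can be illustrated with a small pure-Python sketch. It builds only the leaf level of the index (one minimum bounding rectangle per group of cells) and omits the recursive grouping into upper tree levels, parallelism, and decompression; the function names and the `capacity` parameter are hypothetical.

```python
def build_mbrs(coords, capacity):
    """Group sorted coordinates into tiles of `capacity` cells, one MBR per tile."""
    coords = sorted(coords)
    mbrs = []
    for i in range(0, len(coords), capacity):
        tile = coords[i:i + capacity]
        ndim = len(tile[0])
        low = tuple(min(c[d] for c in tile) for d in range(ndim))
        high = tuple(max(c[d] for c in tile) for d in range(ndim))
        mbrs.append((low, high, tile))
    return mbrs


def overlaps(mbr_low, mbr_high, q_low, q_high):
    """True if the rectangle and the query range intersect in every dimension."""
    return all(
        ml <= qh and ql <= mh
        for ml, mh, ql, qh in zip(mbr_low, mbr_high, q_low, q_high)
    )


def slice_sparse(mbrs, q_low, q_high):
    """Return the non-empty cells that fall inside the query range."""
    hits = []
    for low, high, tile in mbrs:
        if not overlaps(low, high, q_low, q_high):
            continue  # the whole tile is skipped without inspecting its cells
        # An overlapping MBR may still hold cells outside the query,
        # so each coordinate is checked individually.
        hits.extend(
            c for c in tile
            if all(ql <= x <= qh for ql, x, qh in zip(q_low, c, q_high))
        )
    return hits


cells = [(1, 1), (1, 4), (2, 2), (5, 5), (6, 7), (8, 8)]
index = build_mbrs(cells, capacity=2)
print(slice_sparse(index, (1, 1), (2, 3)))
```

In this example the third rectangle does not overlap the query and is never scanned, while the first two overlap it but each contributes only one matching cell.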

Isolation Levels

Not Supported

TileDB does not provide transaction support currently, so no transaction isolation is guaranteed.

Joins

Not Supported

TileDB does not support join operations currently.

Logging

Not Supported

TileDB does not support logging currently.

Parallel Execution

Intra-Operator (Horizontal)

TileDB uses intra-operator parallel execution for both read and write queries. The operations it parallelizes most heavily are I/O (reading and writing) and tile filtering/unfiltering. I/O tasks are parallelized per attribute, and within each attribute per data tile. Tile filtering tasks are parallelized per attribute, within each attribute per data tile, and within each data tile per filter chunk. The chunk size is a parameter that defaults to 64 KB.

This parallelism is implemented via static thread pools. TileDB uses both a compute task thread pool and an I/O task thread pool to help parallelize execution. It includes two thread pools to ensure that I/O tasks do not overload CPU-bound tasks during execution. Internally, a TileDB thread pool is an array of std::thread, and tasks to be executed with this thread pool are kept in a queue.
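As a rough analogy for the two-pool design, the sketch below uses Python's `concurrent.futures.ThreadPoolExecutor` in place of TileDB's C++ thread pools. An I/O pool loads tiles from a simulated in-memory store, and a separate compute pool filters each tile one chunk at a time. The `storage` dictionary, the pool sizes, and the use of zlib as the filter are all assumptions made for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # 64 KB, the default chunk size mentioned above


def split_chunks(tile, chunk_size=CHUNK_SIZE):
    """Split one tile's bytes into filter chunks."""
    return [tile[i:i + chunk_size] for i in range(0, len(tile), chunk_size)]


def filter_all(storage):
    """Load every (attribute, tile) entry, then compress each chunk as its own task.

    `storage` is a hypothetical dict mapping (attribute, tile_index) -> bytes.
    """
    with ThreadPoolExecutor(max_workers=2) as io_pool, \
            ThreadPoolExecutor(max_workers=4) as compute_pool:
        keys = sorted(storage)
        # I/O tasks: one per (attribute, tile) pair, on the I/O pool.
        tiles = list(io_pool.map(storage.__getitem__, keys))
        # Compute tasks: one per chunk of every tile, on the compute pool.
        futures = [
            [compute_pool.submit(zlib.compress, chunk) for chunk in split_chunks(tile)]
            for tile in tiles
        ]
        return {key: [f.result() for f in futs] for key, futs in zip(keys, futures)}


storage = {("a1", 0): bytes(150_000), ("a2", 0): b"xy" * 1000}
filtered = filter_all(storage)
print({key: len(chunks) for key, chunks in filtered.items()})
```

All tasks are submitted from the calling thread rather than from inside other pool tasks, which avoids the deadlock that can occur when a fixed-size pool's workers block on tasks queued behind them.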

Query Compilation

Not Supported

TileDB does not support query compilation currently.

Query Interface

Custom API, SQL

TileDB has APIs in the following languages: C, C++, C#, Python, Java, R, and Go. SQL can be run on top of TileDB in several ways. TileDB-SQL-Py is a Python package that lets users run SQL queries from a Python environment. In addition, the MariaDB client REPL, the TileDB-Presto connector, and the TileDB-Trino connector can be used to run SQL queries directly.

Storage Architecture

Disk-oriented, In-Memory

By default, TileDB uses a disk-oriented storage architecture (POSIX filesystem or HDFS). TileDB also supports storing data on object stores such as AWS S3, Azure Blob Storage, Google Cloud Storage, and MinIO. It can be configured to store data in memory via a RAM backend.

Storage Format

Apache Arrow

TileDB supports interoperation functionality with Apache Arrow.

TileDB's main storage format is a multi-file format that stores the array schema, fragments, consolidated fragment metadata, commits, and the array metadata. The array schema directory contains multiple files, each labelled with a timestamp. Because TileDB supports array schema modification, these timestamps are needed to read data as of a given time with the appropriate schema. Fragments are timestamped writes to a TileDB array, and each fragment has its own directory. That directory stores the attribute and dimension data along with the fragment metadata, a file that contains important information about the fragment, such as the name of its array schema and index information. The consolidated fragment metadata exists mainly as a read query optimization: it is a small file that contains the footers of all the fragment metadata files. The commit files mainly serve as indicators that fragment creation succeeded. Lastly, the array metadata files store user-defined key-value pairs that can be retrieved by querying the TileDB array.
