TileDB is an embedded storage engine designed to store and access both dense and sparse multi-dimensional arrays. The key idea of TileDB is that it stores array elements in collections called fragments, which can be either dense or sparse. Each fragment stores its data in data tiles. In dense fragments, the capacity of a data tile is limited by a fixed chunk size. In sparse fragments, the capacity of a data tile is limited by a fixed number of elements. TileDB also supports parallel I/O and is fully multi-threaded. It is designed to store many different types of data, such as genomic data, machine learning model parameters, imaging data, and LiDAR data.
TileDB Inc. also has a cloud offering called TileDB Cloud SaaS, a closed-source offering of TileDB with additional features, such as serverless UDFs and task graphs for building custom workflows of TileDB tasks. The architecture of TileDB Cloud is centered around a REST API service and uses the company's embedded, open-source storage engine.
TileDB was invented at the Intel Science and Technology Center for Big Data in collaboration with Intel Labs and MIT. The research project was published in a VLDB 2017 paper. TileDB, Inc. was founded in February 2017 to further develop and maintain the DBMS.
Dictionary Encoding, Delta Encoding, Run-Length Encoding
TileDB supports compressors, which operate on data tiles. The types of compressors it supports include bzip2, dictionary, double-delta, gzip, LZ4, RLE, and Zstandard. It also supports a few data filters that reduce data size, such as the bit width reduction filter, float scaling filter, positive delta filter, and WebP filter.
The custom compressors and filters are detailed below:
• The double delta compressor uses the timestamp data compression scheme first mentioned in the VLDB paper on the Gorilla time series DBMS. However, TileDB's compressor uses a fixed bit-size instead of a variable bit-size.
• The dictionary encoding filter is a lossless compressor that computes a dictionary of all the unique strings in the input data and stores the indexes of the dictionary instead of the strings themselves in memory.
• The bit width reduction filter takes input data with an unsigned integer type and compresses it to a smaller bit width if possible.
• The float scaling filter is a lossy compressor that takes input data with a floating point type. Given arguments for a scale factor, an offset, and a byte width, the filter computes round((input_data[i] - offset) / scale), casts the result to an integer type with the specified byte width, and stores that value.
• The positive delta filter is a delta encoding filter that ensures that it only stores positive deltas.
• The WebP filter takes raw colorspace values and converts them to WebP image format. This filter supports lossy compression of imaging data.
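The ideas behind three of these filters can be sketched in a few lines of plain Python. This is only an illustration of the techniques as described above, not TileDB's implementation: the function names are hypothetical, the double-delta sketch keeps plain integers rather than packing them into a fixed bit size, and the float scaling sketch skips the cast to a specific byte width.

```python
def dictionary_encode(strings):
    """Replace each string with its index in a dictionary of unique strings."""
    dictionary = []   # unique strings, in order of first appearance
    index_of = {}     # string -> position in `dictionary`
    indexes = []
    for s in strings:
        if s not in index_of:
            index_of[s] = len(dictionary)
            dictionary.append(s)
        indexes.append(index_of[s])
    return dictionary, indexes


def double_delta(values):
    """Return the first value, the first delta, and the deltas of the deltas.

    Assumes at least two input values.
    """
    deltas = [b - a for a, b in zip(values, values[1:])]
    delta_of_deltas = [b - a for a, b in zip(deltas, deltas[1:])]
    return values[0], deltas[0], delta_of_deltas


def float_scale(values, scale, offset):
    """Lossy step: store round((v - offset) / scale) as an integer."""
    return [round((v - offset) / scale) for v in values]


def float_unscale(stored, scale, offset):
    """Approximate inverse of float_scale."""
    return [s * scale + offset for s in stored]


dictionary, indexes = dictionary_encode(["a", "b", "a", "c"])
first, first_delta, dod = double_delta([100, 110, 120, 131])
stored = float_scale([1.0, 1.5, 2.26], scale=0.5, offset=1.0)
restored = float_unscale(stored, scale=0.5, offset=1.0)
```

Regularly spaced timestamps produce deltas of deltas that are mostly zero, which is why the double-delta scheme compresses them well. In the float scaling example, 2.26 comes back as 2.5, which shows the lossy nature of the filter.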
TileDB does not provide transactional support, as it is a storage engine. It only guarantees atomic reads and writes. TileDB also supports data versioning, which is not MVCC, but can provide some of the functionality of MVCC. Support for data versioning within TileDB is built into the file format. The TileDB file format stores an array write as a separate fragment, which includes timestamp information. With this information, it is possible to read an array that has writes only within a specified time interval.
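Reading within a time interval can be pictured with a small pure-Python model. This is a sketch under assumptions, not TileDB's reader: each fragment is modeled as a hypothetical `(timestamp, writes)` pair, and later fragments are assumed to overwrite earlier ones for the same cell.

```python
def read_array(fragments, t_start, t_end):
    """Merge the fragments whose timestamp lies in [t_start, t_end]."""
    cells = {}
    # Apply fragments in timestamp order so newer writes win (an assumption).
    for timestamp, writes in sorted(fragments, key=lambda f: f[0]):
        if t_start <= timestamp <= t_end:
            cells.update(writes)
    return cells


# Three writes to the same array, each stored as its own timestamped fragment.
fragments = [
    (1, {(0, 0): "a", (0, 1): "b"}),
    (2, {(0, 1): "c"}),
    (3, {(1, 1): "d"}),
]

as_of_t2 = read_array(fragments, 0, 2)   # state after the first two writes
only_late = read_array(fragments, 2, 3)  # only the writes made at t=2 and t=3
```

Because no fragment is modified after it is written, every earlier state of the array remains readable by choosing a different interval.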
TileDB's data model supports the storage of both dense and sparse arrays.
The data model of TileDB arrays supports any number of dimensions. The set of dimensions of an array is called its domain. For dense arrays, the dimension types must be uniform, and they must all be integer types, datetime types, or time types, which are all stored internally as integers. TileDB restricts dense arrays to integer-typed dimensions so that coordinates can be defined implicitly. For sparse arrays, the dimension types in a domain can be heterogeneous (e.g. float or string), and coordinates are stored explicitly.
An array element ("cell") is defined by a unique set of dimension values or coordinates. In dense arrays, all cells must store exactly one value. In sparse arrays, cells can be empty, store one value, or store multiple values. Each logical cell contains the data from the defined attributes in the array schema. Attributes can have heterogeneous types for both sparse and dense arrays.
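The difference between implicit and explicit coordinates can be illustrated with a toy model in plain Python. The names and layout here are hypothetical and chosen only for illustration. They do not reflect TileDB's on-disk format.

```python
# Dense 2 x 3 array with integer dimensions: no coordinates are stored.
# The position of a value in the flat list implies its (row, col).
N_ROWS, N_COLS = 2, 3
dense_values = [10, 11, 12, 20, 21, 22]  # row-major, exactly one value per cell


def dense_read(row, col):
    """Compute the storage position from the coordinates."""
    return dense_values[row * N_COLS + col]


# Sparse array with heterogeneous dimensions (float, string): coordinates
# are stored explicitly, and a cell may hold zero, one, or several values.
sparse_cells = {
    (1.5, "x"): [7],
    (2.25, "y"): [8, 9],
}


def sparse_read(coords):
    """Look up the explicitly stored coordinates; empty cells return []."""
    return sparse_cells.get(coords, [])
```

The position arithmetic in `dense_read` only works for integer coordinates, which matches the restriction on dense dimension types described above.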
TileDB uses an R-tree as an index to implement sparse array slicing. On array write, TileDB builds an R-tree index on the non-empty cells of the sparse array. To do this, it groups the coordinates of the non-empty cells into minimum bounding rectangles, then recursively groups these rectangles into a tree structure. On read, TileDB determines which minimum bounding rectangles overlap the query coordinates. Then, it uses parallel processing to collect these rectangles, decompress them, individually check the coordinates of the data collected, and retrieve the attribute data that matches the query.
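The read path described above can be sketched in plain Python. This simplified model builds only the leaf level of the index (one minimum bounding rectangle per tile) and scans the rectangles linearly, whereas a real R-tree groups them recursively. The function names and the `capacity` parameter are assumptions made for illustration.

```python
def build_mbrs(coords, capacity):
    """Group sorted coordinates into tiles and compute each tile's bounding box."""
    ordered = sorted(coords)
    tiles = [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]
    mbrs = []
    for tile in tiles:
        ndim = len(tile[0])
        lo = tuple(min(c[d] for c in tile) for d in range(ndim))
        hi = tuple(max(c[d] for c in tile) for d in range(ndim))
        mbrs.append((lo, hi, tile))
    return mbrs


def overlaps(lo, hi, q_lo, q_hi):
    """True if the rectangle [lo, hi] intersects the query [q_lo, q_hi]."""
    return all(l <= qh and ql <= h for l, h, ql, qh in zip(lo, hi, q_lo, q_hi))


def slice_sparse(mbrs, q_lo, q_hi):
    """Return the coordinates that fall inside the query rectangle."""
    result = []
    for lo, hi, tile in mbrs:
        if not overlaps(lo, hi, q_lo, q_hi):
            continue  # the whole tile is skipped without inspecting its cells
        for c in tile:
            # An overlapping rectangle may still hold cells outside the query,
            # so each coordinate is checked individually.
            if all(ql <= x <= qh for x, ql, qh in zip(c, q_lo, q_hi)):
                result.append(c)
    return result


coords = [(1, 1), (1, 4), (2, 2), (5, 5), (6, 1), (6, 6)]
mbrs = build_mbrs(coords, capacity=2)
hits = slice_sparse(mbrs, q_lo=(1, 1), q_hi=(2, 3))
```

In this example the third rectangle does not overlap the query, so its tile is never touched. The first two overlap but each contains one cell outside the query, which the per-coordinate check filters out.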
TileDB uses intra-operator parallel execution for both read and write queries. The main operations that TileDB parallelizes are I/O (reading and writing) and tile filtering/unfiltering. I/O is parallelized per attribute, and within each attribute per data tile. Tile filtering and unfiltering are parallelized per attribute, within each attribute per data tile, and within each data tile per filter chunk. The chunk size is a parameter that defaults to 64KB.
This parallelism is implemented via static thread pools. TileDB uses both a compute task thread pool and an I/O task thread pool to help parallelize execution. It includes two thread pools to ensure that I/O tasks do not overload CPU-bound tasks during execution.
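The following Python sketch models this structure with two `ThreadPoolExecutor` instances, one for compute tasks and one for I/O tasks. It is an analogy rather than TileDB's implementation: `zlib.compress` stands in for a filter, the `sink` dictionary stands in for storage, and the function names and pool sizes are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # filter chunk size, 64KB as in the text above


def split_chunks(tile):
    """Cut one data tile into filter chunks."""
    return [tile[i:i + CHUNK_SIZE] for i in range(0, len(tile), CHUNK_SIZE)]


def filter_tiles(attributes, compute_pool):
    """Filter per attribute, per data tile, per chunk on the compute pool."""
    futures = {
        name: [[compute_pool.submit(zlib.compress, chunk)
                for chunk in split_chunks(tile)]
               for tile in tiles]
        for name, tiles in attributes.items()
    }
    return {
        name: [[f.result() for f in tile_futures] for tile_futures in tiles]
        for name, tiles in futures.items()
    }


def write_tiles(filtered, io_pool, sink):
    """Write per attribute, per data tile on the I/O pool."""
    def write(name, index, chunks):
        sink[(name, index)] = chunks  # stand-in for a real storage write

    futures = [io_pool.submit(write, name, i, chunks)
               for name, tiles in filtered.items()
               for i, chunks in enumerate(tiles)]
    for f in futures:
        f.result()  # wait for completion and surface any exception


# Two attributes; "a1" has two data tiles, "a2" has one.
attributes = {"a1": [bytes(100_000), bytes(10)], "a2": [b"xyz" * 50_000]}
sink = {}
with ThreadPoolExecutor(max_workers=4) as compute_pool, \
        ThreadPoolExecutor(max_workers=2) as io_pool:
    write_tiles(filter_tiles(attributes, compute_pool), io_pool, sink)
```

Keeping the pools separate means that a burst of slow writes can occupy only the I/O workers, leaving the compute workers free to keep filtering.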
TileDB has direct APIs in the following languages: C, C++, C#, Python, Java, R, and Go. It is also possible to run SQL queries on TileDB arrays. To expose array operators through SQL, TileDB Inc. provides MyTile, a MariaDB storage engine that uses embedded TileDB. This storage engine can be connected to a MariaDB instance, which then runs SQL queries against TileDB arrays.
There are several ways to run SQL on TileDB arrays. One is TileDB-SQL-Py, a Python package that allows users to run SQL queries from a Python environment. Alternatively, the MariaDB client REPL, the TileDB-Presto connector, or the TileDB-Trino connector can be used to run SQL queries directly.
By default, TileDB uses a disk-oriented storage architecture (POSIX filesystem or HDFS). TileDB also supports data storage on object stores such as AWS S3, Azure Blob Storage, Google Cloud Storage, and MinIO. It can also be configured to store data in memory via a RAM backend.
TileDB supports interoperation functionality with Apache Arrow.
TileDB's main storage format is a multi-file format that stores the array schema, fragments, consolidated fragment metadata, commits, and the array metadata. The array schema directory contains multiple files, each labelled with a timestamp. TileDB supports array schema modification, meaning that attributes can be added or dropped after the array has been written to, so the timestamp label is needed to read data at different times using the appropriate schema. The fragments are timestamped writes to TileDB arrays, and each fragment has its own directory. This directory stores the attribute and dimension data along with the fragment metadata, a file that contains important information about the fragment, such as the name of its array schema and index information. The consolidated fragment metadata file contains the footers of all the fragment metadata files. It is stored as a read query optimization: when reading an array that has many fragments, retrieving the fragment metadata footer from each fragment individually can be time-consuming. The commit files mainly serve as indicators that fragment creation was successful. Lastly, the array metadata files store user-defined key-value pairs that can be accessed by querying a TileDB array.
Decomposition Storage Model (Columnar)
TileDB uses a decomposition storage model (DSM) to store attribute data. This attribute data is stored on disk in global order, which is determined first by tile order and then by cell order. Each dimension of a TileDB array comes with a tile extent. Together, the tile extents determine the size of a tile and effectively group the data into smaller blocks. Tile order orders these blocks, and cell order orders the cells within a tile. Each of the two orders can be either row-major or column-major.
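As a worked example, the global-order position of a cell in a 2D dense array can be computed as follows. This is a minimal sketch that assumes zero-based coordinates, row-major tile order, row-major cell order, and tile extents that evenly divide the domain. The function name and parameters are hypothetical.

```python
def global_position(row, col, n_cols, tile_rows, tile_cols):
    """Position of cell (row, col) in global order (row-major tiles and cells)."""
    tiles_per_row = n_cols // tile_cols  # assumes extents divide the domain evenly
    # Tile order: which tile the cell falls in, counting tiles row by row.
    tile_index = (row // tile_rows) * tiles_per_row + (col // tile_cols)
    # Cell order: where the cell sits inside its tile, counting row by row.
    cell_index = (row % tile_rows) * tile_cols + (col % tile_cols)
    return tile_index * tile_rows * tile_cols + cell_index


# A 4 x 4 array with 2 x 2 tile extents: list all cells in global order.
cells = [(r, c) for r in range(4) for c in range(4)]
in_global_order = sorted(cells, key=lambda rc: global_position(*rc, 4, 2, 2))
```

The first four cells in `in_global_order` are the complete top-left tile, so the values of one tile end up contiguous on disk rather than interleaved with those of neighboring tiles.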