IoTDB is a specialized database management system for time series data generated by networks of IoT devices with low computational power. It targets workloads with high-frequency writes, large-volume storage, and complex analytical queries. IoTDB supports the queries common in monitoring and collecting metrics from IoT devices: filtering by predicate, querying by time range, grouped aggregation, and data sampling. Data in IoTDB is stored in TsFile, a file format designed for accessing, compressing, and storing time series data. Storage is organized in an LSM-based structure to favor write throughput.
IoTDB provides Java and Python APIs as well as a command-line interface. It also integrates with data analysis systems such as Spark, Hadoop, Hive, and Grafana.
IoTDB was started in 2017 by Prof. Jianmin Wang’s group in the School of Software at Tsinghua University and China’s National Engineering Laboratory for Big Data Software. The project entered incubation at the Apache Software Foundation in November 2018.
The project evolved from an earlier project by the same group called TsFile. TsFile is a columnar storage format optimized for storing time series data. IoTDB uses TsFile as its underlying storage format.
IoTDB has no explicit checkpointing operation; instead, crash recovery relies on continuously recycling the WAL. IoTDB makes assumptions about the workload and access pattern of queries: writes are mostly single-record inserts, updates to device metrics are rare, and there are no multi-query transactions.
Every write to the database goes to an in-memory buffer called a memtable. Each memtable corresponds to one WAL file. Once a memtable is persisted to disk, its WAL file is deleted. This is effectively a checkpoint: the logs that still exist are exactly the logs the database must replay when recovering from a crash, and the deleted logs belong to operations that have already been committed.
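The lifecycle described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not IoTDB's implementation: the class and method names (MemTableWalSketch, insert, flush, recover) are hypothetical, the WAL is modeled as an in-memory list of strings, and "flushing" copies the buffer into a list instead of writing a TsFile.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustrative only: hypothetical names, not IoTDB's actual classes. */
final class MemTableWalSketch {
    /** One in-memory buffer paired with the log of writes applied to it. */
    static final class MemTable {
        final TreeMap<Long, Double> points = new TreeMap<>(); // timestamp -> value
        final List<String> wal = new ArrayList<>();            // stand-in for a WAL file
    }

    private MemTable active = new MemTable();
    private final List<TreeMap<Long, Double>> flushedFiles = new ArrayList<>();

    /** Log the write first, then apply it to the memtable. */
    void insert(long timestamp, double value) {
        active.wal.add(timestamp + "," + value);
        active.points.put(timestamp, value);
    }

    /** Persist the memtable, then drop its WAL: the implicit checkpoint. */
    void flush() {
        flushedFiles.add(new TreeMap<>(active.points)); // stand-in for writing a file
        active = new MemTable();                        // the old WAL is discarded with it
    }

    /** After a crash, only the surviving WAL entries need to be replayed. */
    static TreeMap<Long, Double> recover(List<String> survivingWal) {
        TreeMap<Long, Double> rebuilt = new TreeMap<>();
        for (String entry : survivingWal) {
            String[] parts = entry.split(",");
            rebuilt.put(Long.parseLong(parts[0]), Double.parseDouble(parts[1]));
        }
        return rebuilt;
    }

    /** Copy of the log entries that have not yet been made redundant by a flush. */
    List<String> pendingWal() { return new ArrayList<>(active.wal); }

    int flushedCount() { return flushedFiles.size(); }
}
```

In this sketch, writes made before a flush never appear in the log that recovery replays, which is why deleting the WAL at flush time plays the role of a checkpoint.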
Delta Encoding, Run-Length Encoding, Naïve (Record-Level)
Encoding
IoTDB uses different encoding methods for different data types.
Suitable for sequences of integer values and low-precision floating-point values that are monotonic.
Default encoding for time series data.
Suitable for sequences that increase at a fixed interval, such as time series.
Suitable for floating-point values with small variance.
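To make the general ideas behind delta encoding and run-length encoding concrete, here is a minimal Java sketch. It is not IoTDB's encoder code: the class and method names are hypothetical, and it only shows why a sequence that increases at a fixed interval becomes highly compressible once differences are taken.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not IoTDB's encoder implementations. */
final class DeltaRleSketch {
    /** Keep the first value, then store each value's difference from its predecessor. */
    static long[] deltaEncode(long[] values) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (i == 0) ? values[0] : values[i] - values[i - 1];
        }
        return out;
    }

    /** Rebuild the original values with a running sum over the deltas. */
    static long[] deltaDecode(long[] deltas) {
        long[] out = new long[deltas.length];
        for (int i = 0; i < deltas.length; i++) {
            out[i] = (i == 0) ? deltas[0] : out[i - 1] + deltas[i];
        }
        return out;
    }

    /** Collapse runs of equal values into {value, runLength} pairs. */
    static List<long[]> runLengthEncode(long[] values) {
        List<long[]> runs = new ArrayList<>();
        for (long v : values) {
            if (!runs.isEmpty() && runs.get(runs.size() - 1)[0] == v) {
                runs.get(runs.size() - 1)[1]++;   // extend the current run
            } else {
                runs.add(new long[] {v, 1});      // start a new run
            }
        }
        return runs;
    }
}
```

For example, the timestamps 1000, 1010, 1020, 1030 delta-encode to 1000, 10, 10, 10, and run-length encoding those deltas leaves two pairs: (1000, 1) and (10, 3).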
Compression
After encoding, data is converted to a binary stream, which is then compressed with SNAPPY.
Because IoTDB does not support transactions, it has a bare-bones concurrency control implementation based on read locks and write locks. The implementation does not follow the Two-Phase Locking protocol: there are cases where a lock is acquired after another lock has been released within the same function; an example is included in citation 1 of this section. IoTDB uses Java's native ReentrantReadWriteLock in the implementation.
To avoid access conflicts when concurrently reading or writing a user or role, IoTDB implements HashLock for the user manager and the role manager. A HashLock is a wrapper around a fixed number of ReentrantReadWriteLock locks. By default, it is initialized with an array of 100 ReentrantReadWriteLock locks. Each applicable database object maps to one lock in the array according to the object's hash value. This avoids conflicts from concurrent access to the same database object (a user or role in this case) while limiting the resources needed to manage those locks.
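The lock-striping idea behind HashLock can be sketched as follows. This is a minimal illustration written from the description above, not a copy of IoTDB's source: the class name HashLockSketch and its method names are hypothetical, and the way a hash value is reduced to an array index (Math.floorMod) is an assumption.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative lock striping; not a copy of IoTDB's HashLock source. */
final class HashLockSketch {
    private static final int DEFAULT_LOCK_COUNT = 100; // default size stated in the text
    private final ReentrantReadWriteLock[] locks;

    HashLockSketch() {
        this(DEFAULT_LOCK_COUNT);
    }

    HashLockSketch(int lockCount) {
        locks = new ReentrantReadWriteLock[lockCount];
        for (int i = 0; i < lockCount; i++) {
            locks[i] = new ReentrantReadWriteLock();
        }
    }

    /** Map an object to a slot; floorMod keeps the index non-negative (assumed scheme). */
    int slotFor(Object key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    void readLock(Object key)    { locks[slotFor(key)].readLock().lock(); }
    void readUnlock(Object key)  { locks[slotFor(key)].readLock().unlock(); }
    void writeLock(Object key)   { locks[slotFor(key)].writeLock().lock(); }
    void writeUnlock(Object key) { locks[slotFor(key)].writeLock().unlock(); }
}
```

A caller would wrap each access in a lock/unlock pair keyed by the user or role name, typically with the unlock in a finally block. Two different names can hash to the same slot and therefore contend with each other; that is the price of keeping the number of locks fixed.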
Storage groups and time series in IoTDB can be created with any number of prefixes, which determine the depth of the node in the hierarchy tree and the path to its storage location. With similar prefixes, time series from one storage group can be written continuously to the same file. The use of prefixes in time series names also allows users to issue coarser-grained queries on data from a certain level of the hierarchy.
IoTDB uses the KV-Match index for pattern-matching queries and PISA for aggregation queries.
IoTDB provides a customized SQL-like query language with its own naming conventions for database objects. A storage group can be likened to a table, but it can be expressed in a tree hierarchy with prefixes in the path. A time series can be likened to a column; in IoTDB, it is usually an attribute of a device and contains a sequence of pairs of timestamps and corresponding values.
IoTDB uses ANTLR 4 to translate query statements into logical plan operators and then into physical plan operators.
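A small Java sketch can illustrate how prefixes give both a depth in the tree and a way to select everything beneath a level. The names here are hypothetical, and the dot-separated path format (for example root.plant1.turbine1.speed) is an assumption made for the example; the sketch does not reflect how IoTDB stores or resolves paths internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: dot-separated paths are an assumed format. */
final class PathPrefixSketch {
    /** Depth of a node = number of dot-separated segments in its path. */
    static int depth(String path) {
        return path.split("\\.").length;
    }

    /** True when path is the prefix node itself or lies beneath it in the tree. */
    static boolean isUnder(String prefix, String path) {
        return path.equals(prefix) || path.startsWith(prefix + ".");
    }

    /** Coarser-grained selection: every series at or below the given prefix. */
    static List<String> select(String prefix, List<String> allSeries) {
        List<String> matches = new ArrayList<>();
        for (String series : allSeries) {
            if (isUnder(prefix, series)) {
                matches.add(series);
            }
        }
        return matches;
    }
}
```

Matching on the prefix plus a separator, not on the bare prefix, keeps a hypothetical root.plant1 from also matching root.plant10.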
Decomposition Storage Model (Columnar)
IoTDB's underlying TsFile is a columnar storage file format. It is similar to CarbonData and Parquet but designed for time series data.
Overall, IoTDB follows a client-server architecture. The IoTDB client resides on the sensors (IoT devices) of the system, handling data collection and sending data to the IoTDB server. A client can sync its collected data with the server at a user-configured interval using the Sync Tool; this lets data collected by the sensor be continuously persisted on the server, where it can then be used for native queries or shipped to other open-source platforms for data analysis. IoTDB currently supports only single-node server deployment; the group is working on support for a shared-nothing cluster. IoTDB also currently supports writing to HDFS.
https://github.com/apache/incubator-iotdb
https://iotdb.incubator.apache.org/#/Documents
Tsinghua University
2017