Dictionary Encoding, Delta Encoding, Run-Length Encoding
TDengine is a column-oriented database, and it can apply different compression algorithms (delta-of-delta encoding, the Simple-8B method, zig-zag encoding, LZ4, and others) to different data types.
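To make the idea concrete, here is a minimal Python sketch of delta-of-delta encoding combined with zig-zag mapping, the kind of scheme used for near-regular timestamps. The function names and 64-bit width are illustrative assumptions, not TDengine's actual implementation.

```python
def zigzag_encode(n: int) -> int:
    # Interleave sign so small magnitudes map to small non-negative
    # integers: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
    return (n << 1) ^ (n >> 63)

def zigzag_decode(z: int) -> int:
    return (z >> 1) ^ -(z & 1)

def delta_of_delta_encode(timestamps):
    """Keep the first value; then store second-order differences,
    zig-zag mapped. Regular intervals encode as runs of zeros."""
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for ts in timestamps[1:]:
        delta = ts - prev
        out.append(zigzag_encode(delta - prev_delta))
        prev, prev_delta = ts, delta
    return out

def delta_of_delta_decode(encoded):
    ts, delta = encoded[0], 0
    out = [ts]
    for z in encoded[1:]:
        delta += zigzag_decode(z)
        ts += delta
        out.append(ts)
    return out

# Nearly regular 1-second interval: the encoded stream is mostly zeros,
# which a bit-packer such as Simple-8B can then store very compactly.
ts = [1000, 2000, 3000, 4005, 5010]
enc = delta_of_delta_encode(ts)    # [1000, 2000, 0, 10, 0]
```

The payoff is that monotonically increasing timestamps collapse into tiny integers, which downstream bit-packing can exploit.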
TDengine adopts a relational data model, but it requires the application to create a table for each data collection point (a time series; in some cases, a unique device) to improve the data ingestion rate, query speed, and data compression ratio.
To support aggregation over multiple data collection points (tables), TDengine provides a "Super Table" (STable), a template for a category of data collection points. Besides the schema for metrics, it also includes a schema for tags (the static attributes of a time series). When creating a table, the application uses an STable as the template and assigns a value to each tag; in other words, each table is associated with a set of tag values. For aggregation, the tags serve as a filter that selects the set of tables to be scanned by the query, which improves aggregation performance significantly.
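The relationship between an STable, its child tables, and tag-based filtering can be sketched in plain Python. This is an invented in-memory model for illustration only; the class names, schemas, and sample tables (`d1001` etc.) are assumptions, not TDengine's API.

```python
# Hypothetical sketch: one shared schema template (the STable), many
# per-device tables, each carrying its own static tag values.
class STable:
    def __init__(self, metric_schema, tag_schema):
        self.metric_schema = metric_schema   # e.g. ("ts", "current", "voltage")
        self.tag_schema = tag_schema         # e.g. ("location", "group_id")
        self.tables = []

    def create_table(self, name, tag_values):
        # A child table is the STable template plus concrete tag values.
        t = {"name": name,
             "tags": dict(zip(self.tag_schema, tag_values)),
             "rows": []}
        self.tables.append(t)
        return t

    def tables_matching(self, **tag_filter):
        # Tags act as a filter selecting which tables a query must scan.
        return [t for t in self.tables
                if all(t["tags"].get(k) == v for k, v in tag_filter.items())]

meters = STable(("ts", "current", "voltage"), ("location", "group_id"))
meters.create_table("d1001", ("SanFrancisco", 2))
meters.create_table("d1002", ("LosAngeles", 3))
meters.create_table("d1003", ("SanFrancisco", 3))

# An aggregation over SanFrancisco only needs to scan d1001 and d1003.
hits = meters.tables_matching(location="SanFrancisco")
```

Because the tag filter prunes entire tables before any metric data is touched, the cost of an aggregation scales with the matching subset rather than with every data collection point.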
Skip List, Block Range Index (BRIN)
TDengine builds an index on the timestamp automatically, but it does not provide indexes on other metrics. For tags, TDengine uses a skip list for indexing in the current release.
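A skip list keeps sorted keys searchable in expected O(log n) time by layering probabilistic "express lanes" over a linked list. The minimal version below is a generic sketch of the data structure, not TDengine's code; the level cap and probability are arbitrary choices.

```python
import random

class Node:
    def __init__(self, key, value, level):
        self.key, self.value = key, value
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    """Minimal skip list mapping keys to values (illustrative sketch)."""
    MAX_LEVEL = 8

    def __init__(self):
        self.head = Node(None, None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # Each extra level is taken with probability 1/2.
        lvl = 1
        while lvl < self.MAX_LEVEL and random.random() < 0.5:
            lvl += 1
        return lvl

    def insert(self, key, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node            # last node before `key` on level i
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            # Ride the highest lane as far as possible, then drop down.
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node.value if node and node.key == key else None

sl = SkipList()
for key, val in [(5, "e"), (1, "a"), (3, "c"), (9, "i")]:
    sl.insert(key, val)
```

The same structure supports ordered insertion without rebalancing, which is why it also suits the in-memory write path described below.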
In memory, TDengine uses row-wise storage, but on disk, it uses column-wise storage.
In memory, TDengine keeps the data in the same format as the WAL (write-ahead log), but builds a skip list over it so each row can be accessed quickly. On disk, the data is partitioned by time range, so expired data can be removed easily when the time-to-live is reached, and multi-tier storage can be supported.
TDengine partitions data across virtual nodes, in two dimensions: data collection point and time.
In a TDengine cluster there are multiple nodes, and each node can host one or more virtual nodes (vnodes). Each vnode stores the data of a certain number of data collection points, and the data of a single data collection point is always stored in exactly one vnode. If there are many data collection points, their data is therefore spread over multiple vnodes and multiple nodes. When data is ingested, the TDengine client writes directly to the corresponding vnode, which provides horizontal scaling of ingestion. Queries over a single data collection point scale the same way: the more nodes there are, the higher the overall throughput. For aggregation queries, the request is first sent to the relevant vnodes, each vnode performs a partial aggregation, and the client then merges the partial results from the vnodes in a second step. Because the number of vnodes is bounded, this final merge requires little computation on the client side, so aggregation queries also scale horizontally.
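The two-stage aggregation can be sketched in a few lines. The vnode layout, sample values, and function names are invented for the example; note that each vnode must return mergeable partials (sum and count), not a finished average.

```python
# Illustrative two-stage aggregation: partial aggregates per vnode,
# final merge on the client. Data and names are invented.
def vnode_partial_avg(rows):
    # Stage 1, inside each vnode: return (sum, count) so that partial
    # results from different vnodes can be combined correctly.
    return (sum(rows), len(rows))

def client_merge_avg(partials):
    # Stage 2, on the client: merge the per-vnode partials.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

vnodes = [[220, 221, 219], [218, 222], [220]]   # rows spread over 3 vnodes
partials = [vnode_partial_avg(rows) for rows in vnodes]
avg = client_merge_avg(partials)
```

Shipping `(sum, count)` instead of per-vnode averages is what keeps the merge exact; averaging averages would weight small vnodes too heavily.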
Partitioning: in addition to sharding the data, TDengine divides the time-series data stored in a vnode into time periods. The data for each time period is saved together, and data from different time periods never overlaps. A time period can be one day or several days, as defined by the user. Dividing time-series data by time period has several advantages. When querying data, the files to be searched can be located directly from the queried time range, which speeds up the query. Data retention policies can also be implemented efficiently: once data exceeds the configured retention period, the files for that time period can simply be deleted. In addition, segmenting data by time period makes it much easier to implement multi-tier storage and further reduce storage costs.
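The mechanics of time-period partitioning, range-pruned queries, and retention-by-deletion can be sketched as follows. The one-day period, seven-day retention, and all names are assumptions made for the example, not TDengine defaults.

```python
import datetime

# Sketch: rows are bucketed into fixed-length time periods, queries scan
# only the buckets overlapping their range, and retention drops whole
# buckets. Period length and data are invented for illustration.
PERIOD = datetime.timedelta(days=1)
EPOCH = datetime.datetime(1970, 1, 1)

def period_of(ts: datetime.datetime) -> int:
    return (ts - EPOCH) // PERIOD          # integer period number

partitions: dict[int, list] = {}

def ingest(ts, value):
    partitions.setdefault(period_of(ts), []).append((ts, value))

def query(start, end):
    # Only partitions overlapping [start, end) are ever touched.
    rows = []
    for p in range(period_of(start), period_of(end) + 1):
        rows += [r for r in partitions.get(p, []) if start <= r[0] < end]
    return rows

def enforce_retention(now, keep=datetime.timedelta(days=7)):
    # Expired data is removed by deleting whole partitions, with no
    # row-by-row scan.
    cutoff = period_of(now - keep)
    for p in list(partitions):
        if p < cutoff:
            del partitions[p]

ingest(datetime.datetime(2024, 1, 1, 12), 1.0)
ingest(datetime.datetime(2024, 1, 10, 12), 2.0)
enforce_retention(now=datetime.datetime(2024, 1, 12))   # drops the Jan 1 bucket
```

Deleting a whole partition file is O(1) regardless of how many rows it holds, which is the efficiency argument behind retention by time period.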
TDengine also provides high availability through virtual node groups. Vnodes on different nodes can form a virtual node group, and the data within the group is synchronized through a master-slave scheme to keep it consistent. Data writes can only be performed on the master, but queries can be served by both the master and the slaves. If the master node fails, the system automatically elects a new master and continues to provide service, ensuring high availability.
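A toy model of the write path, read path, and failover in a vnode group is given below. All class and method names are invented, and real replication (log shipping, quorum, leader election protocol) is reduced to in-memory list appends; this is only a sketch of the roles described above.

```python
# Toy sketch of a virtual node group: writes go through the master and
# are replicated to slaves; reads can hit any live replica; a failed
# master is replaced by a promoted replica. Names are illustrative.
class Vnode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.alive = True
        self.data = []

class VnodeGroup:
    def __init__(self, vnodes):
        self.vnodes = vnodes
        self.master = vnodes[0]

    def write(self, row):
        if not self.master.alive:
            self.elect_master()        # failover before accepting writes
        # The master accepts the write, then it is replicated to the
        # live slaves so every replica stays consistent.
        for v in self.vnodes:
            if v.alive:
                v.data.append(row)

    def read(self):
        # Queries may be served by the master or any live slave.
        replica = next(v for v in self.vnodes if v.alive)
        return replica.data

    def elect_master(self):
        # Promote the first live replica and keep serving.
        self.master = next(v for v in self.vnodes if v.alive)

group = VnodeGroup([Vnode(1), Vnode(2), Vnode(3)])
group.write("row-1")
group.vnodes[0].alive = False          # master fails
group.write("row-2")                   # election happens, service continues
```

Even in this reduced form, the key property survives: a client keeps writing and reading through the group while membership changes underneath it.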