InfiniDB is a column-store DBMS optimized for OLAP workloads. It has a distributed architecture to support Massively Parallel Processing (MPP). It uses MySQL as its front end, so users familiar with MySQL can migrate to InfiniDB quickly and can connect to it using any MySQL connector.
Because of InfiniDB's columnar storage model and the range-partitioned storage of each column, each table is already partitioned both column-wise and row-wise. InfiniDB therefore does not store materialized views, which saves space and reduces maintenance difficulty. However, it still supports virtual views to stay consistent with MySQL syntax.
Tuple-at-a-Time Model, Vectorized Model
InfiniDB can be configured to run in different operation modes. In "generic" mode, query execution uses the tuple-at-a-time processing model. In "distributed" mode (the default), every job step of the execution plan has an input data list and an output data list. Each data list defines an iterator with a next method. Meanwhile, every job step maintains an input row group and an output row group.
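The difference between the two processing models can be illustrated with a minimal sketch. This is not InfiniDB's code: the names `tuple_at_a_time`, `batches`, and `filter_step` are hypothetical, and a Python list stands in for a row group.

```python
from typing import Callable, Iterator, List, Tuple

Row = Tuple[int, int]
RowGroup = List[Row]  # a batch of rows handed from one job step to the next


def tuple_at_a_time(rows: Iterator[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    """Tuple-at-a-time filter: each step of the iterator yields one row."""
    for row in rows:
        if pred(row):
            yield row


def batches(rows: List[Row], size: int) -> Iterator[RowGroup]:
    """Hypothetical input data list: iterating it yields row groups."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def filter_step(data_list: Iterator[RowGroup], pred: Callable[[Row], bool]) -> Iterator[RowGroup]:
    """Job step: consume input row groups and emit non-empty output row groups."""
    for group in data_list:
        out = [row for row in group if pred(row)]
        if out:
            yield out


def is_even(row: Row) -> bool:
    return row[0] % 2 == 0


rows = [(i, i * 10) for i in range(10)]
one_by_one = list(tuple_at_a_time(iter(rows), is_even))
in_groups = list(filter_step(batches(rows, 4), is_even))
```

Both variants select the same rows. The batched variant pays the per-call overhead once per row group rather than once per row, which is the usual motivation for vectorized execution.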
Multi-version Concurrency Control (MVCC), Two-Phase Locking (Deadlock Detection)
InfiniDB uses MVCC for concurrency control. It uses the term System Change Number (SCN) to identify a version of the system. Its Block Resolution Manager (BRM) manages multiple versions with three structures: the Version Buffer, the Version Substitution Structure (VSS), and the Version Buffer Block Manager. InfiniDB applies deadlock detection to resolve conflicts.
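As a rough illustration of how a change number can select the visible version of a block, consider the sketch below. It is a toy model under stated assumptions, not the BRM's actual design: the class `VersionedStore` and its methods are hypothetical, and the SCN is modeled as a simple counter that advances on every committed write.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VersionedStore:
    """Toy multi-version store keyed by block id (hypothetical, not InfiniDB's BRM)."""
    scn: int = 0  # current system change number
    versions: Dict[str, List[Tuple[int, str]]] = field(default_factory=dict)

    def commit_write(self, block: str, value: str) -> int:
        """Commit a new version of a block and advance the SCN."""
        self.scn += 1
        self.versions.setdefault(block, []).append((self.scn, value))
        return self.scn

    def read(self, block: str, snapshot_scn: int) -> Optional[str]:
        """Return the newest version whose SCN is <= snapshot_scn, if any."""
        visible = [value for scn, value in self.versions.get(block, []) if scn <= snapshot_scn]
        return visible[-1] if visible else None


store = VersionedStore()
store.commit_write("b1", "v1")      # committed at SCN 1
snapshot = store.scn                # a reader takes its snapshot here
store.commit_write("b1", "v2")      # committed at SCN 2, after the snapshot
old_view = store.read("b1", snapshot)
new_view = store.read("b1", store.scn)
```

The reader holding the earlier snapshot keeps seeing `v1` even after `v2` commits, so the later write never blocks it.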
Nested Loop Join, Hash Join, Sort-Merge Join, Semi Join
InfiniDB supports different operation modes. In generic mode, joins are processed by the mysqld process rather than by InfiniDB, and all joins are evaluated with a nested-loop pattern. In distributed mode, InfiniDB supports hash join and sort-merge join. It also automatically marks a join as a semi join when applicable.
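The join strategies named here can be compared with a small sketch. These are textbook versions of the algorithms written for illustration, not InfiniDB's or mysqld's implementations; the function names and sample rows are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (join key, payload)


def nested_loop_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Compare every left row with every right row."""
    return [(l[0], l[1], r[1]) for l in left for r in right if l[0] == r[0]]


def hash_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Build a hash table on one input, then probe it with the other."""
    table: Dict[int, List[Row]] = defaultdict(list)
    for r in right:  # build phase
        table[r[0]].append(r)
    # probe phase: look up each left key instead of scanning the right input
    return [(l[0], l[1], r[1]) for l in left for r in table.get(l[0], [])]


def semi_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Keep each left row at most once if any right row shares its key."""
    keys = {r[0] for r in right}
    return [l for l in left if l[0] in keys]


orders = [(1, "a"), (2, "b"), (3, "c")]
customers = [(1, "x"), (3, "y"), (3, "z")]

nl_result = nested_loop_join(orders, customers)
hj_result = hash_join(orders, customers)
sj_result = semi_join(orders, customers)
```

The nested-loop and hash joins return the same rows, but the hash join avoids rescanning the right input for every left row. The semi join returns only left rows and does not duplicate key 3 even though it matches twice.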
Decomposition Storage Model (Columnar)
Because InfiniDB is an analytic database optimized for OLAP workloads, a columnar storage model is the better fit. It reduces I/O for selective queries, since only the columns a query references need to be fetched.
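A minimal sketch of why this reduces I/O, using in-memory Python structures as a stand-in for on-disk layout (the table contents and the helper `to_columns` are hypothetical):

```python
from typing import Dict, List

# Row-oriented layout: each record keeps all of its fields together.
rows: List[Dict[str, object]] = [
    {"id": 1, "region": "EU", "amount": 10},
    {"id": 2, "region": "US", "amount": 25},
    {"id": 3, "region": "EU", "amount": 40},
]


def to_columns(records: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Decompose records into one list per column."""
    return {name: [rec[name] for rec in records] for name in records[0]}


columns = to_columns(rows)

# SELECT SUM(amount): the columnar layout touches only the "amount" column.
total = sum(columns["amount"])
values_read_columnar = len(columns["amount"])

# A row layout has to pull in every field of every record to reach "amount".
values_read_row_store = sum(len(rec) for rec in rows)
```

In this toy table the aggregate reads three values from the columnar layout, whereas scanning full rows would touch nine.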
InfiniDB adopts the Read Committed isolation level, which means that users can read only committed data. Every operation in InfiniDB runs on a snapshot of the system, which yields Read Committed Snapshot behavior. Read Committed Snapshot is still the Read Committed isolation level, but it guarantees that reads are not blocked by writes because each read works on a snapshot version.
InfiniDB records all DDL and DML statements in a transaction log, which is essentially command logging. Every 10 minutes, InfiniDB archives the transaction logs to a file named "data_mods.log.timestamp". The file name and location are configurable. After a system crash, those logs can be replayed together with the Version Buffer, which stores information about all uncommitted transactions.
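The idea behind command logging is that the log holds the operations themselves rather than the data they produced, so recovery re-executes them in order. The sketch below shows that idea only. The command format and the functions `apply` and `replay` are hypothetical, and the role of the Version Buffer is not modeled.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, str, int]  # (operation, key, value); a made-up format


def apply(state: Dict[str, int], cmd: Command) -> None:
    """Execute one logged command against the in-memory state."""
    op, key, value = cmd
    if op == "set":
        state[key] = value
    elif op == "add":
        state[key] = state.get(key, 0) + value
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(log: List[Command]) -> Dict[str, int]:
    """Rebuild state after a crash by re-running every command in log order."""
    state: Dict[str, int] = {}
    for cmd in log:
        apply(state, cmd)
    return state


log: List[Command] = [
    ("set", "a", 1),
    ("add", "a", 4),
    ("set", "b", 7),
    ("delete", "b", 0),
]
recovered = replay(log)
```

Because the log stores statements, it stays small, but replay has to redo the work of each statement, and the commands must be deterministic for the rebuilt state to match the original.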
InfiniDB is a columnar DBMS. For each column, InfiniDB applies range partitioning and stores the minimum and maximum value of each partition in a small structure called the Extent Map. The Extent Map is updated lazily: only when the first query runs after a data manipulation, rather than at the point of the manipulation itself. The columnar storage model and the range partitioning of each column already partition a table vertically and horizontally, so InfiniDB does not use indices.
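How per-partition minimum and maximum values can replace an index for range predicates is sketched below. This is an illustrative model, not InfiniDB's Extent Map format: the functions `build_extent_map` and `scan_range` are hypothetical, and the map is built up front for simplicity rather than lazily.

```python
from typing import List, Tuple


def build_extent_map(extents: List[List[int]]) -> List[Tuple[int, int]]:
    """Record (min, max) for each extent of a single column."""
    return [(min(values), max(values)) for values in extents]


def scan_range(
    extents: List[List[int]],
    extent_map: List[Tuple[int, int]],
    lo: int,
    hi: int,
) -> Tuple[List[int], int]:
    """Return values in [lo, hi] and the number of extents actually read."""
    hits: List[int] = []
    scanned = 0
    for values, (mn, mx) in zip(extents, extent_map):
        if mx < lo or mn > hi:
            continue  # the whole extent lies outside the range: skip it unread
        scanned += 1
        hits.extend(v for v in values if lo <= v <= hi)
    return hits, scanned


extents = [[1, 5, 9], [12, 15, 19], [21, 25, 30]]
extent_map = build_extent_map(extents)
hits, scanned = scan_range(extents, extent_map, 13, 22)
```

For the predicate `13 <= x <= 22`, the first extent is eliminated from its min/max alone and only two of the three extents are read. Pruning of this kind works best when values are clustered, so that extent ranges overlap little.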
https://github.com/infinidb/infinidb
2000
2014