InfiniDB is a column-store DBMS optimized for OLAP workloads. It has a distributed architecture that supports Massively Parallel Processing (MPP). It uses MySQL as its front end, so users familiar with MySQL can migrate to InfiniDB quickly and can connect to it using any MySQL connector.
InfiniDB uses MVCC for concurrency control. It uses the term System Change Number (SCN) to identify a version of the system. Its Block Resolution Manager (BRM) manages multiple versions with three structures: the Version Buffer, the Version Substitution Structure (VSS), and the Version Buffer Block Manager. InfiniDB uses deadlock detection to resolve conflicts.
InfiniDB stores all DDL and DML statements in a transaction log, which is essentially command logging. Every 10 minutes, InfiniDB archives the transaction log to a file named "data_mods.log.timestamp". Both the file name and the file location can be configured. After a system crash, these logs can be replayed together with the Version Buffer, which stores information about all uncommitted transactions.
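The idea of command logging can be illustrated with a minimal sketch. It is not InfiniDB's implementation: the CommandLog class is hypothetical, SQLite stands in for the storage engine, and the Version Buffer and uncommitted transactions are ignored. It only shows that replaying the logged statements in order rebuilds the same state.

```python
import sqlite3


class CommandLog:
    """Hypothetical command log: records each statement after it succeeds."""

    def __init__(self):
        self.entries = []

    def execute(self, conn, statement):
        # Run the statement first, then append it to the log.
        conn.execute(statement)
        self.entries.append(statement)

    def replay(self, conn):
        # Re-run every logged statement in its original order.
        for statement in self.entries:
            conn.execute(statement)


# SQLite is used here only as a stand-in engine for the sketch.
primary = sqlite3.connect(":memory:")
log = CommandLog()
log.execute(primary, "CREATE TABLE t (id INTEGER, name TEXT)")
log.execute(primary, "INSERT INTO t VALUES (1, 'a')")
log.execute(primary, "INSERT INTO t VALUES (2, 'b')")
log.execute(primary, "UPDATE t SET name = 'c' WHERE id = 2")

# Simulate recovery: replay the log into an empty database.
recovered = sqlite3.connect(":memory:")
log.replay(recovered)
rows = recovered.execute("SELECT id, name FROM t ORDER BY id").fetchall()
```

Because the log holds statements rather than changed pages, it stays small, but recovery has to re-execute every statement.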
Because of InfiniDB's columnar storage model and the range-partitioned storage of each column, each table is already partitioned both column-wise and row-wise. InfiniDB therefore does not store materialized views, which saves space and reduces maintenance effort. However, it still supports virtual views for consistency with MySQL syntax.
InfiniDB uses MySQL syntax, which is a SQL interface.
InfiniDB supports two architectures: shared-disk and shared-nothing. The administrator chooses one when setting up the system. In shared-nothing mode, some functions of InfiniDB are not available, including suspendDatabaseWrites and system. suspendDatabaseWrites is the first step of a data backup and must be called explicitly by the system administrator before backing up. system executes a shell command.
InfiniDB can be configured to run in different operation modes. In generic mode, query execution uses the tuple-at-a-time model. In distributed mode (the default), every job step of the execution plan has an input data list and an output data list. A data list defines an iterator and a next method. In addition, every job step maintains an input row group and an output row group. In the source code, the class TupleDeliveryStep has a virtual function nextBand, and each of its subclasses implements this method to deliver a batch of rows. Essentially, this is a vectorized model that processes several rows at a time.
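The difference between the two processing models can be sketched in a few lines. This is only an illustration under assumed names: next_band and band_size are hypothetical and loosely mirror the nextBand method mentioned above, not its actual C++ signature.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]


def tuple_at_a_time(rows: Iterable[Row]) -> Iterator[Row]:
    """Deliver one row per call, as in a tuple-at-a-time model."""
    for row in rows:
        yield row


def next_band(rows: Iterable[Row], band_size: int) -> Iterator[List[Row]]:
    """Deliver rows in batches of up to band_size (hypothetical parameter)."""
    band: List[Row] = []
    for row in rows:
        band.append(row)
        if len(band) == band_size:
            yield band
            band = []
    if band:
        # The final band may be smaller than band_size.
        yield band


rows = [(i, f"r{i}") for i in range(10)]
bands = list(next_band(rows, band_size=4))
band_sizes = [len(band) for band in bands]
```

With ten rows and a band size of four, the consumer makes three calls instead of ten, which is where a batch-oriented model saves per-row overhead.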
InfiniDB is a columnar DBMS. For each column, InfiniDB applies range partitioning and stores the minimum and maximum values of each partition in a small structure called the Extent Map. The Extent Map is updated only when the first query runs after a data manipulation, rather than at the point of the manipulation itself. Because the columnar storage model and the range partitioning of each column already partition a table vertically and horizontally, InfiniDB does not use indices.
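How per-partition minimum and maximum values can replace an index is shown in the sketch below. It is a simplified model, not InfiniDB's code: the Extent class, build_extents, and scan_range are hypothetical names, and a partition is just a fixed-size slice of an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Extent:
    """One partition of a column plus its min/max summary (hypothetical)."""
    values: List[int]
    min_value: int
    max_value: int


def build_extents(column: List[int], extent_size: int) -> List[Extent]:
    """Split a column into fixed-size partitions and record min/max of each."""
    extents = []
    for start in range(0, len(column), extent_size):
        chunk = column[start:start + extent_size]
        extents.append(Extent(chunk, min(chunk), max(chunk)))
    return extents


def scan_range(extents: List[Extent], low: int, high: int) -> Tuple[List[int], int]:
    """Return values in [low, high] and the number of extents actually read."""
    matches, scanned = [], 0
    for extent in extents:
        # Skip any extent whose value range cannot overlap the predicate.
        if extent.max_value < low or extent.min_value > high:
            continue
        scanned += 1
        matches.extend(v for v in extent.values if low <= v <= high)
    return matches, scanned


column = list(range(1, 101))          # values 1..100 in load order
extents = build_extents(column, 25)   # four extents: 1-25, 26-50, 51-75, 76-100
matches, scanned = scan_range(extents, 30, 40)
```

In this example only one of the four extents is read for the predicate 30 to 40. The benefit depends on the data being roughly ordered on the filtered column, since widely overlapping min/max ranges eliminate nothing.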
Since InfiniDB is an analytic database optimized for OLAP workloads, a columnar storage model is the better choice. It reduces I/O for selective queries, because only the relevant columns need to be fetched.
InfiniDB adopts the Read Committed isolation level, which means that users can read only committed data. Every operation in InfiniDB runs on a snapshot of the system, which yields Read Committed Snapshot behavior. Read Committed Snapshot is still the Read Committed isolation level, but it guarantees that reads are not blocked by writes, because each read works on a snapshot version.
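The snapshot behavior described above can be modeled with a toy multi-version store. This is a sketch under stated assumptions, not InfiniDB's internals: VersionedStore and its methods are hypothetical, and the scn counter only borrows the System Change Number terminology to tag committed versions.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Toy multi-version store; all names here are hypothetical."""
    scn: int = 0                                   # last committed change number
    versions: dict = field(default_factory=dict)   # key -> [(commit_scn, value)]

    def commit(self, writes: dict) -> int:
        """Publish a batch of writes under a new change number."""
        self.scn += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((self.scn, value))
        return self.scn

    def snapshot(self) -> int:
        """A statement captures the current change number when it starts."""
        return self.scn

    def read(self, key, snapshot_scn: int):
        """Return the newest value committed at or before snapshot_scn."""
        for commit_scn, value in reversed(self.versions.get(key, [])):
            if commit_scn <= snapshot_scn:
                return value
        return None


store = VersionedStore()
store.commit({"balance": 100})
snap = store.snapshot()            # statement A starts here
store.commit({"balance": 250})     # a concurrent writer commits afterwards
old = store.read("balance", snap)               # statement A still sees 100
new = store.read("balance", store.snapshot())   # a later statement sees 250
```

The reader never waits for the writer: it simply ignores versions newer than its snapshot, while a statement that starts later picks up the newly committed value.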
InfiniDB supports different operation modes. In generic mode, joins are processed by the mysqld process rather than by InfiniDB, and all joins are evaluated with the nested-loop pattern. In distributed mode, InfiniDB supports hash join and sort-merge join. It also automatically marks a join as a semi join when applicable.
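The join strategies named above can be compared with a small sketch. These are textbook versions written for illustration, not InfiniDB's or MySQL's code, and the function names and sample tables are assumptions.

```python
from collections import defaultdict


def nested_loop_join(left, right, key_l, key_r):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if l[key_l] == r[key_r]]


def hash_join(left, right, key_l, key_r):
    """Build a hash table on the right input, then probe it with the left."""
    table = defaultdict(list)
    for r in right:
        table[r[key_r]].append(r)
    return [(l, r) for l in left for r in table.get(l[key_l], [])]


def semi_join(left, right, key_l, key_r):
    """Keep each left row at most once if any right row matches its key."""
    keys = {r[key_r] for r in right}
    return [l for l in left if l[key_l] in keys]


# Hypothetical sample data: (order_id, customer) and (customer, country).
orders = [(1, "alice"), (2, "bob"), (3, "alice"), (4, "dave")]
customers = [("alice", "US"), ("bob", "DE"), ("carol", "FR")]

nl_result = nested_loop_join(orders, customers, 1, 0)
hash_result = hash_join(orders, customers, 1, 0)
semi_result = semi_join(orders, customers, 1, 0)
```

Both join functions return the same pairs, but the hash join reads each input once instead of rescanning the right side for every left row. The semi join returns only left rows and never duplicates them, which is why it suits predicates such as IN or EXISTS.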
InfiniDB uses MySQL as its front end and supports all MySQL syntax, including foreign keys.