Infobright is a column-oriented analytical DBMS engine for MySQL. Its architecture has three parts: Data Packs (DP), Data Pack Nodes (DPN), and Knowledge Nodes (KN). Together these parts form the basis of Infobright's knowledge grid. When data is loaded into a table, the rows are split into groups containing a fixed number of rows, and each group is then decomposed into a separate data pack for each column. As a result, the packs of every column cover the same rows, and because each pack holds values from a single column, it compresses better than row-oriented storage. The average data pack compression ratio is approximately 20:1, and Infobright can handle up to 50 TB of data for analytics applications. One notable property of Infobright is that it is geared toward data analysis rather than transaction processing: it does not support INSERT, UPDATE, or DELETE operations.
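The loading step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Infobright's implementation: the function name `build_data_packs`, the in-memory list layout, and the pack size of 4 rows are all assumptions chosen to keep the example small.

```python
from __future__ import annotations

from typing import Any

# Hypothetical pack size for illustration; the text only says the size is fixed.
ROWS_PER_PACK = 4


def build_data_packs(
    rows: list[dict[str, Any]], rows_per_pack: int = ROWS_PER_PACK
) -> dict[str, list[list[Any]]]:
    """Split rows into fixed-size groups, then store each column of each
    group as its own pack. Returns a mapping: column name -> list of packs."""
    if rows_per_pack <= 0:
        raise ValueError("rows_per_pack must be positive")
    packs: dict[str, list[list[Any]]] = {}
    for start in range(0, len(rows), rows_per_pack):
        group = rows[start:start + rows_per_pack]
        for column in group[0]:
            # One pack per column per row group; only the last group may be short.
            packs.setdefault(column, []).append([row[column] for row in group])
    return packs


rows = [{"id": i, "city": f"city{i % 3}"} for i in range(10)]
packs = build_data_packs(rows)
print(packs["id"])    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(packs["city"])  # three packs, aligned row-for-row with the "id" packs
```

Because every column is cut at the same row boundaries, pack number k of one column always describes the same rows as pack number k of any other column.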
Infobright was founded in 2005. It released the first open-source version of the DBMS in September 2008 and launched its community. In July 2016, Infobright moved away from its open-source community edition to focus on direct customers and original equipment manufacturers (OEMs). The company was bought by a holding company in 2017 and is now in maintenance mode.
Infobright supports the following APIs and language interfaces: ODBC, JDBC, C API, C++, Delphi, Eiffel, Java, Smalltalk, Lisp, REALbasic, PHP, Visual Basic, Ruby, Perl, and Python.
Infobright supports views but not materialized views. It also supports approximate queries, which reduce query time over very large data sets.
It supports ACID transactions with immediate consistency.
Because Infobright does not support modification of table data, it does not support logging.
The storage model for Infobright is the decomposition storage model (DSM). Because Infobright focuses on storing very large amounts of data and speeding up queries, a column orientation suits it better than a row orientation, for two reasons. First, unlike row-based storage, where each record mixes several data types, a column holds values of a single data type, which lets the compression algorithm be tuned to that type. In this way, Infobright achieves a market-leading compression ratio (from 10:1 to 40:1) and greatly reduces disk I/O. Second, most analytic queries touch only a subset of the columns, so a column-oriented DBMS can retrieve just the columns that are needed, which improves Infobright's query speed.
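The first reason above, that a single-typed column allows type-specific compression, can be illustrated with two textbook encodings. These are not Infobright's actual algorithms, which the text does not describe; delta encoding, run-length encoding, and all function names here are assumptions picked for the sketch.

```python
from __future__ import annotations

from itertools import groupby


def delta_encode(values: list[int]) -> list[int]:
    """Integer column: keep the first value, then store each difference
    from the previous value (small numbers for sorted or sequential data)."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original integers by accumulating the differences."""
    values, total = [], 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def run_length_encode(values: list[str]) -> list[tuple[str, int]]:
    """String column with repeats: store (value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def run_length_decode(runs: list[tuple[str, int]]) -> list[str]:
    """Expand (value, run length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


order_ids = [1000, 1001, 1002, 1005]
countries = ["CA", "CA", "CA", "US", "US"]
print(delta_encode(order_ids))       # [1000, 1, 1, 3]
print(run_length_encode(countries))  # [('CA', 3), ('US', 2)]
```

Neither encoding would work well on a row that interleaves integers and strings, which is the point the paragraph makes about row-based storage.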
Infobright supports stored procedures. They are written in its own stored procedure language, which follows MySQL's ANSI-92 SQL syntax. When defining a stored procedure, use the DELIMITER keyword to change the statement delimiter before the definition, and change it back once the definition is finished. Sample code for an Infobright stored procedure is available at "https://drive.google.com/open?id=0B1fwCLZ9xWQtYzh2ZVVDekV5NDg". The sample procedure converts a date string in 'YYYYMMDD' format to a string in 'YYYY-MM-DD' format.
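The conversion that the sample procedure is described as performing can be sketched in Python for readers who cannot open the link. This is not the stored procedure's code; the function name `format_compact_date` and the choice to reject invalid dates are assumptions made for the illustration.

```python
from datetime import datetime


def format_compact_date(text: str) -> str:
    """Convert a 'YYYYMMDD' string to 'YYYY-MM-DD'.

    Raises ValueError if the input is not eight digits or is not a real date.
    """
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected 8 digits in YYYYMMDD form, got {text!r}")
    # strptime validates the calendar date; isoformat always zero-pads.
    return datetime.strptime(text, "%Y%m%d").date().isoformat()


print(format_compact_date("20170315"))  # 2017-03-15
```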
Infobright does not support explicit indexes. Instead, the Data Pack Nodes (DPNs) and the knowledge grid serve as substitutes for indexes. Each DPN contains statistics (such as max, min, and sum) derived from the tuples stored in its data pack. The knowledge grid stores more advanced information (such as dependencies between multiple tables or multiple columns) and helps locate the needed data packs while decompressing as little data as possible. For example, suppose a query looks for rows in which the value of a certain column falls within a specific range. Infobright's optimizer classifies data packs into three types: relevant packs (every row satisfies the condition), irrelevant packs (no row satisfies it), and suspect packs (some rows may satisfy it). The query does not need to decompress relevant or irrelevant packs to evaluate the condition; it only needs to decompress suspect packs and examine their rows. In this way, the DPNs act like an index. The knowledge grid also acts like an index because it records relationships between tables: for a join, the DBMS first uses the DPN information of both tables to find the related data packs, and then uses knowledge nodes to relate those packs to each other. Together, the DPNs and the knowledge grid remove the need to maintain an index.
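The range-query example can be made concrete with a small sketch. It assumes a DPN that stores only min, max, and row count for an integer column with no NULLs; the class and function names (`DataPackNode`, `classify`, `count_in_range`) are hypothetical and do not come from Infobright.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class PackStatus(Enum):
    RELEVANT = "relevant"      # every row satisfies the condition
    IRRELEVANT = "irrelevant"  # no row can satisfy the condition
    SUSPECT = "suspect"        # some rows may satisfy the condition


@dataclass(frozen=True)
class DataPackNode:
    """Hypothetical DPN: summary statistics for one data pack."""
    min_value: int
    max_value: int
    row_count: int


def classify(node: DataPackNode, low: int, high: int) -> PackStatus:
    """Classify a pack for `low <= value <= high` using only its statistics."""
    if node.max_value < low or node.min_value > high:
        return PackStatus.IRRELEVANT
    if low <= node.min_value and node.max_value <= high:
        return PackStatus.RELEVANT
    return PackStatus.SUSPECT


def count_in_range(
    packs: list[tuple[DataPackNode, list[int]]], low: int, high: int
) -> tuple[int, int]:
    """Count matching rows; return (count, number of packs opened)."""
    total = opened = 0
    for node, values in packs:
        status = classify(node, low, high)
        if status is PackStatus.RELEVANT:
            total += node.row_count  # answered from the DPN alone
        elif status is PackStatus.SUSPECT:
            opened += 1              # stands in for decompressing the pack
            total += sum(low <= v <= high for v in values)
    return total, opened


raw = [[1, 5, 9], [10, 15, 20], [18, 25, 30], [40, 50, 60]]
packs = [(DataPackNode(min(p), max(p), len(p)), p) for p in raw]
print(count_in_range(packs, 10, 20))  # (4, 1)
```

Of the four packs, only the third straddles the boundary of the range, so it is the only one whose values are actually read; the other three are resolved from their statistics.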
Infobright is a disk-oriented DBMS, although it keeps the knowledge grid in memory. The knowledge grid is created automatically and records information about the data as data is loaded or as users execute queries. It is the key structure for query processing and helps improve Infobright's query speed. If RAM is still available, it can be used to hold uncompressed data packs, but most data packs and tuples are stored on disk, and Infobright does not keep all data in memory.
Infobright does not support foreign key constraints.
Infobright is a relational DBMS.
Infobright is a shared-nothing DBMS and does not rely on special hardware. It combines a columnar database with the knowledge grid to optimize analytics workloads (compressing, storing, and retrieving data).