Cubrid

View Current Viewing Revision #8 from 12/10/2018 2:46 p.m.

CUBRID is an open-source object-relational OLTP database management system under GPL/BSD license, developed and owned by NHN Corporation. It is optimized for web services with its multi-threaded, multi-server architecture, native borker middleware, Multiversion Concurrency Control, cost-based query optimizer and High Availability feature.

History

In 2005, CUBRID project was launched in South Korea, motivated by NHN's need to support its over 10,000 servers around the world, while third-party solutions incurred high license cost and development difficulties. In 2008, CUBRID launches its first stable release (v1.0, later renamed as 8.1.1) and became an open source project.

Data Model

Object-Relational

CUBRID is an object-relational database management system, that is, it is essentially similar to a relational database, but with an object oriented model. In terms of the object-relational features provided, CUBRID supports collection data types such as set, multiset and list, which allows columns to store multiple values of the same data type. Columns can also be of custom data types defined via classees, which supports class inheritance and multiple inheritances. Additionally, CUBRID tables can also be viewed as classes, and can inherit other tables.

Stored Procedures

Supported

CUBRID supports stored functions and procedures in Java to enable complicated logic infeasiable otherwise through SQL.

Indexes

B+Tree

CUBRID uses B-trees for indexes files. It supports indexes and unique indexes, and two types of special indexes called filtered index and function-based index. Filtered index is used for sorting, searching and operating a well-defined partials set for a table. Only a subset of data that meet the condition is indexed. Function-based index operates based on a combination of values using a specific function.

Query Execution

Materialized Model

CUBRID is an OLTP database system that uses materialized query processing model. After query parsing, CUBRID generates a query plan as a XASL (Extended Accesss Specification Language) tree, where each XASL node represents an operator that scans its data, evaluates it against predicates and outputs a result.

Storage Model

N-ary Storage Model (Row/Record)

As an object-relational, OLTP database, CUBRID uses N-ary storage model, and follows slotted page layout scheme. Large object which does not fit in a single page is placed in overflow pages.

Storage Organization

Heaps

CUBRID utilizes Heap File Organization. Upon insertion, objects are associated with an unique Object Identifier (OID) and placed in a heap.

System Architecture

Shared-Nothing

CUBRID has a 3-tier architecture, consisting of the database server in a multi-threaded client/server architecture, the broker middleware which replays the communication between the database server and external applications, and its API endpoints. Additionally, CUBRID can also be in standalone mode, where client and server processes are combined into one.

CUBRID offers High Availability (HA) feature, which synchronizes data among multiple servers to provide uninterrupted services when a software, hardware or network failure occurs in one of them. An HA group consists of master nodes as active servers, slave nodes as standby servers providing read services only, and replica nodes. When the master fails, replica node will no failover and the slave node with the highest priority becomes the new master. CUBRID HA is in shared-nothing architecture.

Storage Architecture

Disk-oriented

CUBRID is a disk-oriented database, memory is used for caching purposes. User can configure several disk-related parameters for database volumes and files, such as database volume size and log volume size.

Checkpoints

Non-Blocking

CUBRID checkpoint can be triggered by two conditions: periodically when a fixed time interval is up, or when the log page size has exceeded a certain value. Thresholds for both triggers can be configured.

Foreign Keys

Supported

CUBRID supports foreign key constraint, i.e. a set of columns that references the primary key in other tables. To maintain referential integrity, upon modification of the corresponding primary key, the foreign key can either be set to NULL or deleted. Foreign key constraint can also be set as such that its primary key is prevented from being modified at all, and any transaction attempting modification is rolled back.

Concurrency Control

Multi-version Concurrency Control (MVCC)

Since v10.0, CUBRID moved from two phase locking protocol to a Multiversion Concurrency Control (MVCC) protocol. Under MVCC, readers are not blocked from accessing data currently being modified by other transactions. This support for non-blocking reads is ideal for read intensive applications. Additionally, MVCC also allows point-in-time consistent view and true snapshot isolation. CUBRID resolves deadlocks via deadlock detection.

Query Interface

Custom API SQL Stored Procedures Command-line / Shell

CUBRID complies with SQL-92 standard and supports extended SQL syntaxes. CUBRID provides a GUI-based CUBRID Manager called CUBRID Manager, and a console-based CSQL Interpreter. It also offers APIs in various languages with corresponding driver, including JDBC Driver, CCI (C-based application) Driver, PHP/PDO (PHO Data objects) Driver, ODBC Driver, OLE DB (Object Linking and Embedding Database) Driver, ADO.NET Driver, Perl Driver, Python Driver, Ruby Driver and Node.js Driver.

Logging

Logical Logging Physical Logging

CUBRID implements both physical logging and logical logging, and is under Write Ahead Logging protolcol. It also supports group commit and asynchronous commit. Additionally, recovery process is using REDO/UNDO recovery protocol.

Query Compilation

Not Supported

CUBRID does not support query compilation.

Joins

Nested Loop Join Sort-Merge Join Index Nested Loop Join

CUBRID query optimizer selects the join algorithm for a candidate query. Three join algorithms options are sort merge join, nested loop join and index nested loop join. To view the join algorithm used for current query, user can set the optimization level such as optimization is turned on and query plan gets outputted.

Isolation Levels

Read Committed Serializable Snapshot Isolation Repeatable Read

CUBRID supports true snapshot isolation under the Multiversion Concurrency Control (MVCC) protocol. CUBRID provides 3 isolation levels: Serializable, Repeatable read, and Read committed, where Read committed is the default.

Views

Virtual Views

CUBRID supports creation of views via statement CREATE VIEW. The virtual tables does not exist physically, but can be updatable if the FROM clause only includes updateable table or view, and do not contain any of the following statement: DISTINCT, UNIQUE, GROUP BY, or any aggregation function. Additionally, columns containing path expressions or numeric columns with arithmetic operators cannot be updated even if the above criteria are met.