OceanBase

Viewing Revision #12 from 2019-11-08 23:35 View Current

OceanBase is a distributed, scalable, shared-nothing relational DBMS developed by Alibaba. It targets financial workloads that are demanding on performance, cost, and scalability and that require a database with high availability and strong consistency. It is designed and optimized for diverse OLTP applications on relational structured data, though its shared-nothing architecture also supports OLAP applications.[04]

Website: https://oceanbase.alipay.com[01]
Source Code: https://github.com/alibaba/oceanbase[02]
Tech Docs: https://oceanbase.alipay.com/docs[03]
Developer: Alibaba Group
Country of Origin: CN
Start Year: 2013
Project Type: Commercial
Written in: C++
Supported Languages: C, C++, Java
Compatible With: MySQL
Operating Systems: Hosted, Linux
License: GPL v2

History[05]

In 2010, OceanBase team leader Zhenkun Yang joined Alibaba. Concurrency in Alibaba's business was increasing and the development cycle for building a database for each new transaction workload was shrinking, so Yang concluded that the existing DBMSs could not support Alibaba's rapidly growing workloads. He decided to abandon the traditional DBMS framework and develop a new DBMS from scratch. At the very beginning, he set out three core principles for the new product: (1) distributed, (2) low cost, (3) high reliability.
In 2013, Alipay decided to abandon Oracle. Because MySQL could not ensure strong consistency between the active server and the standby server, OceanBase got its first opportunity. From then on, OceanBase was no longer open source.
From 2014 to 2016, the team spent three years developing OceanBase 1.0. It is claimed to be the first and only commercial DBMS that supports distributed transactions.
In 2017, OceanBase started to serve external customers.
In 2019, OceanBase surpassed Oracle to take first place in the TPC-C benchmark.

Checkpoints[06][07]

Blocking

OceanBase blocks the ObServer and takes snapshots when it executes a major or minor compaction, which happens during the off-peak period at night or when the MemTable is about to run out of memory.

Compression[08]

Dictionary Encoding, Delta Encoding, Run-Length Encoding, Prefix Compression

OceanBase uses column compression. It implements several encoding algorithms and automatically chooses the most suitable one for each column. It reportedly needs only about half as much space as MySQL.
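
The idea of choosing an encoding per column can be illustrated with a minimal Python sketch. It is not OceanBase's implementation: the function names, the byte-counting cost model, and the selection rule (smallest estimated size wins) are all assumptions made for illustration, and prefix compression is left out for brevity.

```python
from itertools import groupby


def dictionary_encode(values):
    """Store each distinct value once, then one small integer code per row."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return [dictionary, [code_of[value] for value in values]]


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, run_length] pairs."""
    return [[value, len(list(run))] for value, run in groupby(values)]


def delta_encode(values):
    """Store the first value, then the difference between neighbouring values."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def cell_bytes(cell):
    """Toy cost model (assumption): signed integers by magnitude, text by UTF-8 length."""
    if isinstance(cell, int):
        return (abs(cell).bit_length() + 8) // 8
    return len(str(cell).encode("utf-8"))


def estimated_size(encoded):
    """Sum the cost of every scalar cell in a possibly nested encoded column."""
    if isinstance(encoded, list):
        return sum(estimated_size(item) for item in encoded)
    return cell_bytes(encoded)


def choose_encoding(values):
    """Try every applicable encoding and keep the one with the smallest estimate."""
    values = list(values)
    candidates = {
        "raw": values,
        "dictionary": dictionary_encode(values),
        "run_length": run_length_encode(values),
    }
    if all(isinstance(value, int) for value in values):
        candidates["delta"] = delta_encode(values)  # only meaningful for numbers
    best = min(candidates, key=lambda name: estimated_size(candidates[name]))
    return best, candidates[best]
```

Under this toy cost model, a column of increasing timestamps favours delta encoding, a column with long runs of the same value favours run-length encoding, and a column with few distinct but interleaved values favours dictionary encoding.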

Concurrency Control[09]

Multi-version Concurrency Control (MVCC)

OceanBase uses MVCC for concurrency control. If an operation involves a single partition, or multiple partitions on one ObServer, it reads the snapshot of that ObServer. If an operation involves partitions on multiple ObServers, it executes a distributed snapshot read.
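
A snapshot read on a single node can be sketched as follows. This is a minimal illustration of the general MVCC technique, not OceanBase code: the class `VersionedStore`, its logical clock, and its method names are hypothetical. A distributed snapshot read would additionally need a timestamp that is consistent across ObServers, which this sketch does not model.

```python
class VersionedStore:
    """Keeps every committed version of a key, tagged with its commit timestamp."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit clock (assumption: a simple counter)

    def commit(self, writes):
        """Atomically install a set of writes under one new commit timestamp."""
        self._clock += 1
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """A snapshot is simply the latest commit timestamp visible right now."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None  # the key did not exist as of the snapshot


store = VersionedStore()
store.commit({"balance": 100})
snapshot_ts = store.snapshot()
store.commit({"balance": 80})  # a later writer does not disturb the older snapshot
```

Readers holding `snapshot_ts` keep seeing the balance of 100, while a new snapshot sees 80; readers never block writers because old versions are retained.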

Data Model[03]

Relational

OceanBase supports the relational data model, just like MySQL.

Foreign Keys[10]

Supported

OceanBase supports foreign keys to enforce data consistency.

Indexes[11]

B+Tree, Hash Table

For index structure, B+Tree is the only value accepted for the index type parameter when creating an index.
For index scope, because OceanBase splits tables into partitions, it supports local indexes on locally partitioned tables and global indexes on global tables.

Isolation Levels[12][01]

Read Committed, Serializable, Snapshot Isolation

OceanBase has supported read committed since version 1.0. Read committed is the default isolation level.
It has supported snapshot isolation since version 2.0.
It has supported serializable since version 2.2.

Joins[13]

Nested Loop Join, Hash Join, Sort-Merge Join

OceanBase currently supports three join algorithms: Nested Loop Join, Merge Join, and Hash Join.
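
The three algorithms compute the same equi-join and differ only in how they find matching rows. The sketch below shows textbook versions in Python over lists of dictionaries; the function names and the row representation are assumptions for illustration and say nothing about how OceanBase implements or chooses between them.

```python
def nested_loop_join(left, right, key):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it with the other."""
    table = {}
    for r in right:
        table.setdefault(r[key], []).append(r)
    return [(l, r) for l in left for r in table.get(l[key], [])]


def sort_merge_join(left, right, key):
    """Sort both inputs on the join key, then advance two cursors in step."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right rows sharing this key ...
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # ... and pair it with every left row that has the same key.
            while i < len(left) and left[i][key] == left_key:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out
```

A nested loop join needs no preparation and suits small inputs, a hash join pays for building the table once and then probes in constant time per row, and a merge join is attractive when the inputs are already sorted on the join key.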

Logging[14]

Logical Logging

OceanBase uses a logical redo log to record all modifications to the MemTable. It uses the Paxos consensus algorithm to synchronize log replicas across server nodes.
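
The sketch below illustrates only the majority rule at the heart of this scheme: a logical redo record counts as committed once more than half of the log replicas have stored it. It is deliberately not a full Paxos implementation (there are no ballots, leader election, or recovery of partially replicated records), and the `Replica` class, the record layout, and the `replicate` function are hypothetical names chosen for illustration.

```python
class Replica:
    """One log replica; an offline replica cannot persist new records."""

    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.log = []

    def append(self, record):
        if not self.online:
            return False
        self.log.append(record)
        return True


def replicate(record, replicas):
    """Send the record to every replica; commit only on a strict majority of acks."""
    acks = sum(1 for replica in replicas if replica.append(record))
    return acks > len(replicas) // 2


# A logical record describes the row-level change, not the bytes of a page.
redo_record = {"op": "update", "table": "account", "key": 42, "set": {"balance": 80}}

replicas = [Replica("zone1"), Replica("zone2"), Replica("zone3", online=False)]
committed = replicate(redo_record, replicas)  # two of three replicas acknowledge
```

With three replicas the log tolerates the loss of one; with two replicas offline the same call returns `False` and the modification must not be reported as committed.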

Parallel Execution[15]

Intra-Operator (Horizontal), Inter-Operator (Vertical)

OceanBase supports both vertical and horizontal parallelism.

Query Compilation[16]

Code Generation

OceanBase implements a code generator that translates the logical execution plan into a physical execution plan. It caches these plans to improve performance.

Query Execution

Tuple-at-a-Time Model

OceanBase uses the iterator model to execute queries.
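
In the iterator (tuple-at-a-time) model, every operator exposes `open`, `next`, and `close`, and each call to `next` pulls a single row from the operator's child. The Python sketch below shows the general pattern; the operator classes and the `execute` helper are hypothetical and are not taken from OceanBase.

```python
class Scan:
    """Leaf operator: hands out the rows of an in-memory table one at a time."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self._cursor = iter(self.rows)

    def next(self):
        return next(self._cursor, None)  # None signals end of input

    def close(self):
        self._cursor = None


class Filter:
    """Pulls rows from its child until one satisfies the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None

    def close(self):
        self.child.close()


class Project:
    """Keeps only the requested columns of each row it pulls."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def open(self):
        self.child.open()

    def next(self):
        row = self.child.next()
        return None if row is None else {c: row[c] for c in self.columns}

    def close(self):
        self.child.close()


def execute(plan):
    """Drive the root operator: open it, pull rows until exhausted, then close."""
    plan.open()
    try:
        while (row := plan.next()) is not None:
            yield row
    finally:
        plan.close()
```

A plan such as `Project(Filter(Scan(rows), predicate), ["id"])` never materializes an intermediate result: each row flows from the scan through the filter to the projection before the next row is requested.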

Query Interface[17]

SQL

OceanBase supports a standard SQL query interface, though with slight differences.
The detailed OceanBase SQL syntax reference is listed in the citations.

Storage Architecture[18][19][20]

Disk-oriented

OceanBase is a distributed disk-oriented DBMS.
From the perspective of storage management, OceanBase is divided into multiple Zones. Each Zone is a collection of physical server nodes called ObServers. Several Zones store replicas of the same data and synchronize logs using the Paxos distributed consensus algorithm. OceanBase also supports horizontal partitioning and automatically balances partition load across ObServers. Data files are stored in two kinds of blocks: the Macro Block (2 MB) is the smallest unit for write operations, and the Micro Block (16 KB before compression) is the smallest unit for read operations.
From the perspective of resource management, each database instance is treated as a tenant in OceanBase. Every tenant is allocated a unit pool containing units. Each unit is a group of computation and storage resources on an ObServer, and a tenant can have at most one unit per ObServer. Conceptually, a unit is a container for replicas.
OceanBase implements a block cache for Micro Blocks to accelerate large scan queries. It also implements a row cache, holding rows from the block cache, to accelerate small get queries.
The storage data structure of OceanBase is based on the LSM-tree, as used in LevelDB. A data modification is first recorded in the MemTable (dynamic data in memory) as a redo linked list whose head is linked to the corresponding block in the block cache. During the off-peak period at night, or when the size of the MemTable reaches a threshold, OceanBase merges the MemTable into the SSTable (static data on disk) using one of the following merge algorithms:
(1) Major Compaction: Read all the static data from disk, merge it with the dynamic data, and write the result back to disk as new static data. This is the most expensive algorithm and is typically used after a DDL operation.
(2) Minor Compaction: Reuse all Macro Blocks that have not been modified. For dirty Macro Blocks, directly copy the Micro Blocks that have not been modified. This is the default algorithm.
(3) Alternate Compaction: Zones that store replicas about to be merged block and merge in turn. While one Zone is merging, queries on the replica being merged are sent to other Zones that store that replica. The Zone also warms its cache after compaction. OceanBase adopts this algorithm when it has to merge data during the peak period. It is orthogonal to minor and major compaction and should be used in combination with one of them.
(4) Dump: Dump the MemTable to disk as a Minor SSTable and merge it with the previously dumped Minor SSTable. When the Minor SSTable grows large enough, merge it into the SSTable using one of the compaction algorithms above. This lightweight approach is used when the dynamic data is significantly smaller than the static data.
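
The interplay of MemTable, dump, and major compaction described above can be sketched with a toy LSM store. In-memory dictionaries stand in for on-disk files, the class `LsmStore`, its row-count threshold, and the tombstone marker are assumptions for illustration, and the Macro Block and Micro Block reuse of minor compaction is not modeled.

```python
class LsmStore:
    """Toy LSM-tree: a MemTable, one Minor SSTable, and one base SSTable."""

    TOMBSTONE = object()  # marks a deleted key until compaction drops it

    def __init__(self, memtable_limit=4):
        self.memtable = {}       # dynamic data, newest writes
        self.minor_sstable = {}  # result of earlier dumps
        self.base_sstable = {}   # static data written by major compaction
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.dump()

    def delete(self, key):
        self.put(key, self.TOMBSTONE)

    def get(self, key):
        """Search from the newest layer to the oldest; the first hit wins."""
        for layer in (self.memtable, self.minor_sstable, self.base_sstable):
            if key in layer:
                value = layer[key]
                return None if value is self.TOMBSTONE else value
        return None

    def dump(self):
        """Merge the MemTable into the previously dumped Minor SSTable."""
        merged = {**self.minor_sstable, **self.memtable}
        self.minor_sstable = dict(sorted(merged.items()))
        self.memtable = {}

    def major_compaction(self):
        """Rewrite all static data merged with all dynamic data."""
        self.dump()
        merged = {**self.base_sstable, **self.minor_sstable}
        self.base_sstable = {
            key: value
            for key, value in sorted(merged.items())
            if value is not self.TOMBSTONE
        }
        self.minor_sstable = {}
```

Reads pay for the layering by checking up to three places, which is why compaction periodically folds everything back into a single static layer.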

Storage Model[19]

Hybrid

Since version 2.0, OceanBase has supported a hybrid storage model. Attributes belonging to the same tuple are still stored in the same block, but the tuples within a block are compressed and stored in a columnar layout.
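
This layout resembles what the literature calls PAX: rows are grouped into blocks, and each block stores its rows column by column. The following Python sketch shows the idea under that assumption; the function names and the dictionary-of-lists block format are hypothetical and compression is omitted.

```python
def to_blocks(rows, columns, rows_per_block):
    """Group rows into blocks, then store each block column by column."""
    blocks = []
    for start in range(0, len(rows), rows_per_block):
        chunk = rows[start:start + rows_per_block]
        blocks.append({col: [row[col] for row in chunk] for col in columns})
    return blocks


def read_row(blocks, row_index, rows_per_block):
    """A whole tuple lives in one block, so a point read touches one block."""
    block = blocks[row_index // rows_per_block]
    offset = row_index % rows_per_block
    return {col: values[offset] for col, values in block.items()}


def scan_column(blocks, column):
    """A column scan reads one contiguous list per block and skips the rest."""
    for block in blocks:
        yield from block[column]


rows = [{"id": i, "city": "HZ" if i < 3 else "SH", "amount": 10 * i} for i in range(5)]
blocks = to_blocks(rows, ["id", "city", "amount"], rows_per_block=2)
```

Because the values of one column sit next to each other inside a block, they tend to be similar and compress well, while a point lookup still needs only a single block.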

Storage Organization

Heaps

Stored Procedures[21]

Supported

Since OceanBase 2.0, stored procedures written in SQL are supported.

System Architecture[22]

Shared-Nothing

OceanBase adopts a shared-nothing system architecture. It stores replicas of each piece of data on at least three ObServers in different Zones. Each server node has its own SQL engine and storage engine. The storage engine can only access the local data on that node, while the SQL engine can access the global schema and generate distributed query plans. Query executors visit the storage engine of each node to distribute and collect data among them to complete the query. For each database instance, one server node is designated the active root server; it provides the root service, which monitors the health of all the nodes related to that database and is responsible for load balancing, data consistency, error recovery, etc. If the active root server shuts down, OceanBase automatically promotes a standby root server to become the new active root server.

Views[23]

Materialized Views

OceanBase supports materialized views.
Its first business application, Taobao Favorites, was built using materialized views.

Citations

[01] https://oceanbase.alipay.com (dead link). Accessed: 2026-06-04
[02] GitHub - alibaba/oceanbase. github.com. Accessed: 2026-06-04
[03] https://oceanbase.alipay.com/docs (dead link). Accessed: 2026-05-27
[04] https://xblk.ecnu.edu.cn/EN/abstract/abstract25014.shtml (dead link). Accessed: 2026-06-01
[05] Ant Financial's Zhenkun Yang: How OceanBase Crossed the "Valley of Death" of Relational Databases. Alibaba Cloud Developer Community, aliyun.com. Accessed: 2026-06-01
[06] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/存储引擎/bbg4q0 (dead link). Accessed: 2026-05-27
[07] https://zhuanlan.zhihu.com/p/86186256 (dead link). Accessed: 2026-06-02
[08] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/存储引擎/gheu26 (dead link). Accessed: 2026-05-27
[09] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/分布式架构/xf9bc0 (dead link). Accessed: 2026-05-27
[10] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/模式与数据模型/ak4nzl (dead link). Accessed: 2026-05-27
[11] https://oceanbase.alipay.com/docs/oceanbase/OceanBase SQL参考/SQL语句/sr-ss-index (dead link). Accessed: 2026-05-27
[12] https://zhuanlan.zhihu.com/p/78402011 (dead link). Accessed: 2026-06-02
[13] https://oceanbase.alipay.com/docs/oceanbase/OceanBase SQL调优指南/连接与子查询/atv00s (dead link). Accessed: 2026-05-27
[14] https://oceanbase.alipay.com/docs/oceanbase/OceanBase管理员手册/OceanBase数据库对象/ege5ch (dead link). Accessed: 2026-05-27
[15] https://oceanbase.alipay.com/docs/oceanbase/OceanBase SQL调优指南/分布式执行计划/ave0y6 (dead link). Accessed: 2026-05-27
[16] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/SQL引擎/kq3y7y (dead link). Accessed: 2026-05-27
[17] https://oceanbase.alipay.com/docs/oceanbase/OceanBase SQL参考 (dead link). Accessed: 2026-05-27
[18] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/分布式架构/pmoq3h (dead link). Accessed: 2026-05-27
[19] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/存储引擎/uocyqs (dead link). Accessed: 2026-05-27
[20] Log-structured merge-tree - Wikipedia. wikipedia.org. Modified: 2026-03-26. Accessed: 2026-06-14
[21] https://zhuanlan.zhihu.com/p/48618697 (dead link). Accessed: 2026-06-02
[22] https://oceanbase.alipay.com/docs/oceanbase/OceanBase概览/分布式架构/rwrce8 (dead link). Accessed: 2026-05-27
[23] https://zhuanlan.zhihu.com/p/47609633 (dead link). Accessed: 2026-06-02

Revision #12 Last Updated: 2019-11-08 18:35