PolarDB

Viewing Revision #12 from 2022-06-29 14:25

PolarDB is a commercial, cloud-based relational database product developed by Alibaba. It is designed for clients with high read demand. PolarDB is compatible with two popular databases: MySQL and PostgreSQL. It has three layers: users interact with the database through the computing layer, PolarFS is a distributed file system, and PolarStorage serves as the storage layer. PolarDB uses InnoDB as its storage engine.[02]

Website: https://www.alibabacloud.com/en/product/polardb?_p_lc=1[01]
Developer: Alibaba Group
Country of Origin: CN
Start Year: 2017
Project Type: Commercial
Compatible With: MySQL, PostgreSQL
Operating System: Hosted
License: Proprietary

History[03]

PolarDB was first released in September 2017 and was officially commercialized in April 2018. At the same time, Alibaba Cloud gave a talk about PolarDB at the International Conference on Data Engineering (ICDE).

Checkpoints[04]

Non-Blocking

All modifications made before a checkpoint must already have been applied to the data chunks. Log entries for changes committed after a checkpoint may also appear in that checkpoint. During recovery, PolarDB chooses the newest checkpoint rather than the longest one.

Concurrency Control[05][04]

Multi-version Concurrency Control (MVCC)

The primary node (read-write) and the replica nodes (read-only) communicate through a message sender and an ack receiver. Each node has its own buffer pool. A replica updates itself at runtime by applying redo operations. PolarDB uses the difference between the written log sequence number (LSN) and the replica's applied LSN as the replica lag, which tracks the version the replica is holding.
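
The lag calculation described above can be sketched in a few lines. This is only an illustration of the arithmetic, not PolarDB's code: the `Replica` class, the `replica_lag` function, and the LSN values are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical read-only replica that tracks how far it has applied redo."""
    name: str
    applied_lsn: int = 0

    def apply_redo(self, up_to_lsn: int) -> None:
        # Applying redo only ever moves the replica forward.
        self.applied_lsn = max(self.applied_lsn, up_to_lsn)


def replica_lag(written_lsn: int, replica: Replica) -> int:
    """Lag = written LSN on the primary minus the replica's applied LSN."""
    return written_lsn - replica.applied_lsn


primary_written_lsn = 1_000  # assumed value, for illustration only
replica = Replica("replica-1")
replica.apply_redo(up_to_lsn=850)
lag = replica_lag(primary_written_lsn, replica)
print(lag)  # 150
```

A lag of zero would mean the replica has applied every log record the primary has written so far.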

PolarFS uses the ParallelRaft protocol to coordinate multiple data chunk servers. ParallelRaft is a consensus protocol derived from Raft, but it takes advantage of the database's tolerance for out-of-order I/O completion.

Data Model[04]

Relational

Hardware Acceleration[04]

RDMA

PolarDB uses remote direct memory access (RDMA) to connect storage nodes and compute nodes. This removes the network bottleneck for I/O performance between them.

Indexes[06][05]

B+Tree

As in MySQL, the B+Tree is the default index data structure, and it is ordered by primary key. As an optimization, PolarDB records the location of the last insertion to speed up the next one. During parallel query execution, which is a PolarDB feature, the B+Tree is partitioned across multiple workers. Each worker can see only its own partition. When a worker finishes one partition, it automatically attaches to a new one.
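
The "finish one partition, attach to the next" behavior can be sketched with a shared work queue. This is a minimal illustration under stated assumptions, not PolarDB's implementation: `parallel_scan` is a hypothetical function, and the partitions are plain Python key ranges standing in for B+Tree key ranges.

```python
import queue
import threading


def parallel_scan(partitions, num_workers):
    """Scan key-range partitions with workers that pull a new partition when idle."""
    pending = queue.Queue()
    for part in partitions:
        pending.put(part)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                # Attach to the next unclaimed partition, if any remain.
                part = pending.get_nowait()
            except queue.Empty:
                return
            # A worker only ever reads the keys of its current partition.
            local = list(part)
            with lock:
                results.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


# Keys of a hypothetical index, pre-split into four key ranges.
partitions = [range(0, 25), range(25, 50), range(50, 75), range(75, 100)]
scanned = parallel_scan(partitions, num_workers=2)
```

Having more partitions than workers is what lets a fast worker pick up extra partitions instead of sitting idle.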

Isolation Levels[05]

Repeatable Read

PolarDB maintains a read view, which is an array of the read-write operations active when a transaction starts. Replica nodes are read-only and therefore have no read-write operations of their own. The primary node sends an initial read view to each replica as part of the handshake, and the view is then updated during redo.
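
To show how a read view can yield repeatable reads, here is a simplified sketch. It assumes an InnoDB-style visibility rule keyed on transaction IDs, which the text above does not spell out; the `ReadView` class, its fields, and the IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    """Snapshot taken at transaction start (simplified, hypothetical)."""
    active_ids: frozenset  # read-write transactions still in progress
    next_id: int           # first transaction ID not yet assigned

    def is_visible(self, writer_id: int) -> bool:
        # Changes from transactions that started after the snapshot are hidden.
        if writer_id >= self.next_id:
            return False
        # Changes from transactions still running at snapshot time are hidden.
        return writer_id not in self.active_ids


view = ReadView(active_ids=frozenset({7, 9}), next_id=10)
```

Because the view is frozen when the transaction starts, repeating the same read later in the transaction applies the same visibility test and returns the same row versions.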

Joins[05]

Nested Loop Join

Each worker first scans and joins its own partition. PolarDB then merges the workers' join results and returns them to the client.
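
The scan, join, and merge steps can be sketched as follows. This is an illustrative sketch only: `nested_loop_join`, `parallel_join`, and the sample tables are hypothetical, and a thread pool stands in for PolarDB's parallel workers.

```python
from concurrent.futures import ThreadPoolExecutor


def nested_loop_join(outer_rows, inner_rows, key):
    """Plain nested loop join: compare every outer row with every inner row."""
    joined = []
    for o in outer_rows:
        for i in inner_rows:
            if o[key] == i[key]:
                joined.append({**o, **i})
    return joined


def parallel_join(outer_partitions, inner_rows, key):
    """Each worker joins its own outer partition; results are merged in order."""
    workers = max(1, len(outer_partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda p: nested_loop_join(p, inner_rows, key), outer_partitions
        )
        merged = []
        for part in parts:
            merged.extend(part)
    return merged


# Hypothetical tables: orders split into two partitions, customers unpartitioned.
orders = [
    [{"cid": 1, "order": "a"}, {"cid": 2, "order": "b"}],
    [{"cid": 1, "order": "c"}, {"cid": 3, "order": "d"}],
]
customers = [{"cid": 1, "name": "Ann"}, {"cid": 2, "name": "Bob"}]
result = parallel_join(orders, customers, "cid")
```

Order "d" has no matching customer, so it is absent from the merged result.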

Logging

Physical Logging

Query Interface

SQL

PolarDB supports the standard SQL query interface, as MySQL does, and adds parallel query as a feature. There are multiple ways to enable parallel query:

set max_parallel_degree = n

set force_parallel_mode = on

SELECT /*+ PARALLEL() */ * FROM ...

SELECT /*+ PARALLEL(n) */ * FROM ...

Storage Architecture[05][04]

Disk-oriented

Like MySQL, PolarDB is disk-oriented. The primary and replica nodes each have a buffer pool, and they can access data and logs in shared memory. The primary has the right to flush pages during normal operation. After redo during recovery, replicas flush pages to disk. The primary node, after receiving the read view from the new master, also writes pages to disk.

Storage Organization[05][04]

Log-structured

Each chunk server has a write-ahead log (WAL). Any modification to a chunk server is appended to the log before the chunk is updated. After the primary node (read-write) modifies pages, it sends the logs to shared memory, where replicas can access them. Replicas have log-apply threads that update their versions during redo.
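
The log-before-update ordering can be sketched with a toy chunk server. This is a minimal in-memory illustration, not PolarFS code: the `ChunkServer` class, its record layout, and the chunk names are assumptions made for the example, and nothing here is persisted to disk.

```python
class ChunkServer:
    """Hypothetical chunk server: log first, then update the chunk."""

    def __init__(self):
        self.wal = []      # append-only list of (lsn, chunk_id, offset, data)
        self.chunks = {}   # chunk_id -> bytearray
        self.next_lsn = 1

    def write(self, chunk_id, offset, data: bytes) -> int:
        lsn = self.next_lsn
        self.next_lsn += 1
        # 1. Append the change to the WAL before touching the chunk.
        self.wal.append((lsn, chunk_id, offset, data))
        # 2. Only then apply the change to the chunk itself.
        self._apply(self.chunks, chunk_id, offset, data)
        return lsn

    @staticmethod
    def _apply(chunks, chunk_id, offset, data):
        chunk = chunks.setdefault(chunk_id, bytearray())
        end = offset + len(data)
        if len(chunk) < end:
            chunk.extend(b"\x00" * (end - len(chunk)))
        chunk[offset:end] = data

    def replay(self):
        """Rebuild chunk contents from the log alone, as a redo pass would."""
        rebuilt = {}
        for _lsn, chunk_id, offset, data in self.wal:
            self._apply(rebuilt, chunk_id, offset, data)
        return rebuilt


server = ChunkServer()
server.write("chunk-0", 0, b"hello")
server.write("chunk-0", 5, b" world")
```

Because every change reaches the log first, replaying the log reproduces the chunk contents, which is the same property a replica relies on when it applies redo.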

System Architecture[04]

Shared-Disk

In its shared-disk architecture, PolarDB has multiple data chunk servers, each of which holds chunks of data. Each chunk server has its own stand-alone NVMe (non-volatile memory express) SSD. Compute nodes (database servers) read from and write to these disks via remote direct memory access (RDMA).

Citations

[01] PolarDB: Cloud-Native Relational Database - Alibaba Cloud. alibabacloud.com. Accessed: 2026-06-04.
[02] MySQL-Compatible Cloud-Native Hybrid Transactional/Analytical Database - PolarDB MySQL Enterprise Edition - Cloud-Native Database PolarDB - Alibaba Cloud (Aliyun). aliyun.com. Accessed: 2026-06-07.
[03] A Brief History of Development of Alibaba Cloud PolarDB - Alibaba Cloud Community. alibabacloud.com. Accessed: 2026-06-07.
[04] http://www.vldb.org/pvldb/vol11/p1849-cao.pdf. vldb.org. Modified: 2018-07-31. Accessed: 2026-06-07.
[05] Events Archive - Percona. percona.com. Modified: 2026-06-07. Accessed: 2026-06-07.
[06] (title not found). alibabacloud.com. Accessed: 2026-06-07.

Revision #12 Last Updated: 2022-06-29 10:25