Cloud BigTable

Viewing Revision #6 from 2022-07-01 12:22

Cloud BigTable is a distributed storage system used at Google that can be classified as a non-relational database system. BigTable is designed mainly for scalability and typically manages petabytes of data spread across thousands of machines. [03][04][05][01]

Website: https://cloud.google.com/bigtable[01]
Developer: Google LLC
Country of Origin: US
Start Year: 2005 [14]
Former Name: BigTable
Project Type: Commercial
Supported Languages: C++
Operating System: Linux
Wikipedia: https://en.wikipedia.org/wiki/Bigtable[02]

Little public information is available about the internals of BigTable, since it is proprietary to Google. The most authoritative source is the original paper[1]. Apache HBase[2] is an open source implementation based on that paper.

Google now offers BigTable as a cloud NoSQL database service[3], and the documentation for that service[4] is also a useful reference.

Derivative Systems

LevelDB

Heroic

Cloud Spanner

Embeddings

JanusGraph

History[06][07][08][05][09]

BigTable was among the early attempts Google made to manage big data. Jeffrey Dean and Sanjay Ghemawat were involved in it. It is one of the three components Google built for managing big data (the other two are Google File System[1] and MapReduce[2]).

These three components focus on different aspects of big data: Google File System is a reliable distributed file system that the other two build upon; MapReduce is a distributed data processing framework; BigTable is a distributed storage system.

All three projects are well known in the distributed systems community, and each has an open source implementation.[3][4][5]

Checkpoints[04]

In BigTable, SSTables are immutable and persisted in GFS, so only writes to the memtable generate log records. Although BigTable does not checkpoint explicitly, one mechanism has the same effect: when a memtable grows too large, the system compacts it and converts it into an SSTable[1]. Once that SSTable is written, the log records for those writes are no longer needed for recovery, which makes the compaction an effective checkpoint of the memtable.
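
The flush-as-checkpoint idea can be illustrated with a minimal Python sketch. This is a toy model, not BigTable's implementation: the `Tablet` class, the entry-count threshold, and the in-memory "log" and "SSTables" are all assumptions made for illustration.

```python
class Tablet:
    """Toy tablet: one mutable memtable, immutable SSTables, and a log."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical limit, in entries
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # immutable sorted runs, newest last
        self.log = []        # records not yet covered by any SSTable

    def write(self, key, value):
        self.log.append((key, value))   # log first, then apply in memory
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Everything logged so far now lives in an SSTable, so recovery
        # no longer needs these records: this is the checkpoint effect.
        self.log.clear()
```

In this sketch, recovery after a crash would only need to replay whatever remains in `log`, which is always bounded by the flush threshold.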

Concurrency Control[10]

Not Supported

BigTable only supports transactions on a single row[1]. It does not support transactions spanning multiple rows.
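
As a rough illustration of what single-row atomicity means for a client, the sketch below guards each row with its own lock and commits all changes to that row at once. The `SingleRowStore` class and its per-row locking are assumptions for illustration only; the source does not describe how BigTable implements this guarantee. There is deliberately no method that touches more than one row.

```python
import threading
from collections import defaultdict


class SingleRowStore:
    """Toy store offering atomic read-modify-write on one row at a time."""

    def __init__(self):
        self._rows = defaultdict(dict)
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, row_key):
        with self._guard:
            return self._locks[row_key]

    def read_modify_write(self, row_key, update):
        """Atomically apply update(row_dict) to a single row."""
        with self._lock_for(row_key):
            row = dict(self._rows[row_key])  # work on a private copy
            update(row)                      # if this raises, nothing is committed
            self._rows[row_key] = row        # commit all changes together

    def read_row(self, row_key):
        with self._lock_for(row_key):
            return dict(self._rows[row_key])
```

A counter increment expressed as `store.read_modify_write("r1", bump)` stays correct under concurrent callers, whereas an update that must change two rows together has no atomic equivalent in this model.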

Data Model[10]

Column Family / Wide-Column

BigTable does not support the relational data model. Instead, it lets users create column families in a table.

Each table usually contains a small number of column families, which should rarely change because changing them requires a metadata change. Within each column family there can be an unlimited number of columns. Users can freely add or delete columns in a column family, and deleting an entire column family is also supported.

BigTable does not have any type information associated with a given column. It only treats data as strings of bytes.
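
The column-family model can be sketched as nested maps. The `Table` class and its method names below are hypothetical, and the sketch omits the per-cell timestamps described in the original paper; it only shows that families are fixed schema, columns are free-form, and values are untyped bytes.

```python
class Table:
    """Toy wide-column table: row key -> family -> qualifier -> bytes."""

    def __init__(self, families):
        self.families = set(families)  # schema-level metadata, rarely changed
        self.rows = {}

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if not isinstance(value, bytes):
            raise TypeError("values are uninterpreted strings of bytes")
        # Columns need no declaration: any qualifier is accepted.
        self.rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def delete_column(self, row_key, family, qualifier):
        self.rows.get(row_key, {}).get(family, {}).pop(qualifier, None)

    def drop_family(self, family):
        # Removing a family is a metadata change that affects every row.
        self.families.discard(family)
        for row in self.rows.values():
            row.pop(family, None)
```

For example, `Table(["anchor", "contents"])` accepts `put("com.example", "anchor", "any-new-column", b"x")` without any prior column definition, but rejects a write to an undeclared family.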

Indexes

Not Supported

Joins

Not Supported

Logging[04]

Physical Logging

BigTable uses physical logging. For performance reasons, all tablets on a tablet server write to the same log file[1].
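
A shared log means that recovering one tablet requires picking its records out of a file that interleaves writes from many tablets. The sketch below shows that idea under simplifying assumptions: the `TabletServerLog` class, its record layout, and the in-memory list standing in for a log file are all hypothetical.

```python
class TabletServerLog:
    """One append-only log shared by every tablet on a toy tablet server."""

    def __init__(self):
        self.records = []
        self._next_seq = 0

    def append(self, tablet_id, row_key, value):
        seq = self._next_seq
        self._next_seq += 1
        # Physical logging: the record carries the bytes that were written,
        # not a description of the operation that produced them.
        self.records.append((seq, tablet_id, row_key, value))
        return seq

    def replay(self, tablet_id):
        """Rebuild one tablet's memtable by filtering the shared log."""
        memtable = {}
        for _seq, tid, row_key, value in self.records:
            if tid == tablet_id:
                memtable[row_key] = value  # later records overwrite earlier ones
        return memtable
```

The trade-off visible here is that appends are cheap because there is a single file, while recovery has to skip records belonging to other tablets.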

Query Compilation

Not Supported

Query Interface[11]

Custom API

BigTable provides clients with the following APIs:

1. Look Up (read a single row)
2. Scan (read a subset of rows)
3. Write
4. Delete
5. Customized scripts (written in the Sawzall language)
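
The first four operations can be sketched against an in-memory sorted map. The `ToyClient` class and its method names are hypothetical and are not the real BigTable or Cloud Bigtable client API; server-side scripts are left out. The sketch assumes a scan selects a half-open range of row keys.

```python
import bisect


class ToyClient:
    """Toy stand-in for a lookup / scan / write / delete interface."""

    def __init__(self):
        self._keys = []   # row keys, kept sorted so range scans are cheap
        self._rows = {}   # row key -> {column: bytes}

    def write(self, row_key, column, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def lookup(self, row_key):
        """Read a single row; returns an empty dict if the row is absent."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_key, end_key):
        """Yield (row_key, row) for start_key <= row_key < end_key, in order."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, end_key)
        for key in self._keys[lo:hi]:
            yield key, dict(self._rows[key])

    def delete(self, row_key):
        if self._rows.pop(row_key, None) is not None:
            self._keys.pop(bisect.bisect_left(self._keys, row_key))
```

Keeping row keys sorted is what makes `scan` a contiguous slice rather than a filter over every row.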

Storage Architecture[12]

Disk-oriented

BigTable assumes an underlying reliable distributed file system (in this case, Google File System). Tablets are stored in Google File System, which is a disk-oriented file system. The most recently written records are held in the memtable, which resides in memory, but most of the data is stored on disk.

Storage Model[13][04]

Custom

In BigTable, a table is split into multiple tablets, each of which holds a contiguous range of rows[1]. The tablet is the unit of data distribution and load balancing, and different tablets of a table may be assigned to different tablet servers. Each tablet is stored as a log-structured merge tree[2], whose in-memory and on-disk components BigTable calls the memtable and SSTables.
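
Because each tablet covers a contiguous row range, finding the tablet (and server) for a row key reduces to a search over the range boundaries. The sketch below assumes a hypothetical `TabletMap` built from split keys and server names; it does not model how BigTable actually stores or looks up tablet locations.

```python
import bisect


class TabletMap:
    """Toy routing table from row key to the tablet server holding it."""

    def __init__(self, split_keys, servers):
        # split_keys[i] is the first row key of tablet i + 1, so N split
        # keys define N + 1 tablets covering the whole key space.
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server entry per tablet")
        self.split_keys = sorted(split_keys)
        self.servers = list(servers)

    def tablet_index(self, row_key):
        # Number of split keys <= row_key is the index of the owning tablet.
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        return self.servers[self.tablet_index(row_key)]
```

With split keys `["g", "p"]`, keys below `"g"` fall in tablet 0, keys from `"g"` up to but excluding `"p"` in tablet 1, and the rest in tablet 2. Rebalancing then amounts to changing an entry in `servers` without moving any boundary.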

Furthermore, BigTable allows clients to create locality groups[3]. A locality group is a set of column families in a table. BigTable creates a separate SSTable for each locality group, which improves read performance for that group because a read touches only the SSTables that hold the requested data.
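
The effect of locality groups on storage layout can be sketched as a flush that partitions cells by group. The function name, the cell-key layout, and the assumption that groups are defined as a mapping from column family to group name are all hypothetical choices made for this illustration.

```python
def flush_by_locality_group(memtable, group_of):
    """Split one memtable into one sorted SSTable per locality group.

    memtable: {(row_key, family, qualifier): bytes}
    group_of: {family: locality group name}
    """
    sstables = {}
    for cell_key, value in sorted(memtable.items()):
        _row_key, family, _qualifier = cell_key
        group = group_of[family]
        sstables.setdefault(group, []).append((cell_key, value))
    return sstables
```

If small metadata families and a large page-contents family are placed in different groups, a read of the metadata only scans the metadata SSTable and never loads the page contents.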

Stored Procedures

Not Supported

Views

Not Supported

Citations

[01] Bigtable: fast, flexible NoSQL | Google Cloud. google.com. Accessed: 2026-07-15
[02] Bigtable - Wikipedia. wikipedia.org. Modified: 2026-01-29. Accessed: 2026-06-04
[03] Bigtable documentation | Google Cloud Documentation. google.com. Modified: 2026-06-11. Accessed: 2026-06-14
[04] http://static.googleusercontent.com/media/research.google.com/zh-CN//archive/bigtable-osdi06.pdf
[05] Apache HBase. apache.org. Modified: 2026-07-16. Accessed: 2026-07-16
[06] http://static.googleusercontent.com/media/research.google.com/zh-CN//archive/gfs-sosp2003.pdf
[07] http://static.googleusercontent.com/media/research.google.com/zh-CN//archive/mapreduce-osdi04.pdf
[08] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
[09] https://hadoop.apache.org Accessed: 2026-06-04
[10] http://static.googleusercontent.com/media/research.google.com/zh-CN/archive/bigtable-osdi06.pdf (section 2). Accessed: 2026-06-14
[11] http://static.googleusercontent.com/media/research.google.com/zh-CN/archive/bigtable-osdi06.pdf (section 3). Accessed: 2026-06-14
[12] http://static.googleusercontent.com/media/research.google.com/zh-CN/archive/bigtable-osdi06.pdf (section 5.3). Accessed: 2026-06-14
[13] https://doi.org/10.1007/s002360050048
[14] http://static.googleusercontent.com/media/research.google.com/zh-CN/archive/bigtable-osdi06.pdf (section 11). Accessed: 2026-06-14

Revision #6 Last Updated: 2022-07-01 08:22