HBase

Revision #3 from 2018-04-21 02:28

HBase is an open-source, distributed, non-relational, scalable big data store that runs on top of the Hadoop Distributed File System (HDFS). HBase is suitable for storing large quantities of data, but it lacks many features that relational database management systems usually have, such as column types, secondary indexes, and advanced query languages. HBase stores data in rows and columns. A row is referenced by a row key, and columns are grouped into "column families". HBase is written in Java and is supported by the Apache Software Foundation.[01]

Website: https://hbase.apache.org/[01]
Start Year: 2007 [21]
Supported Languages: Java
Operating System: All OS with Java VM

History[02]

HBase was initially a project by the company Powerset, a San Francisco-based search and natural language company. Microsoft acquired Powerset in 2008.

Checkpoints[03]

Non-Blocking

HBase snapshots let users create checkpoints of a table with little impact on RegionServers. Creating a snapshot does not block reads or writes, but only one snapshot per table can be created at a time.

Concurrency Control[04]

Multi-version Concurrency Control (MVCC)

HBase guarantees ACID semantics per row. It uses a form of Multiversion Concurrency Control (MVCC) so that read operations do not need row locks. Write operations must still acquire row locks.
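The idea of lock-free reads under MVCC can be illustrated with a minimal, single-threaded sketch. It assumes a simplified scheme in which each write receives an increasing write number and readers only see versions at or below a "read point" that advances when a write completes. The class `MvccSketch` and its methods are hypothetical names for illustration; this is not HBase's implementation, and it assumes writes complete in the order they began.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-row MVCC illustration; not HBase code.
class MvccSketch {
    private static final class Version {
        final long writeNumber;
        final String value;

        Version(long writeNumber, String value) {
            this.writeNumber = writeNumber;
            this.value = value;
        }
    }

    private final List<Version> versions = new ArrayList<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0; // highest write number visible to readers

    // A writer stores its new version immediately, but it is not yet visible.
    long beginWrite(String value) {
        long writeNumber = nextWriteNumber++;
        versions.add(new Version(writeNumber, value));
        return writeNumber;
    }

    // Completing the write advances the read point and publishes the version.
    void completeWrite(long writeNumber) {
        readPoint = Math.max(readPoint, writeNumber);
    }

    // A reader takes no lock: it skips versions newer than the read point.
    String read() {
        String visible = null;
        for (Version v : versions) {
            if (v.writeNumber <= readPoint) {
                visible = v.value;
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        MvccSketch row = new MvccSketch();
        long w = row.beginWrite("v1");
        System.out.println(row.read()); // write in flight, nothing visible yet
        row.completeWrite(w);
        System.out.println(row.read()); // now the completed write is visible
    }
}
```

In this sketch a reader never waits for an in-flight write; it simply sees the last completed version.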

Data Model[05]

Column Family / Wide-Column

An HBase table consists of rows and columns. Each row is referenced by a row key, which is a raw byte array, and rows are stored sorted by row key in byte order. Each row contains columns, and a column's content is also an uninterpreted array of bytes. Every column belongs to a column family.
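Byte-ordered sorting has a practical consequence: row keys sort by their raw bytes, not numerically. The following sketch models a table as a sorted map in plain Java to show this. The class `ByteOrderedTable` and the `family:qualifier` string encoding are illustrative assumptions, not the HBase client API or its storage format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of one table; not the HBase client API.
class ByteOrderedTable {
    // Compare row keys as unsigned bytes, left to right.
    private static final Comparator<byte[]> BYTE_ORDER = Arrays::compareUnsigned;

    // row key -> ("family:qualifier" -> uninterpreted cell bytes)
    private final TreeMap<byte[], TreeMap<String, byte[]>> rows = new TreeMap<>(BYTE_ORDER);

    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(bytes(rowKey), k -> new TreeMap<>())
            .put(family + ":" + qualifier, bytes(value));
    }

    // Row keys in the order a full scan of this model would visit them.
    List<String> rowKeysInScanOrder() {
        List<String> keys = new ArrayList<>();
        for (byte[] key : rows.keySet()) {
            keys.add(new String(key, StandardCharsets.UTF_8));
        }
        return keys;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteOrderedTable table = new ByteOrderedTable();
        table.put("row2", "info", "name", "b");
        table.put("row10", "info", "name", "c");
        table.put("row1", "info", "name", "a");
        System.out.println(table.rowKeysInScanOrder()); // prints [row1, row10, row2]
    }
}
```

Because "row10" sorts before "row2", applications that want numeric ordering commonly pad numbers to a fixed width (for example "row02", "row10") when building row keys.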

Foreign Keys[06]

HBase does not directly support referential integrity. Users can use a coprocessor to enforce foreign keys.

Indexes[07]

B+Tree

For each table, HBase provides only a B+Tree-like index on row keys. It does not natively support secondary indexes. Users can apply filters to query non-row-key columns, or maintain a separate table that serves as a secondary index.
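One way to picture the "separate table as secondary index" technique is a second sorted map keyed by the indexed column value followed by the data row key, so that a prefix scan finds all matching rows. The sketch below is a plain-Java illustration under that assumption. The class name, the `city` column, and the `\u0000` separator are hypothetical, and it assumes column values never contain the separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of an application-maintained index table; not HBase code.
class SecondaryIndexSketch {
    private static final char SEP = '\u0000';

    // Data table: row key -> value of the indexed column ("city").
    private final TreeMap<String, String> dataTable = new TreeMap<>();
    // Index table: (city + SEP + data row key) -> data row key.
    private final TreeMap<String, String> indexTable = new TreeMap<>();

    // Every write to the data table must also maintain the index table.
    void put(String rowKey, String city) {
        String previousCity = dataTable.put(rowKey, city);
        if (previousCity != null) {
            indexTable.remove(previousCity + SEP + rowKey); // drop stale index entry
        }
        indexTable.put(city + SEP + rowKey, rowKey);
    }

    // Prefix scan over the index table: all keys that start with city + SEP.
    List<String> rowKeysByCity(String city) {
        String start = city + SEP;
        String stop = city + (char) (SEP + 1);
        return new ArrayList<>(indexTable.subMap(start, stop).values());
    }

    public static void main(String[] args) {
        SecondaryIndexSketch tables = new SecondaryIndexSketch();
        tables.put("u1", "Oslo");
        tables.put("u2", "Paris");
        tables.put("u3", "Oslo");
        System.out.println(tables.rowKeysByCity("Oslo")); // prints [u1, u3]
    }
}
```

In a real deployment the data write and the index write touch different rows, so keeping them consistent is the application's responsibility, since ACID guarantees are per row.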

Isolation Levels[08][09]

Read Uncommitted, Read Committed

HBase provides only the "read committed" isolation level. Users can downgrade the isolation level to "read uncommitted" by modifying the source code.

Joins[10]

Not Supported

HBase does not support join operations. Users can implement joins in their application code.
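An application-side join usually means reading both tables and combining the rows in client code. The sketch below shows a simple inner hash join over two already-fetched result sets. The `orders` and `customers` maps stand in for scan results, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side inner join over two fetched result sets.
class ClientSideJoin {
    // orders: orderId -> customerId; customers: customerId -> customer name.
    static List<String> join(Map<String, String> orders, Map<String, String> customers) {
        // Iterate orders in key order, as a scan over sorted row keys would.
        Map<String, String> sortedOrders = new TreeMap<>(orders);
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, String> order : sortedOrders.entrySet()) {
            // Probe the customer side by key; unmatched orders are dropped.
            String customerName = customers.get(order.getValue());
            if (customerName != null) {
                joined.add(order.getKey() + ":" + customerName);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> orders = new HashMap<>();
        orders.put("o1", "c1");
        orders.put("o2", "c2");
        orders.put("o3", "c9"); // no matching customer
        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Ada");
        customers.put("c2", "Bob");
        System.out.println(join(orders, customers)); // prints [o1:Ada, o2:Bob]
    }
}
```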

Logging[11][12]

Logical Logging

HBase's write-ahead log (WAL) is named HLog.

Query Compilation

Not Supported

Query Execution[13]

Tuple-at-a-Time Model

Query Interface[14]

Custom API

HBase does not provide native support for SQL. Unlike an RDBMS, HBase has four primary data operations: Get, Put, Scan, and Delete. It also has some DDL operations, such as Create. HBase provides a shell from which users can issue queries, specifying the table name and column names and applying filters. HBase also offers a Java client API as well as Thrift and REST APIs. Some third-party drivers are also available for other programming languages.
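The semantics of the four primary operations can be sketched with an in-memory sorted map. This is a stand-in for illustration only: `FourOperationsSketch` and its method signatures are hypothetical and much simpler than the real Java client API, and the scan here assumes an inclusive start row and an exclusive stop row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for Get, Put, Scan and Delete; not the HBase API.
class FourOperationsSketch {
    // row key -> value of a single column, kept sorted by row key
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Put: insert or overwrite the value stored under a row key.
    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Get: point lookup by row key; returns null when the row does not exist.
    String get(String rowKey) {
        return rows.get(rowKey);
    }

    // Scan: row keys in sorted order within [startRow, stopRow).
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, stopRow).keySet());
    }

    // Delete: remove the row.
    void delete(String rowKey) {
        rows.remove(rowKey);
    }

    public static void main(String[] args) {
        FourOperationsSketch table = new FourOperationsSketch();
        table.put("a", "va");
        table.put("b", "vb");
        table.put("c", "vc");
        System.out.println(table.get("b"));       // prints vb
        System.out.println(table.scan("a", "c")); // prints [a, b]
        table.delete("b");
        System.out.println(table.scan("a", "d")); // prints [a, c]
    }
}
```

Note that every operation is addressed by row key or a row key range; anything else requires a filter applied during a scan.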

Storage Architecture[01]

Disk-oriented

HBase leverages HDFS as the backend storage. Currently HDFS is disk-oriented.

Storage Model[15][16]

Decomposition Storage Model (Columnar)

HBase is a schema-less, column-oriented data store.

Stored Procedures[17]

Not Supported

Stored procedures are not directly supported in HBase, but users can use coprocessors to emulate them.

System Architecture[18][19]

Shared-Nothing

HBase is organized as a cluster of nodes that follows a master-slave architecture. There are two types of nodes: a master node and one or more slave nodes called RegionServers. RegionServers serve data for reads and writes. An HBase Region is a subset of an HBase table that covers a contiguous range of sorted row keys. Region assignment and DDL operations are handled by the HBase Master. HBase is built on top of Hadoop: the HDFS DataNodes store the data that the RegionServers manage. ZooKeeper maintains the server status in the cluster.
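Because each Region covers a contiguous range of sorted row keys, finding the RegionServer for a row reduces to finding the Region with the greatest start key that is not larger than the row key. The sketch below illustrates that lookup with a sorted map. The class `RegionLocatorSketch`, the server names, and the use of an empty string as the first Region's start key are assumptions made for illustration; this is not how the HBase client stores or queries Region locations.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical routing table: Region start key -> RegionServer name.
class RegionLocatorSketch {
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    // Register a Region that begins at startKey and runs up to the next start key.
    void addRegion(String startKey, String server) {
        regionStartToServer.put(startKey, server);
    }

    // Find the Region whose start key is the greatest one <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> region = regionStartToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row key " + rowKey);
        }
        return region.getValue();
    }

    public static void main(String[] args) {
        RegionLocatorSketch locator = new RegionLocatorSketch();
        locator.addRegion("", "rs1");  // first Region: from the start of the key space
        locator.addRegion("m", "rs2"); // rows from "m" up to "t"
        locator.addRegion("t", "rs3"); // rows from "t" to the end
        System.out.println(locator.locate("apple")); // prints rs1
        System.out.println(locator.locate("m"));     // prints rs2
        System.out.println(locator.locate("zebra")); // prints rs3
    }
}
```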

Views[20]

Not Supported

HBase does not provide views, but users can write MapReduce programs to approximate them. Apache Phoenix, a SQL interface for HBase, provides support for views.

Compatible Systems

MapR-DB

Drill

Derivative Systems

Axibase

Trafodion

Embeddings

Titan

Splice Machine

OpenTSDB

Citations

21 sources

[01] Apache HBase, apache.org. Modified: 2026-07-16. Accessed: 2026-07-16.
[02] Powerset team resumes HBase contributions | Microsoft Learn, microsoft.com. Modified: 2024-09-24. Accessed: 2026-06-14.
[03] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[04] Apache HBase Internals: Locking and Multiversion Concurrency Control | Blogs Archive, apache.org. Modified: 2026-04-22. Accessed: 2026-06-14.
[05] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[06] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[07] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[08] ACID Semantics - Apache HBase, apache.org. Modified: 2026-06-11. Accessed: 2026-06-14.
[09] https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tree, apache.org. Accessed: 2026-06-14.
[10] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[11] https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/wal/HLog.html, apache.org (dead link, check archive). Modified: 2026-06-04. Accessed: 2026-06-14.
[12] Lineland: HBase Architecture 101 - Write-ahead-Log, larsgeorge.com. Modified: 2026-05-28. Accessed: 2026-06-14.
[13] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-08.
[14] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[15] https://link.springer.com/chapter/10.1007/978-3-642-54341-8_4, springer.com. Accessed: 2026-06-14.
[16] https://netwoven.com/data-engineering-and-analytics/data-engineering/hbase-overview-of-architecture-and-data-model, netwoven.com (dead link, check archive). Accessed: 2026-06-05.
[17] Coprocessor Introduction | Blogs Archive, apache.org. Modified: 2026-04-22. Accessed: 2026-06-14.
[18] https://www.mapr.com/blog/in-depth-look-hbase-architecture, mapr.com (dead link, check archive). Accessed: 2026-06-14.
[19] Apache HBase® Reference Guide, apache.org. Modified: 2026-03-18. Accessed: 2026-06-14.
[20] https://phoenix.apache.org/docs/features/views, apache.org (dead link, check archive). Accessed: 2026-06-05.
[21] https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=shortlog;h=refs/heads/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379;pg=10, apache.org. Accessed: 2026-06-14.

Revision #3 Last Updated: 2018-04-20 22:28