Cloud BigTable is a distributed storage system used in Google, it can be classified as a non-relational database system. BigTable is designed mainly for scalability. It typically works on petabytes of data spread across thousands of machines. There is not much public information about the detail of BigTable, since it is proprietory to Google. The most authoritative information about it is its paper. An open source implementation of it based on its original paper is Apache HBase. Google has now provided BigTable as its cloud NoSQL database service. The documentation of that might be helpful, too.
BigTable was among the early attempts Google made to manage big data. Jeffrey Dean and Sanjay Ghemawat were involved in it. It is one of the three components Google built for managing big data (the other two are Google File System and MapReduce). These three components focus on different aspects of big data: Google File System is a reliable distributed file system that the other two build upon; MapReduce is a distributed data processing framework; BigTable is a distributed storage system. These three projects are very famous in distributed system. They all have their open source implementation.
BigTable only supports transactions on a single row. It does not support transactions spanning multiple rows
BigTable does not support relational data model. Instead, it provides users the ability to create column families in a table. Each table usually contains a small number of column families, which should be rarely changed (because the change of them involves metadata change). Inside each column family, there can be unlimited number of columns. Users can freely add or delete columns in a column family. Deleting of an entire column family is also supported. BigTable does not have any type information associated with a given column. It only treats data as strings of bytes.
BigTable uses physical logging. For performance consideration, all tablets on a tablet server write logs to the same log file.
BigTable provides clients with the following APIs: 1. Look Up (Read a Single Row) 2. Scan (Read a subset of rows) 3. Write 4. Delete 5. Customized Scripts (written in Sawzall language)
BigTable assumes an underlying reliable distributed file system (here is Google File System). The tablets are stored in Google File System, which is a disk-oriented file system. The most recently written records are stored in memtable, which is in memory. However, most of the data is stored on disk.
In BigTable, a table is split into multiple tablets, each of which is a subset of consecutive rows. A tablet is a unit of data distribution and load balancing. Different tablets of a table may be assigned to different tablet servers. A tablet is stored in the form of a log-structured merge tree (which they call memtable and SSTable). Furthermore, BigTable allows clients to create locality group. A locality group is a subset of columns in a table. BigTable will create a separate SSTable for each locality group, which will improve read performance of this locality group.