Teradata

Teradata is a relational database management system designed for large data warehouse applications and built on a Massively Parallel Processing (MPP) architecture. It optimizes query performance through parallelism, using the AMP (Access Module Processor) as the basic unit. A predefined number of AMPs share the work of a Teradata system, including queries, data loads, backups, and index builds. This gives Teradata linear scalability as workloads grow.

History

Teradata grew out of research done at Caltech and was incorporated in 1979 with the goal of creating a database computer that could handle more than a terabyte of data. NCR acquired Teradata in 1991. The following year, Teradata became the first system to store over 1 terabyte of data and was named a leader in commercial parallel processing. In 2007, Teradata was split off from NCR and became a separate entity. In 2011, Teradata acquired both Aprimo and Aster Data Systems Inc., beginning its involvement in the big data market. Between 2016 and 2017, Teradata became available on both AWS and Azure.

Concurrency Control

Deterministic Concurrency Control

For concurrency control, Teradata uses a proxy lock strategy. The system requires each request to hold a pseudo lock before it obtains an actual lock. For each table, proxy locking designates one AMP that manages the actual locks, and lock requests queue up on that AMP. Not all deadlocks can be prevented with this strategy. By default, Teradata checks for deadlocks globally every four minutes and locally every thirty seconds. Teradata doesn't adhere to two-phase locking.
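The queuing behavior described above can be illustrated with a small sketch. This is a toy model, not Teradata's implementation: the class name `ProxyLockManager`, its methods, and the choice of hashing the table name to pick the managing AMP are all assumptions made for illustration. It only shows how routing every request for a table through one queue gives those requests a single, consistent order.

```python
from collections import deque
import zlib


class ProxyLockManager:
    """Toy model: one designated AMP per table serializes lock requests."""

    def __init__(self, num_amps):
        self.num_amps = num_amps
        self.queues = {}   # table name -> deque of waiting request ids
        self.holder = {}   # table name -> request id holding the pseudo lock

    def managing_amp(self, table):
        # Assumption: the managing AMP is chosen by hashing the table name.
        return zlib.crc32(table.encode("utf-8")) % self.num_amps

    def request(self, request_id, table):
        """Return True if the pseudo lock is granted now, False if queued."""
        if table not in self.holder:
            self.holder[table] = request_id
            return True
        self.queues.setdefault(table, deque()).append(request_id)
        return False

    def release(self, request_id, table):
        """Release the pseudo lock and hand it to the next queued request."""
        if self.holder.get(table) != request_id:
            raise ValueError("request does not hold the pseudo lock")
        queue = self.queues.get(table)
        if queue:
            self.holder[table] = queue.popleft()
            return self.holder[table]
        del self.holder[table]
        return None
```

In this model, a request would go on to take the actual locks only once `request` has returned True or a `release` has handed the pseudo lock to it. Requests for different tables do not block each other.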

Checkpoints

Consistent

By default, Teradata offers a Start-of-Data checkpoint and an End-of-Data checkpoint. If a job fails before the End-of-Data checkpoint is taken, the restarted job repeats all work done after the Start-of-Data checkpoint. For further protection, Teradata also provides interval checkpoints: rather than relying only on the automatic checkpoints, the user can specify a time interval at which checkpoints are taken.
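The effect of interval checkpoints on restart cost can be sketched as follows. This is a simplified model under stated assumptions: the function `run_job` and its parameters are hypothetical, and a row count (`checkpoint_every`) stands in for the time interval that a user would actually configure.

```python
def run_job(rows, checkpoint_every=None, resume_from=0, fail_at=None):
    """Process rows and return (processed_rows, last_checkpoint).

    resume_from: position of the last checkpoint from a previous attempt
                 (0 corresponds to the Start-of-Data checkpoint).
    fail_at:     hypothetical row index at which a failure is simulated.
    """
    processed = []
    last_checkpoint = resume_from
    for i in range(resume_from, len(rows)):
        if fail_at is not None and i == fail_at:
            # Simulated failure: work since the last checkpoint is lost.
            return processed, last_checkpoint
        processed.append(rows[i])
        done = i + 1
        if checkpoint_every and done % checkpoint_every == 0:
            last_checkpoint = done  # interval checkpoint
    last_checkpoint = len(rows)     # End-of-Data checkpoint
    return processed, last_checkpoint


rows = list(range(10))

# Without interval checkpoints, a restart begins again at Start-of-Data.
_, cp_default = run_job(rows, fail_at=7)
redo_default, _ = run_job(rows, resume_from=cp_default)

# With a checkpoint every 3 rows, only work after the last checkpoint repeats.
_, cp_interval = run_job(rows, checkpoint_every=3, fail_at=7)
redo_interval, _ = run_job(rows, checkpoint_every=3, resume_from=cp_interval)
```

Here the default run restarts from position 0 and reprocesses all ten rows, while the interval run restarts from position 6 and reprocesses only four.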

Data Model

Relational

Teradata supports the relational model. It also supports document store, graph DBMS, and time series DBMS as secondary models.

Compression

Delta Encoding

Compression is primarily used to reduce storage costs and to enhance system performance. Teradata allows columns to be compressed, which results in more rows stored per data block and fewer data blocks overall. This reduces disk I/O and improves query performance.

Teradata offers the following compression options.

Multivalue Compression (MVC)
MVC compresses repeating values in a column once those values are specified in the compression list of the column definition. When column data matches a specified value, the database stores the value once in the table header, regardless of how many times it occurs in the column. No decompression is necessary when accessing the data from memory.
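As a rough illustration of the idea, the sketch below stores each listed value once in a header and replaces matching column entries with a small reference. The function names `mvc_compress` and `mvc_read` and the tagged-tuple row format are assumptions for this example and do not reflect Teradata's on-disk layout.

```python
def mvc_compress(values, compress_list):
    """Return (header, rows).

    header holds each listed value once. Each row is ("ref", index into
    header) for a listed value, or ("raw", value) for anything else.
    """
    header = list(compress_list)
    index = {v: i for i, v in enumerate(header)}
    rows = [("ref", index[v]) if v in index else ("raw", v) for v in values]
    return header, rows


def mvc_read(header, rows):
    """Reading is a lookup into the header, with no decompression pass."""
    return [header[payload] if kind == "ref" else payload
            for kind, payload in rows]


states = ["NY", "CA", "NY", "TX", "NY", "CA"]
header, rows = mvc_compress(states, compress_list=["NY", "CA"])
```

Values that are not in the compression list, such as "TX" here, are stored as-is, which is why MVC pays off mainly for columns dominated by a few frequent values.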

Algorithmic Compression
This is generally used as an alternative to MVC when column values are mostly unique. Teradata includes several standard compression algorithms, which can be used to compress many data types, including ARRAY, BYTE, VARBYTE, BLOB, CHARACTER, VARCHAR, CLOB, JSON, and DATASET. Custom compression algorithms can also be used.

Row Compression
Row compression is a lossless method. It stores a repeating set of column values a single time, while the non-repeating column values that belong to that set are stored as extensions of the base set.
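A minimal sketch of this base-plus-extensions layout is shown below. The function names `row_compress` and `row_decompress`, the dictionary-based row format, and the example columns are hypothetical and chosen only to make the grouping visible.

```python
def row_compress(rows, base_cols):
    """Store each distinct set of base column values once.

    Returns a dict mapping the base value tuple to a list of extensions,
    where each extension holds the remaining (non-repeating) columns.
    """
    compressed = {}
    for row in rows:
        base = tuple(row[c] for c in base_cols)
        ext = {k: v for k, v in row.items() if k not in base_cols}
        compressed.setdefault(base, []).append(ext)
    return compressed


def row_decompress(compressed, base_cols):
    """Rebuild the full rows from each base set and its extensions."""
    out = []
    for base, extensions in compressed.items():
        for ext in extensions:
            row = dict(zip(base_cols, base))
            row.update(ext)
            out.append(row)
    return out


orders = [
    {"cust": 1, "name": "Ann", "order": 10},
    {"cust": 1, "name": "Ann", "order": 11},
    {"cust": 2, "name": "Bob", "order": 12},
]
packed = row_compress(orders, base_cols=("cust", "name"))
```

The customer values for "Ann" appear once in `packed`, with two order extensions attached. Decompression recovers every row, though rows come back grouped by base set rather than in their original order.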

Auto Compression
This type of compression is used when creating a column-partitioned table or join index. Auto compression chooses compression methods for the physical containers of the column-partitioned table or join index.

Website

http://www.teradata.com/

Developer

Teradata

Country of Origin

US

Start Year

1979

Project Type

Commercial

Licenses

Proprietary

Wikipedia

https://en.wikipedia.org/wiki/Teradata