Teradata is a relational database management system designed for large data warehouse applications and built on a Massively Parallel Processing (MPP) architecture. It optimizes query performance through parallelism, using the Access Module Processor (AMP) as its basic unit of work. A predefined number of AMPs is shared across a Teradata system to perform tasks such as queries, data loads, backups, and index builds. This gives Teradata linear scalability as workloads grow.
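As a rough illustration of AMP-based parallelism, the Python sketch below spreads rows across a fixed number of workers and lets each one aggregate only its own share. It assumes rows are assigned to AMPs by hashing a key, a detail the text above does not spell out. The names `NUM_AMPS`, `amp_for`, and `parallel_total` are hypothetical, and `zlib.crc32` is only a stand-in for Teradata's own row hashing.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_AMPS = 4  # hypothetical fixed number of AMPs in the system


def amp_for(key: str, num_amps: int = NUM_AMPS) -> int:
    """Map a row key to an AMP with a deterministic hash (illustrative only)."""
    return zlib.crc32(key.encode("utf-8")) % num_amps


def distribute(rows, num_amps: int = NUM_AMPS):
    """Split (key, amount) rows into one partition per AMP."""
    partitions = [[] for _ in range(num_amps)]
    for key, amount in rows:
        partitions[amp_for(key, num_amps)].append((key, amount))
    return partitions


def amp_sum(partition):
    """Work done independently by one AMP on its own rows."""
    return sum(amount for _, amount in partition)


def parallel_total(rows, num_amps: int = NUM_AMPS) -> int:
    """Run the per-AMP step in parallel, then combine the partial results."""
    partitions = distribute(rows, num_amps)
    with ThreadPoolExecutor(max_workers=num_amps) as pool:
        partials = list(pool.map(amp_sum, partitions))
    return sum(partials)


rows = [(f"order-{i}", i) for i in range(1, 101)]
print(parallel_total(rows))  # 5050
```

Because every row lands on exactly one worker, adding workers shrinks each worker's share of the data, which is the intuition behind the scalability claim.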
Teradata grew out of research done at Caltech and was incorporated in 1979 with the goal of creating a database computer that could handle more than a terabyte of data. Teradata was acquired by NCR in 1991. In the following year, Teradata became the first system to store over 1 terabyte of data and was named a leader in commercial parallel processing. In 2007, Teradata was split off from NCR, becoming its own separate entity. In 2011, Teradata acquired both Aprimo and Aster Data Systems Inc., beginning Teradata’s involvement in the big data market. Between 2016 and 2017, Teradata became available on both AWS and Azure.
By default, Teradata takes a Start-of-Data checkpoint and an End-of-Data checkpoint. If a job fails before the End-of-Data checkpoint is taken, the restarted job repeats all work done after the Start-of-Data checkpoint. For further protection, Teradata also provides interval checkpoints: rather than relying only on the two automatic checkpoints, the user can specify a time interval at which additional checkpoints are taken.
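The effect of interval checkpoints on restart cost can be shown with a small simulation. This is a sketch under stated assumptions, not Teradata's implementation: `run_job` and its parameters are hypothetical, and a row count stands in for the time interval so that the example stays deterministic.

```python
def run_job(rows, checkpoint_every, fail_at=None, resume_from=0):
    """Process rows starting at resume_from.

    checkpoint_every: rows between interval checkpoints (0 disables them);
                      a stand-in for the time interval described in the text.
    fail_at:          row index at which a failure is simulated, or None.
    Returns (last_checkpoint, processed_row_indices, finished).
    """
    checkpoint = resume_from  # start-of-data checkpoint for this run
    processed = []
    for i in range(resume_from, len(rows)):
        if fail_at is not None and i == fail_at:
            return checkpoint, processed, False  # simulated failure
        processed.append(i)
        if checkpoint_every and (i + 1) % checkpoint_every == 0:
            checkpoint = i + 1  # interval checkpoint
    return len(rows), processed, True  # end-of-data checkpoint


rows = list(range(10))

# Only the default checkpoints: a failure at row 7 forces a restart from row 0.
cp, done, ok = run_job(rows, checkpoint_every=0, fail_at=7)
print(cp, len(done), ok)  # 0 7 False

# Interval checkpoint every 3 rows: the same failure restarts from row 6.
cp, done, ok = run_job(rows, checkpoint_every=3, fail_at=7)
print(cp, len(done), ok)  # 6 7 False

# The restarted job repeats only row 6 before continuing to the end.
cp, done, ok = run_job(rows, checkpoint_every=3, resume_from=cp)
print(cp, done, ok)  # 10 [6, 7, 8, 9] True
```

Shorter intervals reduce the amount of repeated work after a failure, at the cost of taking checkpoints more often.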
Compression is primarily used to reduce storage costs and to enhance system performance. Teradata allows columns to be compressed, so that more rows fit in each data block and fewer data blocks are needed overall. This reduces disk I/O and improves query performance.
Teradata offers the following compression options:
Multivalue Compression (MVC): MVC compresses repeating values in a column once those values are specified in the compression list of the column definition. When column data matches a specified value, the database stores the value once in the table header, regardless of how many times it occurs in the column. No decompression is necessary when accessing the data from memory.
Algorithmic Compression: This is generally used as an alternative to MVC when column values are mostly unique. Teradata includes several standard compression algorithms, which can be used to compress many data types, including ARRAY, BYTE, VARBYTE, BLOB, CHARACTER, VARCHAR, CLOB, JSON, and DATASET. Custom compression algorithms can also be used.
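The multivalue compression idea described above can be pictured with a toy Python model: values named in a compression list are stored once in a header, and each row holds only a small code pointing at them. The class `MvcColumn` and its methods are hypothetical and say nothing about Teradata's actual storage layout.

```python
class MvcColumn:
    """Toy column: listed values are stored once, rows keep only a code."""

    def __init__(self, compress_list):
        self.header = list(compress_list)  # each listed value stored once
        self.code_of = {v: i for i, v in enumerate(self.header)}
        self.cells = []  # per row: ("code", index) or ("raw", value)

    def append(self, value):
        """Store a code if the value is in the compression list, else the value."""
        if value in self.code_of:
            self.cells.append(("code", self.code_of[value]))
        else:
            self.cells.append(("raw", value))

    def get(self, row):
        """Read a value back; a listed value is a direct header lookup."""
        kind, payload = self.cells[row]
        return self.header[payload] if kind == "code" else payload

    def raw_count(self):
        """Number of rows that still store their full value."""
        return sum(1 for kind, _ in self.cells if kind == "raw")


cities = ["New York", "Chicago", "New York", "Boise", "New York"]
col = MvcColumn(["New York", "Chicago"])
for city in cities:
    col.append(city)

print([col.get(i) for i in range(len(cities))] == cities)  # True
print(col.raw_count())  # 1
```

Reading a listed value is only an index into the header, which mirrors the point that no decompression step is needed.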
Row Compression: Row compression is a lossless method. It stores a repeating set of column values a single time, while the non-repeating column values that belong to that set are stored as extensions of the base set.
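A minimal Python sketch of this base-set-plus-extensions layout follows. It assumes the repeating columns are known in advance; `compress_rows` and `decompress_rows` are hypothetical helpers for illustration, not part of any Teradata interface.

```python
def compress_rows(rows, fixed_cols):
    """Store each repeating column value set once, with the rest as extensions."""
    compressed = {}
    for row in rows:
        base = tuple(row[c] for c in fixed_cols)
        ext = tuple(v for c, v in row.items() if c not in fixed_cols)
        compressed.setdefault(base, []).append(ext)
    return compressed


def decompress_rows(compressed, fixed_cols, ext_cols):
    """Rebuild the full rows to show that nothing was lost."""
    rows = []
    for base, exts in compressed.items():
        for ext in exts:
            row = dict(zip(fixed_cols, base))
            row.update(zip(ext_cols, ext))
            rows.append(row)
    return rows


rows = [
    {"customer": "A", "region": "East", "order_id": 1},
    {"customer": "A", "region": "East", "order_id": 2},
    {"customer": "B", "region": "West", "order_id": 3},
    {"customer": "A", "region": "East", "order_id": 4},
]

compressed = compress_rows(rows, ["customer", "region"])
print(compressed)
# {('A', 'East'): [(1,), (2,), (4,)], ('B', 'West'): [(3,)]}

restored = decompress_rows(compressed, ["customer", "region"], ["order_id"])
print(sorted(restored, key=lambda r: r["order_id"]) == rows)  # True
```

The pair `("A", "East")` is stored once even though three rows share it, and the round trip shows the method is lossless.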
Auto Compression: This type of compression is applied when creating a column-partitioned table or join index. Auto compression chooses the compression methods for the physical containers of the column-partitioned table or join index.