Teradata - Data Protection

This chapter discusses the features available for data protection in Teradata.

Transient Journal

Teradata uses Transient Journal to protect data from transaction failures. Whenever any transactions are run, Transient journal keeps a copy of the before images of the affected rows until the transaction is successful or rolled back successfully. Then, the before images are discarded. Transient journal is kept in each AMPs. It is an automatic process and cannot be disabled.

Fallback

Fallback protects the table data by storing the second copy of rows of a table on another AMP called as Fallback AMP. If one AMP fails, then the fallback rows are accessed. With this, even if one AMP fails, data is still available through fallback AMP. Fallback option can be used at table creation or after table creation. Fallback ensures that the second copy of the rows of the table is always stored in another AMP to protect the data from AMP failure. However, fallback occupies twice the storage and I/O for Insert/Delete/Update.

Following diagram shows how fallback copy of the rows are stored in another AMP.

Down AMP Recovery Journal

The Down AMP recovery journal is activated when the AMP fails and the table is fallback protected. This journal keeps track of all the changes to the data of the failed AMP. The journal is activated on the remaining AMPs in the cluster. It is an automatic process and cannot be disabled. Once the failed AMP is live then the data from the Down AMP recovery journal is synchronized with the AMP. Once this is done, the journal is discarded.

Cliques

Clique is a mechanism used by Teradata to protect data from Node failures. A clique is nothing but a set of Teradata nodes that share a common set of Disk Arrays. When a node fails, then the vprocs from the failed node will migrate to other nodes in the clique and continue to access their disk arrays.

Hot Standby Node

Hot Standby Node is a node that does not participate in the production environment. If a node fails then the vprocs from the failed nodes will migrate to the hot standby node. Once the failed node is recovered it becomes the hot standby node. Hot Standby nodes are used to maintain the performance in case of node failures.

RAID

Redundant Array of Independent Disks (RAID) is a mechanism used to protect data from Disk Failures. Disk Array consists of a set of disks which are grouped as a logical unit. This unit may look like a single unit to the user but they may be spread across several disks.

RAID 1 is commonly used in Teradata. In RAID 1, each disk is associated with a mirror disk. Any changes to the data in primary disk is reflected in mirror copy also. If the primary disk fails, then the data from mirror disk can be accessed.