Teradata - Questions & Answers

Dear readers, these Teradata Interview Questions have been designed specially to get you acquainted with the nature of questions you may encounter during your interview for the subject of Teradata. As per my experience good interviewers hardly plan to ask any particular question during your interview, normally questions start with some basic concept of the subject and later they continue based on further discussion and what you answer −

Explain the architecture of Teradata.

Teradata Architecture consists of three components.

Parsing Engine − Parsing Engine receives the query from the user, parses it and prepares the execution plan.
BYNET − BYNET receives the execution plan from the Parsing Engine and dispatches to the appropriate AMP.
AMP − AMP is responsible for storing and retrieving rows. It stores the data in the virtual disk associated with it. In addition to this, AMP is responsible for lock management, space management, sorting and aggregation.

What is the difference between FastLoad and MultiLoad?

FastLoad	MultiLoad
Target table should be empty	Target table need not be empty
Only one table can be loaded using a single script	Can load/update up to 5 tables
Supports only CREATE/INSERT statement	Supports up to 20 DML statements in single script
Doesn’t support tables with RI, SI and Triggers	Supports tables with NUSI

Why is FastLoad unable to load duplicate records?

FastLoad provides restart capability through checkpoints. When the script is restarted from the last checkpoint, it is possible that the same rows may be sent again to the AMPs. That’s the reason FastLoad doesn’t support duplicates.

What is the difference between SET and MULTISET tables?

SET table does not allow duplicate records whereas MULTISET allows duplicate records.

How does SET table check for duplicate records?

For each row inserted, System checks if there is any record with the same row hash. If the table has UPI defined, then it will reject the record as duplicate. Otherwise, it will compare the entire record for duplicate. This will severely affect the system performance.

What is the efficient method for avoiding duplicate row checking for SET tables?

You can define either Unique Primary Index or Unique Secondary Index to avoid duplicate row checking.

What are the different ways to create tables?

Tables are created using CREATE TABLE statement. Tables can be created using

CREATE TABLE statement with column definition.
CREATE TABLE from an existing table.
CREATE TABLE statement with a SELECT statement.

How do you find duplicate records in a table?

Duplicate records can be identified using DISTINCT statement or GROUP BY statement.

SELECT DISTINCT column 1, column 2… 
FROM tablename;
  
OR
  
SELECT column 1, column 2,… 
FROM tablename 
GROUP BY column 1, column 2….;

What is the difference between primary index and primary key?

Primary keys are not mandatory in Teradata whereas Primary Index is mandatory.
Data distribution is based on Primary Index value.
Primary keys doesn’t accept NULLs whereas Primary Index accepts NULL values.
Primary keys are unique whereas Primary Index can be either unique (UPI) or non unique(NUPI).
Primary keys doesn’t change whereas Primary Indexes change.

What are all the different ways of accessing data in Teradata tables?

Data can be accessed in 3 different ways −

Through Primary Index
Through Secondary Index
Full Table Scan

How to identify the number of AMPs in the system?

It can be identified using query SELECT HASHAMP() + 1;

How to identify the number of rows in each AMP for a specific table?

The following query can be used for this purpose.

SELECT HASHMAP(HASHBUCKET(HASHROW(primaryindexvalue))), COUNT(*) 
FROM tablename GROUP BY 1;

What are the transaction modes available in Teradata?

Teradata supports two transaction modes.

Teradata
ANSI

Teradata mode is set using SET SESSION TRANSACTION BTET; ANSI mode is set using SET SESSION TRANSACTION ANSI;

How to execute transactions using Teradata?

Transactions can be executed using BT and ET statements.

How to access the data stored in Join Indexes?

Join Indexes cannot be directly accessed by the user. Only the optimizer can access them.

What will happen if the input file for a FastLoad script contains duplicate records?

The duplicate records will be rejected from loading the target tables and will be inserted into the UV table.

What is the use of FALLBACK?

FALLBACK is a protection mechanism used by Teradata to handle AMP failures. For each data row, another copy of the row is stored in a different AMP within a cluster. If any AMP fails, then the corresponding rows will be accessed using FALLBACK AMP.

How do you specify FALLBACK option for a table?

FALLBACK can be mentioned while table creation using CREATE TABLE statement or after the table is created using ALTER TABLE statement.

When will you get Spool Space error?

Spool space error will happen if the intermediate results of the query exceeds per AMP spool space limit set for the user who submitted the query.

What is the use of SLEEP command?

SLEEP command specifies the waiting time before Teradata attempts to establish the connection.

What is the use of TENACITY command?

TENACITY command specifies the total waiting time for Teradata to establish a new connection.

How do you release the lock on the table after FASTLOAD failure?

You can just keep the BEGIN LOADING and END LOADING statement and submit the FASTLOAD script. Other option is to drop the table and create the table again.

What is the use of caching in Teradata?

Caching in Teradata works with the source and remains in the same order, that is, it does not change frequently. The Cache is usually shared among applications. It is an added advantage of using Teradata.

What is RAID?

RAID is a protection mechanism to handle disk failure. It stands for Redundant Array of Independent Disks. RAID 1 is commonly used in Teradata.

What are the advantages and disadvantages of Secondary Index?

Secondary index provides alternate path to access the data. They are used to avoid Full Table Scan. However, secondary indexes require additional physical structure for maintaining sub tables and also require additional I/O since the sub table needs to be updated for each row.

What are the different lock types available in Teradata?

There are four different locks in Teradata − Exclusive, Write, Read, and Access.

What are the different lock levels available in Teradata?

Locks can be applied at three different levels − Database, Table, and Row.

How many values can be compressed in a table?

Using Multi Value Compression (MVC) you can compress up to 255 values including NULLs.

How FastLoad loads data into Teradata?

FastLoad loads the data in 64K blocks. There are 2 phases in FastLoad.

In Phase 1, it brings the data in 64K blocks and sends them to the target AMPs. Each AMP will then hash redistribute the rows to their target AMPs.
In Phase 2, rows are sorted by their row hash order and written into the target table.

What are all the different phases of MultiLoad?

MultiLoad import has five phases.

Phase 1 − Preliminary Phase – Performs basic setup activities.
Phase 2 − DML Transaction Phase – Verifies the syntax of DML statements and brings them to Teradata system.
Phase 3 − Acquisition Phase – Brings the input data into work tables and locks the table.
Phase 4 − Application Phase – Applies all DML operations.
Phase 5 − Cleanup Phase – Releases the table lock.

Which is faster way to delete data from Tables? DELETE FROM statement or MultiLoad DELETE

MULTILOAD DELETE is faster since it deletes the records in blocks. DELETE FROM will delete row by row.

What is the difference between Stored Procedure and Macro?

Stored procedure returns one or more values whereas Macros can return one or more rows. In addition to SQL, stored procedure may contain SPL statements.

Which is better way to load data into Teradata? FastLoad/BTEQ/MultiLoad?

Both FastLoad and MultiLoad load the data in 64K blocks whereas BTEQ will process one row at a time.

Which is better way to extract data from Teradata? BTEQ/FastExport?

FastExport exports the data in 64K blocks whereas BTEQ exports one row at a time.

What is TPT?

Teradata Parallel Transporter (TPT) is the utility to load/export data. It combines all the functionalities of FastLoad, MultiLoad, BTEQ, TPUMP and FastExport.

What is a permanent Journal in Teradata?

Permanent journals keeps track of data before or after applying the changes. This help to roll back or roll forward the table to a particular state. Permanent journals can be enabled at table level or database level.

Why Teradata architecture is called Shared Nothing Architecture?

In Teradata, each AMP is associated with a virtual disk. Only the AMP that owns the virtual disk can access the data within that virtual disk. This is called as Shared Nothing Architecture.

What is the advantage of Partition Primary Index?

If the query uses partitioned columns then it will result in partition elimination, which will greatly improve the performance.
Partition eliminates other partitions and accesses only the partitions that contain the data.
You can easily drop the old partitions and create new partitions.

If a Secondary Index is created on a table, will the size of the table increase? Why?

Yes. Secondary index requires sub-tables which require permanent space.

If a partitioned primary index is added to a table, will the size of the table increase? Why?

Yes. Whenever partitioned primary index is added, then each row occupies additional 2 or 8 bytes for the partition number.

How do you identify the 2^nd highest value from a table?

You can use RANK function on the specified column with descending order with Qualify = 2 condition.

How do you handle spool space error?

You can check the EXPLAIN plan of the query to identify the steps that consume more spool space and try to optimize the query. Filters can be applied to reduce the number of records being processed or you can split the large query into multiple smaller queries.

What are the confidence levels in Teradata?

When the EXPLAIN command is used against the query, it specifies the confidence of the optimizer to retrieve the records.

There are three confidence levels in Teradata: High Confidence, Medium Confidence, and Low Confidence.

What is the difference between NUSI and Full table scan?

Both NUSI and Full Table Scan (FTS) will access all the AMPs but FTS will access all the blocks within the AMP whereas NUSI will access the blocks only if the sub-table contains the qualifying rows.

How do you skip first few records from the input file while loading into Teradata?

In BTEQ mode, SKIP command can be used to skip the records.

Which data type is better for storing employee Age information?

BYTEINT. It occupies only one byte and can store values up to +127.

Specify number of AMPs accessed by each method.

Through Unique Primary Index – 1 AMP
Through Non Unique Primary Index – 1 AMP
Through Unique Secondary Index – 2 AMPs
Through Non Unique Secondary Index – All AMPs

What is a Clique?

Clique is a protection mechanism to handle Node failures. It is a group of nodes. When a node within a clique fails, then the vprocs (Parsing Engine and AMP) will migrate to other nodes and continue to perform read/write operations on their virtual disks.

What are the protection mechanisms available in Teradata?

Teradata provides different levels of protection mechanism.

Transient Journal − To handle Transaction failure.
Fallback − To handle AMP failure.
Cliques − To handle Node failure.
RAID − To handle Disk failure.
Hot standby Node − To handle Node failure without affecting performance and restart.

What is the use of ACTIVITYCOUNT?

ACTIVITYCOUNT gives the number of rows affected by the previous SQL query in BTEQ. If the ACTIVITYCOUNT statement follows an insert statement, it returns the number of rows inserted. If the ACTIVITYCOUNT statement follows select statement, it returns the number of rows selected.

What is Next?

Further you can go through your past assignments you have done with the subject and make sure you are able to speak confidently on them. If you are fresher then interviewer does not expect you will answer very complex questions, rather you have to make your basics concepts very strong.

Second it really doesn't matter much if you could not answer few questions but it matters that whatever you answered, you must have answered with confidence. So just feel confident during your interview. We at tutorialspoint wish you best luck to have a good interviewer and all the very best for your future endeavor. Cheers :-)