Teradata - Explain
EXPLAIN command returns the execution plan of parsing engine in English. It can be used with any SQL statement except on another EXPLAIN command. When a query is preceded with EXPLAIN command, the execution plan of the Parsing Engine is returned to the user instead of AMPs.
Examples of EXPLAIN
Consider the table Employee with the following definition.
CREATE SET TABLE EMPLOYEE,FALLBACK ( EmployeeNo INTEGER, FirstName VARCHAR(30), LastName VARCHAR(30), DOB DATE FORMAT 'YYYY-MM-DD', JoinedDate DATE FORMAT 'YYYY-MM-DD', DepartmentNo BYTEINT ) UNIQUE PRIMARY INDEX ( EmployeeNo );
Some examples of EXPLAIN plan are given below.
Full Table Scan (FTS)
When no conditions are specified in the SELECT statement, then the optimizer may choose to use Full Table Scan where each and every row of the table is accessed.
Following is a sample query where the optimizer may choose FTS.
EXPLAIN SELECT * FROM employee;
When the above query is executed, it produces the following output. As can be seen the optimizer chooses to access all AMPs and all rows within the AMP.
1) First, we lock a distinct TDUSER."pseudo table" for read on a RowHash to prevent global deadlock for TDUSER.employee. 2) Next, we lock TDUSER.employee for read. 3) We do an all-AMPs RETRIEVE step from TDUSER.employee by way of an all-rows scan with no residual conditions into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 2 rows (116 bytes). The estimated time for this step is 0.03 seconds. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. → The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.03 seconds.
Unique Primary Index
When the rows are accessed using Unique Primary Index, then it is one AMP operation.
EXPLAIN SELECT * FROM employee WHERE EmployeeNo = 101;
When the above query is executed, it produces the following output. As can be seen it is a single-AMP retrieval and the optimizer is using the unique primary index to access the row.
1) First, we do a single-AMP RETRIEVE step from TDUSER.employee by way of the unique primary index "TDUSER.employee.EmployeeNo = 101" with no residual conditions. The estimated time for this step is 0.01 seconds. → The row is sent directly back to the user as the result of statement 1. The total estimated time is 0.01 seconds.
Unique Secondary Index
When the rows are accessed using Unique Secondary Index, it’s a two amp operation.
Consider the table Salary with the following definition.
CREATE SET TABLE SALARY,FALLBACK ( EmployeeNo INTEGER, Gross INTEGER, Deduction INTEGER, NetPay INTEGER ) PRIMARY INDEX ( EmployeeNo ) UNIQUE INDEX (EmployeeNo);
Consider the following SELECT statement.
EXPLAIN SELECT * FROM Salary WHERE EmployeeNo = 101;
When the above query is executed, it produces the following output. As can be seen the optimizer retrieves the row in two amp operation using unique secondary index.
1) First, we do a two-AMP RETRIEVE step from TDUSER.Salary by way of unique index # 4 "TDUSER.Salary.EmployeeNo = 101" with no residual conditions. The estimated time for this step is 0.01 seconds. → The row is sent directly back to the user as the result of statement 1. The total estimated time is 0.01 seconds.
Following is the list of terms commonly seen in EXPLAIN plan.
... (Last Use) …
A spool file is no longer needed and will be released when this step completes.
... with no residual conditions …
All applicable conditions have been applied to the rows.
... END TRANSACTION …
Transaction locks are released, and changes are committed.
... eliminating duplicate rows ...
Duplicate rows only exist in spool files, not set tables. Doing a DISTINCT operation.
... by way of a traversal of index #n extracting row ids only …
A spool file is built containing the Row IDs found in a secondary index (index #n)
... we do a SMS (set manipulation step) …
Combining rows using a UNION, MINUS, or INTERSECT operator.
... which is redistributed by hash code to all AMPs.
Redistributing data in preparation for a join.
... which is duplicated on all AMPs.
Duplicating data from the smaller table (in terms of SPOOL) in preparation for a join.
... (one_AMP) or (group_AMPs)
Indicates one AMP or subset of AMPs will be used instead of all AMPs.