Teradata - Statistics
The Teradata optimizer generates an execution strategy for every SQL query. This strategy is based on the statistics collected on the tables used in the query. Statistics on a table are collected using the COLLECT STATISTICS command. To arrive at an optimal execution strategy, the optimizer requires environment information and data demographics, such as the following.
- Number of Nodes, AMPs and CPUs
- Amount of memory
- Number of rows
- Row size
- Range of values in the table
- Number of rows per value
- Number of Nulls
There are three approaches to collecting statistics on a table.
- Random AMP Sampling
- Full statistics collection
- Using SAMPLE option
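Of these three approaches, random AMP sampling is performed by the optimizer itself when no statistics have been collected, so it has no statement of its own. The other two can be sketched as follows, reusing the Employee table and EmployeeNo column from this chapter. The placement of the USING SAMPLE clause is an assumption based on recent Teradata releases and may differ on older ones.

```sql
-- Full statistics collection: reads all rows for the column.
COLLECT STATISTICS COLUMN (EmployeeNo) ON Employee;

-- Sampled collection: reads only a portion of the rows.
-- Assumption: USING SAMPLE syntax as accepted by recent Teradata releases.
COLLECT STATISTICS USING SAMPLE COLUMN (EmployeeNo) ON Employee;
```

Sampling generally trades some accuracy for a shorter collection time, which tends to matter most on large tables.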
The COLLECT STATISTICS command is used to collect statistics on a table.
Following is the basic syntax to collect statistics on a table.
COLLECT [SUMMARY] STATISTICS [INDEX (indexname)] [COLUMN (columnname)] ON <tablename>;
The following example collects statistics on the EmployeeNo column of the Employee table.
COLLECT STATISTICS COLUMN(EmployeeNo) ON Employee;
When the above query is executed, it produces the following output.
*** Update completed. 2 rows changed.
*** Total elapsed time was 1 second.
You can view the collected statistics using the HELP STATISTICS command.
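The other forms allowed by the basic syntax can be sketched in the same way. This is a minimal illustration rather than a recommended set of statistics. It assumes that an index exists on EmployeeNo (the chapter does not say so) and that the release in use supports the SUMMARY keyword.

```sql
-- Column-level statistics on another column of Employee.
COLLECT STATISTICS COLUMN (DepartmentNo) ON Employee;

-- Index-level statistics; assumes an index is defined on EmployeeNo.
COLLECT STATISTICS INDEX (EmployeeNo) ON Employee;

-- Table-level summary statistics only, without column detail.
COLLECT SUMMARY STATISTICS ON Employee;

-- Refresh every statistic that is already defined on the table.
COLLECT STATISTICS ON Employee;
```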
Following is the syntax to view the statistics collected.
HELP STATISTICS <tablename>;
Following is an example to view the statistics collected on the Employee table.
HELP STATISTICS employee;
When the above query is executed, it produces the following result.
Date     Time     Unique Values        Column Names
-------- -------- -------------------- -----------------------
16/01/01 08:07:04                    5 *
16/01/01 07:24:16                    3 DepartmentNo
16/01/01 08:07:04                    5 EmployeeNo
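HELP STATISTICS gives only a one-line summary per statistic. For more detail, recent Teradata releases also provide a SHOW STATISTICS statement. The sketch below assumes such a release and reuses the Employee table from the examples above.

```sql
-- List the COLLECT STATISTICS statements defined on the table.
-- Assumption: SHOW STATISTICS is available on the release in use.
SHOW STATISTICS ON Employee;

-- Show the detailed values stored for one column's statistics.
SHOW STATISTICS VALUES COLUMN (EmployeeNo) ON Employee;
```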