Teradata - JOIN Index
JOIN INDEX is a materialized view. Its definition is permanently stored and the data is updated whenever the base tables referred in the join index is updated. JOIN INDEX may contain one or more tables and also contain pre-aggregated data. Join indexes are mainly used for improving the performance.
There are different types of join indexes available.
- Single Table Join Index (STJI)
- Multi Table Join Index (MTJI)
- Aggregated Join Index (AJI)
Single Table Join Index
Single Table Join index allows to partition a large table based on the different primary index columns than the one from the base table.
Following is the syntax of a JOIN INDEX.
CREATE JOIN INDEX <index name> AS <SELECT Query> <Index Definition>;
Consider the following Employee and Salary tables.
CREATE SET TABLE EMPLOYEE,FALLBACK ( EmployeeNo INTEGER, FirstName VARCHAR(30) , LastName VARCHAR(30) , DOB DATE FORMAT 'YYYY-MM-DD', JoinedDate DATE FORMAT 'YYYY-MM-DD', DepartmentNo BYTEINT ) UNIQUE PRIMARY INDEX ( EmployeeNo ); CREATE SET TABLE SALARY,FALLBACK ( EmployeeNo INTEGER, Gross INTEGER, Deduction INTEGER, NetPay INTEGER ) PRIMARY INDEX ( EmployeeNo ) UNIQUE INDEX (EmployeeNo);
Following is an example that creates a Join index named Employee_JI on Employee table.
CREATE JOIN INDEX Employee_JI AS SELECT EmployeeNo,FirstName,LastName, BirthDate,JoinedDate,DepartmentNo FROM Employee PRIMARY INDEX(FirstName);
If the user submits a query with a WHERE clause on EmployeeNo, then the system will query the Employee table using the unique primary index. If the user queries the employee table using employee_name, then the system may access the join index Employee_JI using employee_name. The rows of the join index are hashed on employee_name column. If the join index is not defined and the employee_name is not defined as secondary index, then the system will perform full table scan to access the rows which is time consuming.
You can run the following EXPLAIN plan and verify the optimizer plan. In the following example you can see that the optimizer is using the Join Index instead of base Employee table when the table queries using the Employee_Name column.
EXPLAIN SELECT * FROM EMPLOYEE WHERE FirstName='Mike'; *** Help information returned. 8 rows. *** Total elapsed time was 1 second. Explanation ------------------------------------------------------------------------ 1) First, we do a single-AMP RETRIEVE step from EMPLOYEE_JI by way of the primary index "EMPLOYEE_JI.FirstName = 'Mike'" with no residual conditions into Spool 1 (one-amp), which is built locally on that AMP. The size of Spool 1 is estimated with low confidence to be 2 rows (232 bytes). The estimated time for this step is 0.02 seconds. → The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.02 seconds.
Multi Table Join Index
A multi-table join index is created by joining more than one table. Multi-table join index can be used to store the result set of frequently joined tables to improve the performance.
The following example creates a JOIN INDEX named Employee_Salary_JI by joining Employee and Salary tables.
CREATE JOIN INDEX Employee_Salary_JI AS SELECT a.EmployeeNo,a.FirstName,a.LastName, a.BirthDate,a.JoinedDate,a.DepartmentNo,b.Gross,b.Deduction,b.NetPay FROM Employee a INNER JOIN Salary b ON(a.EmployeeNo = b.EmployeeNo) PRIMARY INDEX(FirstName);
Whenever the base tables Employee or Salary are updated, then the Join index Employee_Salary_JI is also automatically updated. If you are running a query joining Employee and Salary tables, then the optimizer may choose to access the data from Employee_Salary_JI directly instead of joining the tables. EXPLAIN plan on the query can be used to verify if the optimizer will choose the base table or Join index.
Aggregate Join Index
If a table is consistently aggregated on certain columns, then aggregate join index can be defined on the table to improve the performance. One limitation of aggregate join index is that it supports only SUM and COUNT functions.
In the following example Employee and Salary is joined to identify the total salary per Department.
CREATE JOIN INDEX Employee_Salary_JI AS SELECT a.DepartmentNo,SUM(b.NetPay) AS TotalPay FROM Employee a INNER JOIN Salary b ON(a.EmployeeNo = b.EmployeeNo) GROUP BY a.DepartmentNo Primary Index(DepartmentNo);