ETL Testing – Scenarios


ETL Test Scenarios are used to validate an ETL Testing Process. The following table explains some of the most common scenarios and test-cases that are used by ETL testers.

Test Scenarios Test-Cases

Structure Validation

It involves validating the source and the target table structure as per the mapping document.

Data type should be validated in the source and the target systems.

The length of data types in the source and the target system should be same.

Data field types and their format should be same in the source and the target system.

Validating the column names in the target system.

Validating Mapping document

It involves validating the mapping document to ensure all the information has been provided. The mapping document should have change log, maintain data types, length, transformation rules, etc.

Validate Constraints

It involves validating the constraints and ensuring that they are applied on the expected tables.

Data Consistency check

It involves checking the misuse of integrity constraints like Foreign Key.

The length and data type of an attribute may vary in different tables, though their definition remains same at the semantic layer.

Data Completeness Validation

It involves checking if all the data is loaded to the target system from the source system.

Counting the number of records in the source and the target systems.

Boundary value analysis.

Validating the unique values of primary keys.

Data Correctness Validation

It involves validating the values of data in the target system.

Misspelled or inaccurate data is found in table.

Null, Not Unique data is stored when you disable integrity constraint at the time of import.

Data Transform validation

It involves creating a spreadsheet of scenarios for input values and expected results and then validating with end-users.

Validating parent-child relationship in the data by creating scenarios.

Using data profiling to compare the range of values in each field.

Validating if the data types in the warehouse are same as mentioned in the data model.

Data Quality Validation

It involves performing number check, date check, precision check, data check, Null check, etc.

Example − Date format should be same for all the values.

Null Validation

It involves checking the Null values where Not Null is mentioned for that field.

Duplicate Validation

It involves validating duplicate values in the target system when data is coming from multiple columns from the source system.

Validating primary keys and other columns if there is any duplicate values as per the business requirement.

Date Validation check

Validating date field for various actions performed in ETL process.

Common test-cases to perform Date validation −

  • From_Date should not greater than To_Date

  • Format of date values should be proper.

  • Date values should not have any junk values or null values

Full Data Validation Minus Query

It involves validating full data set in the source and the target tables by using minus query.

  • You need to perform both source minus target and target minus source.

  • If the minus query returns a value, that should be considered as mismatching rows.

  • You need to match the rows in source and target using the Intersect statement.

  • The count returned by Intersect should match with the individual counts of source and target tables.

  • If the minus query returns no rows and the count intersect is less than the source count or the target table count, then the table holds duplicate rows.

Other Test Scenarios

Other Test scenarios can be to verify that the extraction process did not extract duplicate data from the source system.

The testing team will maintain a list of SQL statements that are run to validate that no duplicate data have been extracted from the source systems.

Data Cleaning

Unwanted data should be removed before loading the data to the staging area.