- SAP HANA Tutorial
- SAP HANA - Home
- SAP HANA Introduction
- SAP HANA - Overview
- In-Memory Computing Engine
- SAP HANA - Studio
- Studio Administration View
- SAP HANA - System Monitor
- SAP HANA - Information Modeler
- SAP HANA - Core Architecture
- SAP HANA Modeling
- SAP HANA - Modeling
- SAP HANA - Data Warehouse
- SAP HANA - Tables
- SAP HANA - Packages
- SAP HANA - Attribute View
- SAP HANA - Analytic View
- SAP HANA - Calculation View
- SAP HANA - Analytic Privileges
- SAP HANA - Information Composer
- SAP HANA - Export and Import
- SAP HANA Reporting
- SAP HANA - Reporting View
- Bi 4.0 Connectivity to HANA Views
- SAP HANA - Crystal Reports
- SAP HANA - Excel Integration
- SAP HANA Security
- SAP HANA - Security Overview
- User Administration & Management
- SAP HANA - Authentications
- SAP HANA - Authorization methods
- SAP HANA - License Management
- SAP HANA - Auditing
- SAP HANA Data Replication
- SAP HANA - Data Replication Overview
- SAP HANA - ETL Based Replication
- SAP HANA - Log Based Replication
- SAP HANA - DXC Method
- SAP HANA - CTL Method
- SAP HANA - MDX Provider
- SAP HANA Monitoring
- SAP HANA - Monitoring and Alerting
- SAP HANA - Persistent Layer
- SAP HANA - Backup & Recovery
- SAP HANA - High Availability
- SAP HANA - Log Configuration
- SAP HANA SQL
- SAP HANA - SQL Overview
- SAP HANA - Data Types
- SAP HANA - SQL Operators
- SAP HANA - SQL Functions
- SAP HANA - SQL Expressions
- SAP HANA - SQL Stored Procedures
- SAP HANA - SQL Sequences
- SAP HANA - SQL Triggers
- SAP HANA - SQL Synonym
- SAP HANA - SQL Explain Plans
- SAP HANA - SQL Data Profiling
- SAP HANA - SQL Script
- SAP HANA Useful Resources
- SAP HANA - Questions and Answers
- SAP HANA - Quick Guide
- SAP HANA - Useful Resources
- SAP HANA - Discussion
SAP HANA - SQL Data Profiling
SQL Data Profiling task is used to understand and analyze data from multiple data sources. It is used to remove incorrect, incomplete data and prevent data quality problems before they are loaded in Data warehouse.
Here are the benefits of SQL Data Profiling tasks −
It helps is analyzing source data more effectively.
It helps in understanding the source data better.
It remove incorrect, incomplete data and improve data quality before it is loaded into Data warehouse.
It is used with Extraction, Transformation and Loading task.
The Data Profiling task checks profiles that helps to understand a data source and identify problems in the data that has to be fixed.
You can use the Data Profiling task inside an Integration Services package to profile data that is stored in SQL Server and to identify potential problems with data quality.
Note − Data Profiling Task works only with SQL Server data sources and does not support any other file based or third party data sources.
Access Requirement
To run a package contains Data Profiling task, user account must have read/write permissions with CREATE TABLE permissions on the tempdb database.
Data Profiler Viewer
Data Profile Viewer is used to review the profiler output. The Data Profile Viewer also supports drilldown capability to help you understand data quality issues that are identified in the profile output. This drill down capability sends live queries to the original data source.
Data Profiling Task Setup and Reviewing
Setting up the Data Profiling Task
It involves execution of a package that contains Data Profiling task to compute the profiles. The task saves the output in XML format to a file or a package variable.
Reviewing the Profiles
To view the data profiles, send the output to a file and then use the Data Profile Viewer. This viewer is a stand-alone utility that displays the profile output in both summary and detail format with optional drilldown capability.
Data Profiling − Configuration Options
The Data Profiling task has these convenient configuration options −
Wildcard columns
While configuring a profile request, the task accepts ‘*’ wildcard in place of a column name. This simplifies the configuration and makes it easier to discover the characteristics of unfamiliar data. When the task runs, the task profiles every column that has an appropriate data type.
Quick Profile
You can select Quick Profile to configure the task quickly. A Quick Profile profiles a table or view by using all the default profiles and settings.
The Data Profiling Task can compute eight different data profiles. Five of these profiles can check individual columns and the remaining three analyze- multiple columns or relationships between columns.
Data Profiling − Task Outputs
The Data Profiling task outputs the selected profiles into XML format that is structured like DataProfile.xsd schema.
You can save a local copy of the schema and view the local copy of the schema in Microsoft Visual Studio or another schema editor, in an XML editor or in a text editor such as Notepad.