Data Warehousing - Process Managers
Data Warehouse Load Manager
This Component performs the operations required to extract and load process.
The size and complexity of load manager varies between specific solutions from data warehouse to data warehouse.
Load Manager Architecture
The load manager does the following functions.
Extract the data from source system.
Fast Load the extracted data into temporary data store.
Perform simple transformations into structure similar to the one in the data warehouse.
Extract Data from Source
The data is extracted from the operational databases or the external information providers. Gateways is the application programs that are used to extract data. It is supported by underlying DBMS and allows client program to generate SQL to be executed at a server. Open Database Connection( ODBC), Java Database Connection (JDBC), are examples of gateway.
In order to minimize the total load window the data need to be loaded into the warehouse in the fastest possible time.
The transformations affects the speed of data processing.
It is more effective to load the data into relational database prior to applying transformations and checks.
Gateway technology proves to be not suitable, since they tend not be performant when large data volumes are involved.
While loading it may be required to perform simple transformations. After this has been completed we are in position to do the complex checks. Suppose we are loading the EPOS sales transaction we need to perform the following checks.
Strip out all the columns that are not required within the warehouse.
Convert all the values to required data types.
Warehouse manager is responsible for the warehouse management process.
The warehouse manager consist of third party system software, C programs and shell scripts.
The size and complexity of warehouse manager varies between specific solutions.
Warehouse Manager Architecture
The warehouse manager includes the following.
The Controlling process
Stored procedures or C with SQL
Operations Performed by Warehouse Manager
Warehouse manager analyses the data to perform consistency and referential integrity checks.
Creates the indexes, business views, partition views against the base data.
Generates the new aggregations and also updates the existing aggregation
Generates the normalizations.
Warehouse manager Warehouse manager transforms and merge the source data into the temporary store into the published data warehouse.
Backup the data in the data warehouse.
Warehouse Manager archives the data that has reached the end of its captured life.
Note: Warehouse Manager also analyses query profiles to determine index and aggregations are appropriate.
Query Manager is responsible for directing the queries to the suitable tables.
By directing the queries to appropriate table the query request and response process is speed up.
Query Manager is responsible for scheduling the execution of the queries posed by the user.
Query Manager Architecture
Query Manager includes the following.
The query redirection via C tool or RDBMS.
Query Management tool.
Query Scheduling via C tool or RDBMS.
Query Scheduling via third party Software.
Operations Performed by Query Manager
Query Manager direct to the appropriate tables.
Query Manager schedule the execution of the queries posed by the end user.
Query Manager stores query profiles to allow the warehouse manager to determine which indexes and aggregations are appropriate.