What is a Three-tier Data Warehouse Architecture?

Data MiningDatabaseData Structure

Data Warehouses usually have a three-level (tier) architecture that involves −

  • The bottom tier is a warehouse database server that is relatively always a relational database system. Back-end tools and utilities are used to feed records into the bottom tier from operational databases or other external sources (including user profile data supported by external consultants).

These tools and utilities implement data extraction, cleaning, and transformation (e.g., to merge the same data from multiple sources into a unified format), and load and refresh functions to update the data warehouse. The data are extracted using application program interfaces referred to as gateways.

A gateway is supported by the underlying DBMS and allows client programs to generate SQL code to be executed at a server. Examples of gateways involve ODBC (Open Database Connection) and OLEDB (Open-Linking and Embedding for Databases) by Microsoft and JDBC (Java Database Connection). This tier also includes a metadata repository, which saves data about the data warehouse and its contents.

  • The middle tier is an OLAP server that is generally executed using either a relational OLAP (ROLAP) model, that is, an extended relational DBMS that maps operations on multidimensional data to standard relational operations, or a multidimensional OLAP (MOLAP) model, namely, a special-purpose server that directly performs multidimensional data and operations.

  • The top tier is a front-end client layer. It includes query and reporting tools, analysis tools, and/or data mining tools (e.g., trend analysis, prediction, etc.).

From the architecture point of view, there are three data warehouse models such as the enterprise warehouse, the data mart, and the virtual warehouse.

Enterprise warehouse − An enterprise warehouse collects all of the data about subjects spanning the whole organization. It supports corporate-wide data integration, generally from one or more operational systems or external data providers, and is cross-functional in scope.

It generally includes detailed data and summarized data and can range in size from a few gigabytes to thousands of gigabytes, terabytes, etc. An enterprise data warehouse can be performed on traditional mainframes, computer super servers, or parallel architecture platforms. It needed extensive business modeling and can take years to design and construct.

Datamart − A data mart includes a subset of corporate-wide data that is of value to a definite team of users. The scope is limited to definitely selected subjects. For instance, a marketing data mart can confine its subjects to the user, item, and sales. The data included in data marts tend to be summarized.

Virtual warehouse − A virtual warehouse is a collection of views over operational databases. For efficient query processing, only some of the possible summary views can be materialized. A virtual warehouse is simply to build but needed excess capacity on operational database servers.

Published on 22-Nov-2021 07:41:42