Overview of Data Warehousing and OLAP


Data Warehouse

A data warehouse is like a big library where we keep a lot of information from different places. It analyzes and understands the information easily. So you can make good decisions based on these facts. You have all the required information that you need in one place. We organize the information so it's easy to find and use. It takes information from different places and put it all together in one place, hence it is easier to understand.

Characteristics of Data Warehouse

Data Warehouse has the following characteristics.

Subject-oriented

A data warehouse focuses on a specific topic like sales, marketing, or distribution. It is designed to provide information about a particular theme rather than the day-to-day operations of an organization.

Integrated

A data warehouse combines data from different sources. These sources are mainframes and relational databases, into a single, reliable format. The data must be organized and structured in a way that allows for effective analysis.

Time-variant

Data in a data warehouse is maintained over time, in weekly/monthly/annual intervals. So you can do historical analysis and the ability to track changes over time.

Non-volatile

Data in a data warehouse is permanent. Data cannot be deleted or modified once it's stored. So you can do historical analysis and ensure that the data is always available in its original state.

By understanding these characteristics, organizations can use data warehouses to make better decisions by analyzing large amounts of data from different sources in a consistent and reliable way.

Data warehousing has some advantages and disadvantages.

Advantages

  • Makes data easier to understand

  • Continuous updating

  • Accessibility

Disadvantages

  • Accumulation of irrelevant data

  • Data loss and erasure

  • Data cleansing and transformation

Functions of Data warehouse

A data warehouse is a collection of data that is organized to provide various functions for managing and analyzing data. Some of the important functions of a data warehouse are −

  • Data Consolidation

  • Data Cleaning

  • Data Integration

  • Data Storage

  • Data Transformation

  • Data Analysis

  • Data Reporting

  • Data Mining

  • Performance Optimization

These functions enable organizations to manage and analyze large amounts of data from different sources, and make informed decisions based on reliable and accurate information.

Online Analytical Processing Server (OLAP)

Online Analytical Processing Server (OLAP) is a software. Users can analyze information from many different databases all at once. It uses a multidimensional data model where users can ask questions based on multiple dimensions at the same time. For example, a user could ask for sales data from Delhi in the year 2018. OLAP databases are split up into cubes, which are also called hyper-cubes.

OLAP operations

These are used to analyze data in an OLAP cube. There are five basic operations:

Drill down

This makes the data more detailed by moving down the concept hierarchy or adding a new dimension. For example, in a cube showing sales data by Quarter, drilling down would show sales data by Month.

Roll up

This makes the data less detailed by climbing up the concept hierarchy or reducing dimensions. For example, in a cube showing sales data by City, rolling up would show sales data by Country.

Dice

This selects a sub-cube by choosing two or more dimensions and criteria. For example, in a cube showing sales data by Location, Time, and Item, dicing could select sales data for Delhi or Kolkata, in Q1 or Q2, for Cars or Buses.

Slice

This selects a single dimension and creates a new sub-cube. For example, in a cube showing sales data by Location, Time, and Item, slicing by Time would create a new sub-cube showing sales data for Q1.

Pivot

This rotates the current view to get a new representation. For example, after slicing by Time, pivoting could show the same data but with Location and Item as rows instead of columns

Comparison between Data Warehousing and OLAP

Feature

Data Warehousing

OLAP

Definition

A process of collecting, storing, and managing data from various sources to provide meaningful business insights

A technology that allows users to analyze information from multiple database systems at the same time based on the multi-dimensional data model

Purpose

To make data accessible and understandable for business users

To provide quick and interactive analysis of data from multiple sources

Data structure

Relational database

Multidimensional data model

Data source

Multiple data sources

Multiple data sources

Data type

Historical data

Current and historical data

Data processing

Batch processing

Real-time processing

Operations

Data cleaning, consolidation, integration, transformation, analysis, and reporting

Drill-down, roll-up, slice, dice, and pivot

Cube creation

Not applicable

Cubes are created to support fast and efficient analysis

Query performance

Slower query performance due to complex querying and data processing

Faster query performance due to pre-aggregation and indexing

User type

Business users and data analysts

Business users and data analysts

Use case

Decision-making and strategic planning

Real-time analysis and interactive reporting

Conclusion

In conclusion, both data warehousing and OLAP serve different purposes in managing and analyzing data. Data warehousing focuses on collecting and organizing data from various sources into a single, reliable format to provide historical insights for decision-making and strategic planning. On the other hand, OLAP allows for quick and interactive analysis of current and historical data from multiple database systems based on a multidimensional data model. OLAP uses operations such as drill-down, roll-up, slice, dice, and pivot to analyze data efficiently. While data warehousing may have slower query performance due to complex querying and data processing, OLAP offers faster query performance due to pre-aggregation and indexing. Both technologies are beneficial for business users and data analysts in making informed decisions based on reliable and accurate information.

Updated on: 18-May-2023

2K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements