What is a Snowflake Schema?


The snowflake schema is a variant of the star schema model in which some dimension tables are normalized by splitting their records into additional tables. The resulting schema diagram has a shape that resembles a snowflake.

The snowflake schema is an extension of the star schema in which each point of the star breaks out into more points. It takes its name from the snowflake-like shape of its diagram. Snowflaking is a method of normalizing the dimension tables in a star schema. When all the dimension tables are completely normalized, the resulting structure resembles a snowflake with the fact table in the middle.

Snowflaking can improve the performance of specific queries. The schema is arranged with each fact surrounded by its related dimensions, and those dimensions are in turn associated with other dimensions, branching out into a snowflake pattern.

The snowflake schema includes one fact table connected to several dimension tables, which can in turn be connected to other dimension tables through many-to-one relationships. Tables in a snowflake schema are usually normalized to the third normal form. Each dimension table represents exactly one level in a hierarchy.
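As a minimal sketch of this structure, the following Python script uses the standard-library `sqlite3` module to build a tiny snowflake schema in memory. The table and column names (`fact_sales`, `dim_product`, `dim_category`) and the sample rows are assumptions made for illustration only. The fact table points to a product dimension, which in turn points to a category dimension through a many-to-one relationship.

```python
import sqlite3


def build_snowflake(conn):
    """Create a tiny, hypothetical snowflake schema and load sample rows."""
    conn.executescript("""
        -- Outer dimension: one level of the product hierarchy.
        CREATE TABLE dim_category (
            category_id   INTEGER PRIMARY KEY,
            category_name TEXT NOT NULL
        );
        -- Inner dimension: many products map to one category.
        CREATE TABLE dim_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category_id  INTEGER NOT NULL REFERENCES dim_category(category_id)
        );
        -- Fact table in the middle: many sales map to one product.
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            amount     REAL NOT NULL
        );
        INSERT INTO dim_category VALUES (1, 'Beverages'), (2, 'Snacks');
        INSERT INTO dim_product VALUES (10, 'Tea', 1), (11, 'Coffee', 1), (12, 'Chips', 2);
        INSERT INTO fact_sales VALUES
            (100, 10, 5.0), (101, 11, 7.5), (102, 12, 3.0), (103, 10, 5.0);
    """)


def sales_by_category(conn):
    """Total sales per category; reaching the category takes two joins."""
    return conn.execute("""
        SELECT c.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_snowflake(conn)
print(sales_by_category(conn))  # [('Beverages', 17.5), ('Snacks', 3.0)]
```

Each table here holds one level of the hierarchy, so a query that groups by category has to walk from the fact table through the product table to the category table.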

A star schema stores all attributes for a dimension in one denormalized table. This requires more disk space than a more normalized snowflake schema. Snowflaking normalizes the dimension by moving attributes with low cardinality into separate dimension tables that are linked to the main dimension table by foreign keys. Snowflaking purely to save disk space is not recommended, because it can adversely affect query performance.
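The normalization step described above can be sketched in plain Python. The helper `snowflake_dimension` and the sample product rows are hypothetical and exist only for illustration: the function moves one low-cardinality attribute out of a denormalized dimension into its own lookup table and leaves a foreign key behind.

```python
def snowflake_dimension(rows, attribute, key_name):
    """Split one low-cardinality attribute out of a denormalized dimension.

    Returns (normalized_rows, lookup_table). Each normalized row carries a
    foreign key (key_name) in place of the repeated attribute value.
    """
    lookup = {}       # attribute value -> surrogate key
    normalized = []
    for row in rows:
        value = row[attribute]
        # Reuse the existing key, or assign the next one for a new value.
        key = lookup.setdefault(value, len(lookup) + 1)
        slim = {k: v for k, v in row.items() if k != attribute}
        slim[key_name] = key
        normalized.append(slim)
    lookup_table = [{key_name: k, attribute: v} for v, k in lookup.items()]
    return normalized, lookup_table


# Denormalized (star-style) dimension: the category name repeats per product.
star_products = [
    {"product_id": 10, "product_name": "Tea", "category_name": "Beverages"},
    {"product_id": 11, "product_name": "Coffee", "category_name": "Beverages"},
    {"product_id": 12, "product_name": "Chips", "category_name": "Snacks"},
]

products, categories = snowflake_dimension(
    star_products, "category_name", "category_id"
)
print(categories)
print(products)
```

In this sketch the string "Beverages" is stored once in the lookup table instead of once per product, which is where the disk-space saving comes from. The cost is that reading a product's category name now requires a join.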

In a snowflake schema, tables are normalized to remove redundancy, and each dimension table is broken into several smaller dimension tables.

A snowflake schema is designed for flexible querying across more complex dimensions and relationships. It is suitable for many-to-many and one-to-many relationships among dimension levels.

Advantages of Snowflake Schema

  • The main advantage of the snowflake schema is improved performance for some queries, because less disk storage is needed and the lookup tables being joined are smaller.

  • It supports greater scalability in the relationships among dimension levels and components.

  • Redundancy in the dimension data is removed, so the data is simpler to maintain.

Disadvantages of Snowflake Schema

  • The main disadvantage of the snowflake schema is the additional maintenance effort needed because of the increasing number of lookup tables.

  • Queries are more complex and therefore harder to write and understand.

  • More tables mean more joins, which increases query execution time.
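To illustrate the point about joins, here is a small sketch using Python's standard-library `sqlite3` module. It stores the same hypothetical product data once in a star-style dimension and once in a snowflaked pair of tables; all table names, column names, and rows are assumptions for illustration. Both queries return the same totals, but the snowflake version needs one more join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star version: the category name lives in the product dimension.
    CREATE TABLE star_product (
        product_id    INTEGER PRIMARY KEY,
        product_name  TEXT,
        category_name TEXT
    );
    -- Snowflake version: the category is moved to its own lookup table.
    CREATE TABLE snow_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE snow_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id  INTEGER REFERENCES snow_category(category_id)
    );
    CREATE TABLE fact_sales (product_id INTEGER, amount REAL);

    INSERT INTO star_product VALUES
        (10, 'Tea', 'Beverages'), (11, 'Coffee', 'Beverages'), (12, 'Chips', 'Snacks');
    INSERT INTO snow_category VALUES (1, 'Beverages'), (2, 'Snacks');
    INSERT INTO snow_product VALUES (10, 'Tea', 1), (11, 'Coffee', 1), (12, 'Chips', 2);
    INSERT INTO fact_sales VALUES (10, 5.0), (11, 7.5), (12, 3.0);
""")

STAR_QUERY = """
    SELECT p.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN star_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
"""

SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN snow_product  AS p ON p.product_id  = f.product_id
    JOIN snow_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
"""

star_result = conn.execute(STAR_QUERY).fetchall()
snow_result = conn.execute(SNOWFLAKE_QUERY).fetchall()

# Same answer from both designs, with a different number of joins.
print(star_result == snow_result)
print(STAR_QUERY.upper().count("JOIN"), SNOWFLAKE_QUERY.upper().count("JOIN"))
```

On a dataset this small the extra join costs nothing measurable. The sketch only shows how the query text grows as dimensions are snowflaked, not how much slower a real warehouse query would be.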

Updated on 23-Nov-2021 10:20:56