BigQuery - Materialized View



In addition to the creation of tables and views, BigQuery also facilitates the creation of materialized views.

What is a Materialized View?

A materialized view is similar to a table in that it is a "snapshot" of data. However, the materialized view differs in that a true materialized view will update dynamicallyall without needing to run a query.

Types of Materialized Views

Broadly, there are two kinds of materialized views −

  • A view that has been saved as a table and will be updated on a recurring basis from an outside source.
  • A "true" materialized view created in BigQuery Studio.

Here is example architecture for the first kind of view −

  • SQL queries an existing view
  • Within a Python script, that view is converted to a dataframe
  • The dataframe is uploaded to BigQuery
  • The materialized view is either appended to or overwritten

Since the above outlined a multi-step process, BigQuery has streamlined the creation and maintenance of materialized views.

Creating a BigQuery Materialized View

Users can create a BigQuery materialized view by running the CREATE MATERIALIZED VIEW SQL statement followed by −

  • Project
  • Dataset
  • New mv name
  • SQL statement

Example

Here's an example in which an existing table with hypothetical sales data is materialized −

Creating a BigQuery Materialized View

Note the following limitations, per BigQuery documentation −

  • Each table is limited to 20 materialized views within a dataset
  • There can only be 100 materialized views within a project
  • There can only be 500 materialized views within an organization

Write a Simple SQL Script in BigQuery

Putting everything together now, it's time to write a simple script that will −

  • Utilize dynamic variables
  • DELETE yesterday's data
  • INSERT new data into a table
  • Use a query to select/load data

One area not covered so far is how to define and use variables within a SQL script.

In BigQuery, variable syntax is as follows −

DECLARE variable_name TYPE DEFAULT function used to create dynamic variable

As an example

DECLARE yesterday DATE DEFAULT DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)

And this is how we'll begin the following script, which deletes previous data from the Austin bike share partitioned table and inserts only the most recent data.

Write a Simple SQL Script in BigQuery

When running this in BigQuery, the SQL engine will run this in stages due to the semicolon. The final results can be seen by clicking "View results."

Write a Simple SQL Script in BigQuery

Clicking on "View results" generates this output.

Write a Simple SQL Script in BigQuery

And, finally, we can see the new rows added to the table.

New Rows Added Table
Advertisements