
- BigQuery - Home
- BigQuery - Overview
- BigQuery - Initial Setup
- BigQuery vs Local SQL Engines
- BigQuery - Google Cloud Console
- BigQuery - Google Cloud Hierarchy
- What is Dremel?
- What is BigQuery Studio?
- BigQuery - Datasets
- BigQuery - Tables
- BigQuery - Views
- BigQuery - Create Table
- BigQuery - Basic Schema Design
- BigQuery - Alter Table
- BigQuery - Copy Table
- Delete and Recover Table
- BigQuery - Populate Table
- Standard SQL vs Legacy SQL
- BigQuery - Write First Query
- BigQuery - CRUD Operations
- Partitioning & Clustering
- BigQuery - Data Types
- BigQuery - Complex Data Types
- BigQuery - STRUCT Data Type
- BigQuery - ARRAY Data Type
- BigQuery - JSON Data Type
- BigQuery - Table Metadata
- BigQuery - User-defined Functions
- Connecting to External Sources
- Integrate Scheduled Queries
- Integrate BigQuery API
- BigQuery - Integrate Airflow
- Integrate Connected Sheets
- Integrate Data Transfers
- BigQuery - Materialized View
- BigQuery - Roles & Permissions
- BigQuery - Query Optimization
- BigQuery - BI Engine
- Monitoring Usage & Performance
- BigQuery - Data Warehouse
- Challenges & Best Practices
BigQuery - Complex Data Types
BigQuery, in addition to supporting "regular" data types like STRING, INTEGER and BOOLEAN, also provides support for so-called complex data. Often this is also referred to as nested data, since the data does not fit into a conventionally flat table and must live in a subset of a column.
Complex Data Structures Are Common
Support for nested schemas allows for more streamlined loading processes. Although Google Cloud lists various tutorials regarding the following data types as "advanced", nested data is very common.
Knowing how to flatten and work with these data types is a marketable skill for any SQL-oriented developer.
The reason these data structures are common are because source data, like a JSON output from an API, often returns data in this format −
[{data: "id": '125467", "name": "Acme Inc.", "locations": {"store_no": 4, "employee_count": 15}}]
- Notice how both "id" and "name" are on the same level of reference. These could each be accessed using "data" as a key.
- However, to get fields like "store_no" and "employee_count", it would be necessary to not only access the data key, but also to flatten the "locations" array.
This is where BigQuery's complex data type support is helpful. Instead of a data engineer needing to write a script that would iterate through and unnest "locations", this data could be loaded into a BigQuery table as is.
Complex Data Structures in BigQuery
BigQuery supports the following three types of complex or nested data −
- STRUCT
- ARRAY
- JSON
Strategies for handling these types will be explained in the next chapters.