BigQuery - STRUCT Data Type



STRUCTs and ARRAYs are how developers store nested data within BigQuery's columnar structure.

What is a Struct?

A STRUCT is a collection of fields with a specified type, which is required, and a field name, which is optional. Notably, STRUCT types, unlike ARRAYs, can contain a mixture of data types.

To better understand the STRUCT type, take another look at the example from the previous chapter, now with a few changes.

"locations": [
   {"store_no": 4, "employee_count": 15, "store_name": "New York 5th Ave"}, 
   {"store_no": 5, "employee_count": 30, "store_name": "New York Lower Manh"}
]

While previously "locations" was a dict (expressed in JSON) of the same type, now it contains two STRUCTs which have a type of <INTEGER, INTEGER, STRING>.

  • Despite supporting a STRUCT type, BigQuery does not have an explicit label STRUCT available during the table creation stage.
  • Instead, a STRUCT is indicated as a RECORD with a NULLABLE mode.

Note − Think of a STRUCT as more of a container than a dedicated data type.

When defined in a schema, the elements inside the STRUCT will be represented and selected with a "." Here, the schema would be −

{"locations", "RECORD", "NULLABLE"},
{"locations.store_no", "INTEGER", "NULLABLE},
{"locations.employee_count", "INTEGER", "NULLABLE"},
{"locations.store_name", "STRING", "NULLABLE"}

dot Notation

To select a STRUCT element, it is necessary to use the dot notation in the FROM clause −

dot Notation

When the query gets executed, you may get an output like this −

dot Notation
Advertisements