BigQuery - ARRAY Data Type



ARRAY vs STRUCT

Unlike a STRUCT type, which is allowed to contain data of differing types, an ARRAY data type must contain elements of the same type.

  • In programming languages like Python, an ARRAY (also known as a list) is represented with brackets: [ ].
  • STRUCT types can contain other structs (to create very nested data), an ARRAY may not contain another ARRAY.
  • However, an ARRAY can contain a STRUCT, so a developer could encounter an ARRAY with several STRUCTs embedded within.

BigQuery will not label a column as an explicit ARRAY type. Instead, it is represented with a different mode. While a regular STRING type has a "NULLABLE" mode, the ARRAY type has a "REPEATED" mode.

ARRAY vs STRUCT

STRUCT types can be selected using a dot, however ARRAY types are more limited when it comes to surface-level manipulation.

It is possible to select an ARRAY as a grouped element:

Select an ARRAY

However, selecting an element of an array using a dot or any other method is impossible, so this wouldn't work −

Select an ARRAY

The UNNEST() Function

To access names from within store_information, it is necessary to perform an extra step: UNNEST(), a function which will flatten the data so it is more accessible.

The UNNEST() function is used in the FROM clause with a comma. For context: The comma represents an implicit CROSS JOIN.

To properly access this ARRAY, use the following query −

The UNNEST() Function

It will fetch you an output table like this −

The UNNEST() Function Output

In addition to using UNNEST(), it's also possible to alias the record. The resulting alias, "wd", can then be used to access the unnested data.

Advertisements