BigQuery - Google Cloud Hierarchy



Before proceeding, it's important to grasp the fundamental concepts and vocabulary relating to BigQuery and its associated processes.

First, it's important to understand that even though cloud computing offers nearly limitless processing capabilities, BigQuery users will encounter issues if they are required to do the following activities −

  • Performing a computationally-taxing SQL operation like a cross join or cartesian join.
  • Attempting to run a large query without specifying a destination table.
  • Running a large query at times of peak usage (if using BigQuery as an enterprise user).
  • On-demand or "ad hoc" queries can and will create gridlock, especially if competing with scheduled processes for execution slots.

Google Cloud Hierarchy

If you anticipate creating and populating data sources within BigQuery, it is important to note the Google Cloud hierarchy −

  • Organization
  • Project
  • Dataset
  • Table

1. Organization Layer

Unless you are an account owner, executive or decision maker, you're unlikely to have to worry about the organization layer. Think of it as the entity that contains the other elements you'll encounter when navigating BigQuery Studio and writing SQL queries within the SQL environment.

2. Multiple Projects Inside Google Cloud Organization

Any Google Cloud organization can have multiple projects. Sometimes companies or enterprise users (we'll purposefully avoid the term "organization" here to avoid confusion) create different projects to separate staging and production environments.

Other times, these high-level users create distinct projects to have more control over potentially sensitive data like personally identifiable information (PII) and confidential revenue information.

In either case, when you begin using BigQuery, you'll create or receive permissions to access BigQuery as a user with specific permissions and role scopes.

3. Datasets and Tables within Project

Within the project, the most important entities to remember are datasets and tables. To clarify, a dataset contains a table or multiple tables. In order to maintain accuracy in technical discussions, do your best to avoid using these terms interchangeably.

Other elements you'll see within a dataset include −

  • Routines
  • Models
  • Views

These additional data elements will be discussed in more depth in the following chapters.

Advertisements