
BigQuery - Datasets
What are Datasets in BigQuery?
Datasets are entities that live within a project. Datasets act as a container for BigQuery tables as well as views, routines and machine learning models.
Tables cannot exist outside a dataset, so you must create a dataset before adding a new data source in BigQuery Studio.
In addition to attributes like a human-readable name, developers must specify a location when creating a dataset. These locations correspond to the physical locations of Google data centers around the world.
The location is either a single region or a multi-region. You pick a named region rather than a city: for instance, instead of choosing a data center in Iowa, you would specify "us-central1."
A multi-region is a larger geographic area that contains two or more regions. Creating a dataset in a multi-region lets BigQuery choose among those regions, which helps when a particular region does not have the resources to keep up with current demand. The current multi-regions are US (United States) and EU (European Union).
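The distinction between a region and a multi-region can be sketched as a small Python helper. This is only an illustration: the function name `location_type` is hypothetical, and the regular expression is an assumption about how region names such as us-central1 are usually spelled, not an official validation rule.

```python
import re

# The two multi-regions named in the text.
MULTI_REGIONS = {"US", "EU"}

# Assumed naming pattern for single regions: "<area>-<zone name><number>",
# for example "us-central1" or "europe-west4". This is an illustration only.
REGION_PATTERN = re.compile(r"[a-z]+-[a-z]+[0-9]+")


def location_type(location: str) -> str:
    """Classify a location string as 'multi-region', 'region' or 'unrecognized'."""
    name = location.strip()
    if name.upper() in MULTI_REGIONS:
        return "multi-region"
    if REGION_PATTERN.fullmatch(name.lower()):
        return "region"
    return "unrecognized"


for candidate in ("US", "us-central1", "us-central-1"):
    print(candidate, "->", location_type(candidate))
```

Note that a hyphen before the trailing number, as in "us-central-1", does not match the assumed pattern, so the helper reports it as unrecognized.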
Steps to Create a Dataset in BigQuery
To create a dataset, follow the steps given below. First, navigate to your project name and click the three dots next to it, which opens a menu containing "create dataset".

Once you click "create dataset", you'll be prompted to enter −
- A dataset ID.
- A location type (region vs. multi-region).
- A default table expiration (how many days until tables in the dataset expire).

The end result is a dataset which serves as a container for future tables, views and materialized views.
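The same three inputs can also be expressed as a SQL statement and run from the query editor. The sketch below builds a `CREATE SCHEMA` statement with the `location` and `default_table_expiration_days` options. The helper name `build_create_dataset_ddl` and the example project and dataset IDs are hypothetical, and the code only assembles the text of the statement; it does not connect to BigQuery.

```python
from typing import Optional


def build_create_dataset_ddl(
    project_id: str,
    dataset_id: str,
    location: str,
    default_table_expiration_days: Optional[float] = None,
) -> str:
    """Return a CREATE SCHEMA statement for the given dataset settings."""
    options = [f'location = "{location}"']
    # The expiration is optional, so only emit it when a value was supplied.
    if default_table_expiration_days is not None:
        options.append(
            f"default_table_expiration_days = {default_table_expiration_days}"
        )
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{project_id}.{dataset_id}`\n"
        f"OPTIONS ({', '.join(options)});"
    )


# "my-project" and "sales" are placeholder IDs, not real resources.
print(build_create_dataset_ddl("my-project", "sales", "US", 30))
```

Pasting the printed statement into BigQuery Studio should have the same effect as filling in the "create dataset" form, provided your account is allowed to create datasets in that project.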

A "Sharing" option allows developers to manage access control on a dataset and keep unauthorized users out.

BigQuery: Public Datasets
If you're new to BigQuery, and possibly to SQL in general, you probably don't yet have data of your own to store and manipulate. This is one of the advantages of using BigQuery Studio as a SQL sandbox. In addition to serverless infrastructure, BigQuery also provides terabytes of sample data that students and professionals can use to learn and refine their SQL skills.
- Published through the Google Cloud Public Dataset Program, BigQuery public datasets are stored in their own universally-accessible project: bigquery-public-data.
- Developers can query up to 1 terabyte of data per month for free; beyond that, queries are billed under the pay-per-terabyte pricing model.
- Unlike many stock datasets, the data contained within the tables is realistic, that is, "messy," and at times requires significant transformation to yield actionable insights.
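The free-tier arithmetic can be sketched in a few lines of Python. The 1 terabyte allowance comes from the text above; the function names are hypothetical, and the price per terabyte is deliberately left as a parameter because the actual rate depends on your region and on current pricing.

```python
# Monthly free query allowance, in terabytes, as stated in the text.
FREE_TIER_TB_PER_MONTH = 1.0


def billable_terabytes(scanned_tb_this_month: float) -> float:
    """Terabytes scanned beyond the free monthly allowance."""
    return max(0.0, scanned_tb_this_month - FREE_TIER_TB_PER_MONTH)


def estimated_cost(scanned_tb_this_month: float, price_per_tb: float) -> float:
    """Estimated monthly query cost; price_per_tb is a value you supply."""
    return billable_terabytes(scanned_tb_this_month) * price_per_tb


# Staying under the allowance costs nothing, whatever the price is.
print(billable_terabytes(0.4))
# Scanning 3.5 TB leaves 2.5 TB billable.
print(billable_terabytes(3.5))
```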
BigQuery also provides several sample tables, separate from the larger public datasets, which can be found in the samples dataset of the bigquery-public-data project −
- gsod
- github_nested
- github_timeline
- natality
- shakespeare
- trigrams
- wikipedia
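To try one of these tables, you reference it by project, dataset and table name. The sketch below assembles a query against the shakespeare table, which has `word`, `word_count`, `corpus` and `corpus_date` columns. The helper names `table_ref` and `top_corpora_query` are hypothetical, and the code only produces the SQL text to paste into the query editor.

```python
PUBLIC_PROJECT = "bigquery-public-data"
SAMPLES_DATASET = "samples"


def table_ref(table: str) -> str:
    """Backtick-quoted, fully qualified name of a sample table."""
    return f"`{PUBLIC_PROJECT}.{SAMPLES_DATASET}.{table}`"


def top_corpora_query(limit: int = 5) -> str:
    """SQL that lists the works with the highest total word count."""
    return (
        "SELECT corpus, SUM(word_count) AS total_words\n"
        f"FROM {table_ref('shakespeare')}\n"
        "GROUP BY corpus\n"
        "ORDER BY total_words DESC\n"
        f"LIMIT {limit};"
    )


print(top_corpora_query())
```

The backticks are needed because the project name contains hyphens.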
Perhaps the most significant advantage of BigQuery public datasets is that the data is ingested from real sources like the BBC, Hacker News and Johns Hopkins University.