
- BigQuery - Home
- BigQuery - Overview
- BigQuery - Initial Setup
- BigQuery vs Local SQL Engines
- BigQuery - Google Cloud Console
- BigQuery - Google Cloud Hierarchy
- What is Dremel?
- What is BigQuery Studio?
- BigQuery - Datasets
- BigQuery - Tables
- BigQuery - Views
- BigQuery - Create Table
- BigQuery - Basic Schema Design
- BigQuery - Alter Table
- BigQuery - Copy Table
- Delete and Recover Table
- BigQuery - Populate Table
- Standard SQL vs Legacy SQL
- BigQuery - Write First Query
- BigQuery - CRUD Operations
- Partitioning & Clustering
- BigQuery - Data Types
- BigQuery - Complex Data Types
- BigQuery - STRUCT Data Type
- BigQuery - ARRAY Data Type
- BigQuery - JSON Data Type
- BigQuery - Table Metadata
- BigQuery - User-defined Functions
- Connecting to External Sources
- Integrate Scheduled Queries
- Integrate BigQuery API
- BigQuery - Integrate Airflow
- Integrate Connected Sheets
- Integrate Data Transfers
- BigQuery - Materialized View
- BigQuery - Roles & Permissions
- BigQuery - Query Optimization
- BigQuery - BI Engine
- Monitoring Usage & Performance
- BigQuery - Data Warehouse
- Challenges & Best Practices
BigQuery - User-defined Functions
One of the advantages of BigQuery is the ability to create custom logic to manipulate data. In programming languages like Python, developers can easily write and define functions that can be utilized in multiple places within a script.
Persistent User-defined Functions in BigQuery
Many SQL dialects, including BigQuery, support these functions. BigQuery calls them persistent user-defined functions. They're either UDFs (user-defined functions) or PUDFs (persistent user-defined functions) for short.
The essence of user-defined functions can be broken into two steps −
- Defining function logic
- Employing a function within a script
Defining a User-defined Function
Defining a user-defined function begins with a familiar CRUD statement: CREATE OR REPLACE.
Here, instead of CREATE OR REPLACE TABLE, it is necessary to use CREATE OR REPLACE FUNCTION followed by the AS() command.
Unlike other SQL queries that can be written within BigQuery, when creating a UDF, specifying an input field and type is required.
These inputs are defined in a similar way to a Python function −
(column_name, type)
To see these steps put together, I've created a simple temporary UDF, specified by TEMP FUNCTION that parses various URLs based on user input.

The steps to create the above temporary function are −
- CREATE TEMP FUNCTION
- Specify function name (get_sitetype)
- Specify function input and type (inp_url, STRING)
- Tell the function what type to return (STRING)
The REGEXP_CONTAINS() function searches for a match on strings that contain the URL strings provided. The NET.HOST() function extracts the host domain from the input URL string.
Applying this to the hacker news dataset (a BigQuery public dataset), we can generate an output that classifies the stored URLs into different categories of media −

Note − Temporary functions must be immediately followed by a query.