How to use Boto3 to update the scheduler of a crawler in AWS Glue Data Catalog

In this article, we will see how to update the scheduler of a crawler in AWS Glue Data Catalog using the boto3 library in Python.

Problem Statement

Use the boto3 library in Python to update the scheduler of an existing crawler in AWS Glue.

Prerequisites

Before implementing the solution, ensure you have:

AWS credentials configured (via AWS CLI, IAM roles, or environment variables)
boto3 library installed: pip install boto3
Proper IAM permissions for Glue operations

Approach to Update Crawler Schedule

Follow these steps to update a crawler's scheduler:

Step 1: Import boto3 and botocore exceptions to handle errors
Step 2: Define required parameters: crawler_name and scheduler
Step 3: The scheduler must be a string of the form cron(cron_expression), where the expression has six fields: minutes, hours, day of month, month, day of week, and year. For example, cron(15 12 * * ? *) runs the crawler daily at 12:15 UTC
Step 4: Create an AWS session and a Glue client using boto3
Step 5: Call the update_crawler_schedule() method with the crawler name and the schedule
Step 6: Handle exceptions appropriately
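
Before calling AWS, it can help to catch malformed schedule strings locally. The following is a minimal sketch, not part of boto3: the helper name is_valid_schedule_format is hypothetical, and it only checks the overall shape (a cron(...) wrapper, six fields, and a ? in either the day-of-month or day-of-week field). It does not validate the individual field values, so Glue may still reject a string that passes this check.

```python
import re

# Matches 'cron(' + six whitespace-separated fields + ')'.
_CRON_PATTERN = re.compile(r"^cron\((\S+(?:\s+\S+){5})\)$")


def is_valid_schedule_format(schedule):
    """Hypothetical helper: shallow shape check for a Glue schedule string."""
    match = _CRON_PATTERN.match(schedule.strip())
    if match is None:
        return False
    # Field order: minutes, hours, day of month, month, day of week, year.
    _, _, day_of_month, _, day_of_week, _ = match.group(1).split()
    # AWS cron expressions do not allow both day fields to carry a value,
    # so one of them is expected to be '?'.
    return "?" in (day_of_month, day_of_week)


print(is_valid_schedule_format("cron(15 12 * * ? *)"))
print(is_valid_schedule_format("15 12 * * ? *"))
```

The first call passes the check and the second fails it because the cron(...) wrapper is missing.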

Example Implementation

Here's a complete example that updates a crawler's scheduler:

import boto3
from botocore.exceptions import ClientError

def update_scheduler_of_a_crawler(crawler_name, scheduler):
    """
    Update the schedule of an AWS Glue crawler
    
    Args:
        crawler_name (str): Name of the crawler to update
        scheduler (str): Cron expression in format 'cron(expression)'
    
    Returns:
        dict: Response from AWS Glue service
    """
    session = boto3.session.Session()
    glue_client = session.client('glue')
    
    try:
        response = glue_client.update_crawler_schedule(
            CrawlerName=crawler_name,
            Schedule=scheduler
        )
        return response
    except ClientError as e:
        # Chain the original error so its traceback and error code are preserved
        raise Exception(f"boto3 client error in update_scheduler_of_a_crawler: {e}") from e
    except Exception as e:
        raise Exception(f"Unexpected error in update_scheduler_of_a_crawler: {e}") from e

# Example usage
crawler_name = "Data Dimension"
schedule = "cron(15 12 * * ? *)"  # Daily at 12:15 UTC

result = update_scheduler_of_a_crawler(crawler_name, schedule)
print(result)

Expected Output

The function returns a response with metadata confirming the schedule update:

{
    'ResponseMetadata': {
        'RequestId': '73e50130-*****************8e',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'date': 'Sun, 28 Mar 2021 07:26:55 GMT',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '2',
            'connection': 'keep-alive',
            'x-amzn-requestid': '73e50130-***************8e'
        },
        'RetryAttempts': 0
    }
}
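
Because the response carries only metadata, a script can confirm the update by inspecting HTTPStatusCode. This is a small sketch under that assumption; the helper name schedule_update_succeeded is hypothetical, and the sample dictionary is a trimmed copy of the output shown above.

```python
def schedule_update_succeeded(response):
    """Hypothetical helper: True if the response reports HTTP 200."""
    metadata = response.get("ResponseMetadata", {})
    return metadata.get("HTTPStatusCode") == 200


# Trimmed copy of the response shown above, used here as sample input.
sample_response = {
    "ResponseMetadata": {
        "HTTPStatusCode": 200,
        "RetryAttempts": 0,
    }
}

print(schedule_update_succeeded(sample_response))
```

In practice a failed call normally raises ClientError rather than returning a non-200 response, so this check is a secondary safeguard, not a replacement for exception handling.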

Cron Expression Examples

Schedule	Cron Expression	Description
Daily at 2:30 AM	`cron(30 2 * * ? *)`	Runs every day at 2:30 UTC
Weekly on Sunday	`cron(0 6 ? * SUN *)`	Runs every Sunday at 6:00 UTC
Monthly on 1st	`cron(0 9 1 * ? *)`	Runs on 1st of every month at 9:00 UTC

Error Handling

Common errors you might encounter:

EntityNotFoundException: The specified crawler doesn't exist
InvalidInputException: Invalid cron expression format
AccessDeniedException: Insufficient IAM permissions
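
One way to tell these cases apart is to read the error code from ClientError.response instead of parsing the exception message. The sketch below is an assumption-laden illustration: ERROR_HINTS, describe_client_error, and update_schedule_with_hints are hypothetical names, and the codes listed are the ones Glue is expected to return for these situations. boto3 is imported inside the function so the mapping logic can be tried without AWS access.

```python
# Hypothetical mapping from Glue error codes to short hints.
ERROR_HINTS = {
    "EntityNotFoundException": "the specified crawler doesn't exist",
    "InvalidInputException": "the schedule is not a valid cron expression",
    "AccessDeniedException": "the caller lacks the required IAM permissions",
}


def describe_client_error(error_response):
    """Build a message from the dict exposed as ClientError.response."""
    error = error_response.get("Error", {})
    code = error.get("Code", "Unknown")
    hint = ERROR_HINTS.get(code, "unexpected error")
    return f"{code}: {hint} ({error.get('Message', '')})"


def update_schedule_with_hints(crawler_name, schedule):
    """Update the schedule and re-raise client errors with a hint attached."""
    import boto3
    from botocore.exceptions import ClientError

    glue_client = boto3.client("glue")
    try:
        return glue_client.update_crawler_schedule(
            CrawlerName=crawler_name, Schedule=schedule
        )
    except ClientError as e:
        raise RuntimeError(describe_client_error(e.response)) from e


# Sample error payload in the shape botocore uses for ClientError.response.
sample = {"Error": {"Code": "EntityNotFoundException", "Message": "Crawler not found"}}
print(describe_client_error(sample))
```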

Conclusion

Updating a crawler's schedule in AWS Glue is straightforward using boto3's update_crawler_schedule() method. Remember to use proper cron expression format and handle exceptions appropriately for robust automation scripts.

Ashish Anand

Updated on: 2026-03-25T18:51:25+05:30
