How to use Boto3 to update the scheduler of a crawler in AWS Glue Data Catalog

In this article, we will see how to update the scheduler of a crawler in AWS Glue Data Catalog using the boto3 library in Python.

Problem Statement

Use the boto3 library in Python to update the schedule of an existing crawler in AWS Glue.

Prerequisites

Before implementing the solution, ensure you have:

  • AWS credentials configured (via AWS CLI, IAM roles, or environment variables)

  • boto3 library installed: pip install boto3

  • IAM permissions for the Glue operation you call (here, glue:UpdateCrawlerSchedule)

Approach to Update Crawler Schedule

Follow these steps to update a crawler's scheduler:

  • Step 1: Import boto3 and botocore exceptions to handle errors

  • Step 2: Define required parameters: crawler_name and scheduler

  • Step 3: The scheduler format should be cron(cron_expression). For example, cron(15 12 * * ? *) runs the crawler daily at 12:15 UTC

  • Step 4: Create an AWS session and Glue client using boto3

  • Step 5: Call the update_crawler_schedule() method with the crawler name and the schedule

  • Step 6: Handle exceptions appropriately
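Before calling AWS, it can help to check locally that the schedule string has the cron(...) shape described in Step 3. The following is a minimal sketch, not part of boto3: the helper name looks_like_glue_cron and the regular expression are assumptions made for illustration, and the check covers only the outer format (the cron( ) wrapper and six whitespace-separated fields), not whether each field value is valid.

```python
import re

# Hypothetical helper, not part of boto3. It checks only the outer shape:
# "cron(" + six whitespace-separated fields + ")". It does not validate
# the individual field values; AWS Glue remains the final authority.
_CRON_SHAPE = re.compile(r"^cron\(\S+(?:\s+\S+){5}\)$")


def looks_like_glue_cron(schedule: str) -> bool:
    """Return True if the string has the cron(<six fields>) shape."""
    return _CRON_SHAPE.match(schedule.strip()) is not None


print(looks_like_glue_cron("cron(15 12 * * ? *)"))  # wrapped, six fields
print(looks_like_glue_cron("15 12 * * ? *"))        # missing cron( ) wrapper
```

A check like this only catches obvious formatting slips early; a string that passes can still be rejected by the service.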

Example Implementation

Here's a complete example that updates a crawler's schedule:

import boto3
from botocore.exceptions import ClientError

def update_scheduler_of_a_crawler(crawler_name, scheduler):
    """
    Update the schedule of an AWS Glue crawler
    
    Args:
        crawler_name (str): Name of the crawler to update
        scheduler (str): Cron expression in format 'cron(expression)'
    
    Returns:
        dict: Response from AWS Glue service
    """
    session = boto3.session.Session()
    glue_client = session.client('glue')
    
    try:
        response = glue_client.update_crawler_schedule(
            CrawlerName=crawler_name,
            Schedule=scheduler
        )
        return response
    except ClientError as e:
        # Chain the original error so its AWS error code and traceback are preserved
        raise Exception(f"boto3 client error in update_scheduler_of_a_crawler: {e}") from e
    except Exception as e:
        raise Exception(f"Unexpected error in update_scheduler_of_a_crawler: {e}") from e

# Example usage
crawler_name = "Data Dimension"
schedule = "cron(15 12 * * ? *)"  # Daily at 12:15 UTC

result = update_scheduler_of_a_crawler(crawler_name, schedule)
print(result)

Expected Output

The function returns a response with metadata confirming the schedule update:

{
    'ResponseMetadata': {
        'RequestId': '73e50130-*****************8e',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'date': 'Sun, 28 Mar 2021 07:26:55 GMT',
            'content-type': 'application/x-amz-json-1.1',
            'content-length': '2',
            'connection': 'keep-alive',
            'x-amzn-requestid': '73e50130-***************8e'
        },
        'RetryAttempts': 0
    }
}
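Since the response carries only metadata, a script may want to confirm the HTTP status code rather than print the whole dictionary. The sketch below assumes the response has the shape shown above; the function name schedule_update_succeeded is hypothetical, and the sample dictionary is a trimmed stand-in for a real response.

```python
def schedule_update_succeeded(response: dict) -> bool:
    """Return True if the response metadata reports HTTP 200."""
    metadata = response.get("ResponseMetadata", {})
    return metadata.get("HTTPStatusCode") == 200


# Trimmed stand-in for the response shown above (illustrative only)
sample_response = {
    "ResponseMetadata": {
        "HTTPStatusCode": 200,
        "RetryAttempts": 0,
    }
}

print(schedule_update_succeeded(sample_response))
```

In practice boto3 raises ClientError for failed requests, so this check mostly serves as an explicit confirmation in logs or tests.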

Cron Expression Examples

Schedule          | Cron Expression     | Description
Daily at 2:30 AM  | cron(30 2 * * ? *)  | Runs every day at 2:30 UTC
Weekly on Sunday  | cron(0 6 ? * SUN *) | Runs every Sunday at 6:00 UTC
Monthly on 1st    | cron(0 9 1 * ? *)   | Runs on 1st of every month at 9:00 UTC
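To read expressions like these, it helps to label each of the six fields. The sketch below splits a cron(...) string into named parts, in the order minutes, hours, day of month, month, day of week, year. The function name split_cron_fields and the dictionary keys are assumptions chosen for illustration; they are not part of boto3 or AWS Glue.

```python
# Hypothetical labels for the six cron fields, in order
FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def split_cron_fields(schedule: str) -> dict:
    """Split 'cron(a b c d e f)' into a dict keyed by field name."""
    text = schedule.strip()
    if not (text.startswith("cron(") and text.endswith(")")):
        raise ValueError(f"Expected cron(...), got: {schedule!r}")
    parts = text[len("cron("):-1].split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"Expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


for expression in ("cron(30 2 * * ? *)", "cron(0 6 ? * SUN *)", "cron(0 9 1 * ? *)"):
    print(expression, split_cron_fields(expression))
```

Note how the weekly example sets the day-of-month field to ? and names the day in the day-of-week field, while the monthly example does the reverse.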

Error Handling

Common errors you might encounter:

  • EntityNotFoundException: The specified crawler doesn't exist

  • InvalidInputException: Invalid cron expression format

  • AccessDeniedException: Insufficient IAM permissions
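When boto3 raises a ClientError, the service's error code is available under e.response['Error']['Code'], and Glue reports the conditions above with codes such as EntityNotFoundException, InvalidInputException, and AccessDeniedException. The sketch below maps those codes to short hints. The function name explain_glue_error and the hint wording are assumptions for illustration, and the sample dictionary only imitates the shape of e.response.

```python
# Hypothetical hints keyed by the error code found in e.response["Error"]["Code"]
ERROR_HINTS = {
    "EntityNotFoundException": "The crawler name does not match an existing crawler.",
    "InvalidInputException": "The schedule is not a valid cron(...) expression.",
    "AccessDeniedException": "The caller lacks the required IAM permissions.",
}


def explain_glue_error(error_response: dict) -> str:
    """Build a readable message from a ClientError-style response dict."""
    error = error_response.get("Error", {})
    code = error.get("Code", "Unknown")
    message = error.get("Message", "")
    hint = ERROR_HINTS.get(code, "Check the AWS Glue API reference for this code.")
    return f"{code}: {hint} (service message: {message})"


# Stand-in with the same shape as e.response on a ClientError (illustrative only)
sample_error = {"Error": {"Code": "EntityNotFoundException", "Message": "Crawler not found"}}
print(explain_glue_error(sample_error))
```

Inside an `except ClientError as e:` handler, passing `e.response` to this function gives a message that names the specific failure instead of a generic client error.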

Conclusion

Updating a crawler's schedule in AWS Glue is straightforward using boto3's update_crawler_schedule() method. Remember to use proper cron expression format and handle exceptions appropriately for robust automation scripts.

Updated on: 2026-03-25T18:51:25+05:30
