How to use Boto3 to paginate through the job runs of a job present in AWS Glue

In this article, we will see how to paginate through all the job runs of a job present in AWS Glue using the boto3 library. This is useful when dealing with jobs that have many runs and you need to retrieve them efficiently in smaller chunks.

Understanding Pagination Parameters

The pagination function accepts several optional parameters along with the required JobName:

max_items: Total number of records to return. If more records are available, a NextToken is provided so that pagination can be resumed.
page_size: Number of records to request per page, that is, per underlying API call.
starting_token: Token from a previous response, used to continue pagination where it stopped.
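
As a minimal sketch of how these three parameters map onto the PaginationConfig dictionary used in the complete example in this article, the helper below keeps only the options that were actually supplied. The name build_pagination_config is hypothetical and is not part of boto3.

```python
def build_pagination_config(max_items=None, page_size=None, starting_token=None):
    """Return a PaginationConfig dict containing only the options that were set.

    Hypothetical helper for illustration; it is not part of boto3.
    """
    options = {
        'MaxItems': max_items,
        'PageSize': page_size,
        'StartingToken': starting_token,
    }
    # Drop unset options so only explicit choices are passed on.
    return {key: value for key, value in options.items() if value is not None}


print(build_pagination_config(max_items=10, page_size=5))
# prints {'MaxItems': 10, 'PageSize': 5}
```

The resulting dictionary can be passed as the PaginationConfig argument of paginator.paginate().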

Step-by-Step Implementation

Step 1: Import Required Libraries

Import boto3 and the botocore exceptions needed to handle AWS service errors:

import boto3
from botocore.exceptions import ClientError

Step 2: Create AWS Session and Client

Set up the AWS session and create a Glue client:

session = boto3.session.Session()
glue_client = session.client('glue')

Step 3: Create Paginator Object

Use the get_paginator method to create a paginator for job runs:

paginator = glue_client.get_paginator('get_job_runs')

Complete Example

Here's the complete implementation to paginate through job runs:

import boto3
from botocore.exceptions import ClientError

def paginate_through_jobruns(job_name, max_items=None, page_size=None, starting_token=None):
    session = boto3.session.Session()
    glue_client = session.client('glue')

    paginator = glue_client.get_paginator('get_job_runs')
    # paginate() is lazy: it returns an iterator and sends no request
    # until the pages are consumed, so API errors surface during iteration.
    return paginator.paginate(
        JobName=job_name,
        PaginationConfig={
            'MaxItems': max_items,
            'PageSize': page_size,
            'StartingToken': starting_token
        }
    )

# Example usage: request 5 runs per API call, but return at most 1 run in total
try:
    response = paginate_through_jobruns("glue_test_job", max_items=1, page_size=5)
    for page in response:
        print(page)
except ClientError as e:
    # Wrap the loop, because this is where the Glue API is actually called
    raise Exception("boto3 client error in paginate_through_jobruns: " + str(e)) from e

Sample Output

Each page printed by the loop is a dictionary containing job run details:

{
    'JobRuns': [
        {
            'Id': 'jr_435b66cfe451adf5fa7c7f914be3c87d199616f52bd13bdd91bb1269f02db705',
            'Attempt': 0,
            'JobName': 'glue_test_job',
            'StartedOn': datetime.datetime(2021, 1, 25, 22, 19, 56, 52000, tzinfo=tzlocal()),
            'LastModifiedOn': datetime.datetime(2021, 1, 25, 22, 21, 50, 603000, tzinfo=tzlocal()),
            'CompletedOn': datetime.datetime(2021, 1, 25, 22, 21, 50, 603000, tzinfo=tzlocal()),
            'JobRunState': 'SUCCEEDED',
            'Arguments': {
                '--additional-python-modules': 'pandas==1.1.5',
                '--enable-glue-datacatalog': 'true',
                '--job-bookmark-option': 'job-bookmark-disable'
            },
            'AllocatedCapacity': 2,
            'ExecutionTime': 107,
            'MaxCapacity': 2.0,
            'WorkerType': 'G.1X',
            'NumberOfWorkers': 2,
            'GlueVersion': '2.0'
        }
    ],
    'NextToken': 'eyJleHBpcmF0aW9uIjp7InNlY29uZHMiOjE2MTc0NTQ0NDgsIm5hbm9zIjo2OTUwMDAwMDB9...',
    'ResponseMetadata': {
        'RequestId': '1874370e-***********-40d',
        'HTTPStatusCode': 200
    }
}
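
Because every page keeps its runs under the JobRuns key, the pages can be flattened into individual runs and summarized. The sketch below assumes pages shaped like the sample output above. The helper names iter_job_runs and count_runs_by_state are hypothetical, and the stand-in pages use made-up values so the code runs without AWS access.

```python
from collections import Counter


def iter_job_runs(pages):
    """Yield individual job runs from an iterable of get_job_runs pages."""
    for page in pages:
        # Each page holds its runs under the 'JobRuns' key.
        yield from page.get('JobRuns', [])


def count_runs_by_state(pages):
    """Count job runs per JobRunState across all pages."""
    return Counter(run['JobRunState'] for run in iter_job_runs(pages))


# Stand-in pages with made-up values, shaped like the sample output.
sample_pages = [
    {'JobRuns': [{'Id': 'jr_1', 'JobRunState': 'SUCCEEDED', 'ExecutionTime': 107}]},
    {'JobRuns': [{'Id': 'jr_2', 'JobRunState': 'FAILED', 'ExecutionTime': 12},
                 {'Id': 'jr_3', 'JobRunState': 'SUCCEEDED', 'ExecutionTime': 95}]},
]

print(dict(count_runs_by_state(sample_pages)))
# prints {'SUCCEEDED': 2, 'FAILED': 1}
```

The same functions accept the iterator returned by paginator.paginate(), since they only require an iterable of page dictionaries.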

Key Points

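To resume a truncated listing, pass the token from the previous result as starting_token. With a paginator, this token is available as the resume_token attribute of the page iterator, or as the NextToken key returned by build_full_result().
The response includes job run details like state, execution time, and worker configuration.
Because the paginator is lazy, AWS service errors (ClientError) are raised while the pages are being iterated, so error handling should wrap the loop that consumes them.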
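
Resuming with a token can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: it relies on the paginator's build_full_result() method, which merges the pages and includes a NextToken key when the result was cut off by MaxItems. The function names fetch_job_runs_batch and fetch_all_in_batches are hypothetical, and the Glue client is passed in by the caller.

```python
def fetch_job_runs_batch(glue_client, job_name, max_items, starting_token=None):
    """Return (job_runs, next_token) for one batch of at most max_items runs.

    next_token is None once no further runs remain.
    """
    paginator = glue_client.get_paginator('get_job_runs')
    config = {'MaxItems': max_items}
    if starting_token is not None:
        config['StartingToken'] = starting_token
    result = paginator.paginate(
        JobName=job_name, PaginationConfig=config
    ).build_full_result()
    return result.get('JobRuns', []), result.get('NextToken')


def fetch_all_in_batches(glue_client, job_name, batch_size):
    """Collect every run by requesting batches until no token is returned."""
    runs, token = fetch_job_runs_batch(glue_client, job_name, batch_size)
    all_runs = list(runs)
    while token is not None:
        runs, token = fetch_job_runs_batch(glue_client, job_name, batch_size, token)
        all_runs.extend(runs)
    return all_runs


# Usage (assumes AWS credentials and a job named "glue_test_job"):
#   import boto3
#   glue_client = boto3.client('glue')
#   runs, token = fetch_job_runs_batch(glue_client, "glue_test_job", max_items=10)
```

Returning the token alongside the runs lets a caller store it and continue later, for example in a separate invocation of a scheduled script.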

Conclusion

Boto3 paginators retrieve AWS Glue job runs in manageable chunks. Use get_paginator('get_job_runs') with a suitable PaginationConfig to work through large result sets page by page instead of loading every run into memory at once.

Ashish Anand

Updated on: 2026-03-25T18:54:08+05:30
