How to use Boto3 to check whether a Glue Job exists or not?
Problem Statement: Use the boto3 library in Python to check whether an AWS Glue job exists. For example, check whether run_s3_file_job exists in AWS Glue.
Approach/Algorithm to solve this problem
Step 1: Import boto3 and the botocore exceptions needed for error handling.
Step 2: job_name is the parameter of the function.
Step 3: Create an AWS session using the boto3 library. Make sure region_name is set in the default profile. If it is not, pass region_name explicitly while creating the session.
Step 4: Create an AWS client for Glue.
Step 5: Call the get_job function and pass the JobName.
Step 6: If the job exists, the response contains all the details about the job. Otherwise, get_job raises an exception.
Step 7: Handle the generic exception in case something else goes wrong while checking the job.
Example
Use the following code to check whether a Glue job exists:
import boto3
from botocore.exceptions import ClientError


def check_glue_job_exists(job_name):
    # Uses the default profile; the region must be configured there.
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # Returns the full job definition when the job exists.
        response = glue_client.get_job(JobName=job_name)
        return response
    except ClientError as e:
        raise Exception(f"boto3 client error in check_glue_job_exists: {e}") from e
    except Exception as e:
        raise Exception(f"Unexpected error in check_glue_job_exists: {e}") from e


# To check existing job
print(check_glue_job_exists("run_s3_file_job"))

# Job doesn't exist
print(check_glue_job_exists("run_s3_file_job_not_exist"))
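The response returned by check_glue_job_exists is a nested dictionary, so printing it whole is noisy. As a minimal sketch, a few fields can be pulled out of it. The helper name summarize_glue_job and the choice of fields are assumptions made for illustration; the keys are the ones visible in the Output section of this article, and the sample dictionary is a trimmed copy of that output rather than a live response.

```python
def summarize_glue_job(response):
    """Pick a few fields out of a get_job response dictionary."""
    job = response["Job"]
    return {
        "name": job["Name"],
        # .get() is used because not every job defines every field.
        "role": job.get("Role"),
        "glue_version": job.get("GlueVersion"),
        "worker_type": job.get("WorkerType"),
        "number_of_workers": job.get("NumberOfWorkers"),
    }


# Trimmed sample shaped like the get_job output shown in this article.
sample_response = {
    "Job": {
        "Name": "run_s3_file_job",
        "Role": "arn:aws:iam::12345:role/delegated/glue-service-role",
        "WorkerType": "G.1X",
        "NumberOfWorkers": 4,
        "GlueVersion": "2.0",
    }
}

print(summarize_glue_job(sample_response))
```

In real use, the dictionary returned by check_glue_job_exists would be passed in place of sample_response.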
Improved Version with Better Error Handling
Here's an improved version that returns a boolean value and handles exceptions more gracefully:
import boto3
from botocore.exceptions import ClientError


def glue_job_exists(job_name):
    """Check if a Glue job exists and return True/False"""
    try:
        session = boto3.session.Session()
        glue_client = session.client('glue')
        glue_client.get_job(JobName=job_name)
        return True
    except ClientError as e:
        if e.response['Error']['Code'] == 'EntityNotFoundException':
            return False
        else:
            raise Exception(f"Error checking Glue job: {e}")
    except Exception as e:
        raise Exception(f"Unexpected error: {e}")


# Example usage
job_name = "run_s3_file_job"
if glue_job_exists(job_name):
    print(f"Job '{job_name}' exists!")
else:
    print(f"Job '{job_name}' does not exist.")

# Check non-existing job
non_existing_job = "run_s3_file_job_not_exist"
if glue_job_exists(non_existing_job):
    print(f"Job '{non_existing_job}' exists!")
else:
    print(f"Job '{non_existing_job}' does not exist.")
Output
# Original function output for existing job
{'Job': {'Name': 'run_s3_file_job', 'Description': 'Glue job for the test', 'Role': 'arn:aws:iam::12345:role/delegated/glue-service-role', 'CreatedOn': datetime.datetime(2021, 2, 10, 15, 7, 3, 638000, tzinfo=tzlocal()), 'LastModifiedOn': datetime.datetime(2021, 2, 10, 15, 7, 3, 638000, tzinfo=tzlocal()), 'ExecutionProperty': {'MaxConcurrentRuns': 1}, 'Command': {'Name': 'glueetl', 'ScriptLocation': 's3://test/pipeline.py', 'PythonVersion': '3'}, 'DefaultArguments': {'--job-language': 'python', 'Step': '0'}, 'MaxRetries': 0, 'AllocatedCapacity': 4, 'Timeout': 2880, 'MaxCapacity': 4.0, 'WorkerType': 'G.1X', 'NumberOfWorkers': 4, 'GlueVersion': '2.0'}, 'ResponseMetadata': {'RequestId': 'e3ec9e2c-e75d-4443-bfea-fef674fff7e9', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sat, 13 Feb 2021 13:20:27 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '1501', 'connection': 'keep-alive', 'x-amzn-requestid': 'e3ec9e2c-e75d-4443-bfea-fef674fff7e9'}, 'RetryAttempts': 0}}
# Job doesn't exist
botocore.errorfactory.EntityNotFoundException: An error occurred (EntityNotFoundException) when calling the GetJob operation: Job with name: run_s3_file_job_not_exist not found.
# Improved function output
Job 'run_s3_file_job' exists!
Job 'run_s3_file_job_not_exist' does not exist.
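The improved version inspects the error code on a generic ClientError. boto3 clients also expose service-specific exception classes under client.exceptions, so the same check can be sketched by catching glue_client.exceptions.EntityNotFoundException directly. The function name glue_job_exists_with_client and the decision to pass the client in as an argument are assumptions made for this illustration, not part of the code above.

```python
def glue_job_exists_with_client(glue_client, job_name):
    """Return True if get_job succeeds, False if Glue reports the job as missing.

    glue_client is expected to be a boto3 Glue client. Any other error
    (permissions, throttling, networking) is left to propagate to the caller.
    """
    try:
        glue_client.get_job(JobName=job_name)
    except glue_client.exceptions.EntityNotFoundException:
        return False
    return True
```

A client created as in the earlier examples, boto3.session.Session().client('glue'), can be passed as glue_client. Taking the client as a parameter also lets the function be exercised with a stub object when no AWS access is available.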
Key Points
- The get_job() method returns job details if the job exists
- It raises EntityNotFoundException if the job doesn't exist
- The improved version returns a simple boolean for easier conditional logic
- Make sure AWS credentials and a region are configured, either in the default profile or passed explicitly when creating the session
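When several job names need checking, calling get_job once per name means one request each. As a hedged sketch, the Glue client's batch_get_jobs call accepts a list of names and returns the definitions it found under the Jobs key. The helper name glue_jobs_exist is hypothetical, and the client is again passed in as an argument rather than created inside the function.

```python
def glue_jobs_exist(glue_client, job_names):
    """Map each job name to True/False using one batch_get_jobs request.

    glue_client is expected to be a boto3 Glue client.
    """
    names = list(job_names)
    if not names:
        # Nothing to look up, so skip the API call entirely.
        return {}
    response = glue_client.batch_get_jobs(JobNames=names)
    # Names of the jobs Glue returned definitions for.
    found = {job["Name"] for job in response.get("Jobs", [])}
    return {name: name in found for name in names}
```

With a real client, glue_jobs_exist(glue_client, ["run_s3_file_job", "run_s3_file_job_not_exist"]) would return a dictionary keyed by job name, which can then drive the same conditional logic as the boolean version.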
Conclusion
Use boto3's get_job() method to check if a Glue job exists. Handle EntityNotFoundException for non-existing jobs. The improved version returns boolean values for cleaner conditional logic.
