How to use Boto3 to check the status of a running Glue Job?


Problem Statement − Use the boto3 library in Python to run a Glue job and check whether it succeeded or failed. For example, run the job run_s3_file_job and get its status.

Approach/Algorithm to solve this problem

Step 1 − Import boto3 and botocore exceptions to handle exceptions.

Step 2 − job_name is the mandatory parameter of the function, while arguments is optional. Some jobs take arguments to run. In that case, the arguments can be passed as a dict.

For example: arguments = {'arguments1': 'value1', 'arguments2': 'value2'}

If the job doesn’t take arguments, then just pass the job_name.
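Glue expects job argument names to carry a leading `--`, and the Arguments parameter is a map of strings to strings. The following is a minimal sketch of how such a dict could be prepared. The helper name build_glue_arguments is hypothetical and not part of boto3.

```python
def build_glue_arguments(params=None):
    """Return a dict shaped for start_job_run(Arguments=...).

    build_glue_arguments is a hypothetical helper, not a boto3 function.
    """
    arguments = {}
    for name, value in (params or {}).items():
        # Add the "--" prefix only when the caller has not supplied it.
        key = name if name.startswith("--") else "--" + name
        # Argument values are sent as strings.
        arguments[key] = str(value)
    return arguments


print(build_glue_arguments({"arguments1": "value1", "--arguments2": 2}))
# prints {'--arguments1': 'value1', '--arguments2': '2'}
```

Calling the helper with no parameters returns an empty dict, which matches the case where the job takes no arguments.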

Step 3 − Create an AWS session using the boto3 library. Make sure region_name is set in the default profile. If it is not, pass region_name explicitly when creating the session.

Step 4 − Create an AWS client for glue.

Step 5 − Now call the start_job_run function, passing JobName and, if required, Arguments.

Step 6 − Once the job starts, the response provides the JobRunId of the new run along with the response metadata.

Step 7 − Call the get_job_run function, passing JobName and, as RunId, the JobRunId returned by the previous call. It returns a dictionary describing the job run, including its status.

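Step 8 − Now read the status of the run from the JobRunState key. It is RUNNING while the job has not completed, and SUCCEEDED or FAILED once it has. Other states, such as STARTING, STOPPED or TIMEOUT, are also possible.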

Step 9 − Handle the generic exception if something went wrong while checking the job.
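To make Steps 7 and 8 concrete, the sketch below picks the status-related fields out of a get_job_run response. The helper name summarize_job_run and the sample response are illustrative assumptions. A real response carries many more keys than shown here.

```python
def summarize_job_run(response):
    """Pick the status-related fields out of a get_job_run response.

    summarize_job_run is a hypothetical helper, not a boto3 function.
    """
    job_run = response.get("JobRun", {})
    return {
        "run_id": job_run.get("Id"),
        "state": job_run.get("JobRunState"),
        # ErrorMessage is typically set when a run fails; None otherwise.
        "error": job_run.get("ErrorMessage"),
    }


# Illustrative response, trimmed to a few keys (values are made up).
sample_response = {
    "JobRun": {
        "Id": "jr_example",
        "JobName": "run_s3_file_job",
        "JobRunState": "RUNNING",
    }
}
print(summarize_job_run(sample_response))
# prints {'run_id': 'jr_example', 'state': 'RUNNING', 'error': None}
```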

Example

Use the following code to run and get the status of an existing glue job −

import time

import boto3
from botocore.exceptions import ClientError

def run_glue_job_get_status(job_name, arguments=None, run_id=None):
   # Starts a new run unless run_id is given; returns (run_id, status) of that run
   session = boto3.session.Session()
   glue_client = session.client('glue')
   try:
      if run_id is None:
         job_run = glue_client.start_job_run(JobName=job_name, Arguments=arguments or {})
         run_id = job_run.get("JobRunId")
      status_detail = glue_client.get_job_run(JobName=job_name, RunId=run_id)
      status = status_detail.get("JobRun").get("JobRunState")
      return run_id, status
   except ClientError as e:
      raise Exception("boto3 client error in run_glue_job_get_status: " + str(e)) from e
   except Exception as e:
      raise Exception("Unexpected error in run_glue_job_get_status: " + str(e)) from e

#Get status 1st time (this call starts the job)
run_id, status = run_glue_job_get_status("run_s3_file_job")
print(status)
#Get status 2nd time after waiting (same run, no new job is started)
time.sleep(10)
run_id, status = run_glue_job_get_status("run_s3_file_job", run_id=run_id)
print(status)

Output

#Get status 1st time
RUNNING
#Get status 2nd time after waiting
SUCCEEDED
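A fixed 10-second wait will often be too short for a real job. One alternative is to repeat the status check until the run reaches a final state. The sketch below assumes the caller passes in a Glue client and the JobRunId returned by start_job_run. The function name wait_for_job_run, the polling interval and the set of terminal states are assumptions based on the JobRunState values of the Glue API.

```python
import time

# States after which a run no longer changes (assumed from the Glue JobRunState values).
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def wait_for_job_run(glue_client, job_name, run_id, poll_seconds=30, max_polls=120):
    """Poll get_job_run until the run reaches a terminal state, then return that state.

    wait_for_job_run is a hypothetical helper, not a boto3 function.
    """
    for _ in range(max_polls):
        response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        # Still STARTING, RUNNING, etc.: wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} run {run_id} did not finish after {max_polls} polls")
```

With a real client this could be called as wait_for_job_run(boto3.client('glue'), "run_s3_file_job", run_id). The max_polls limit keeps the loop from running forever if the job never reaches a final state.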

Updated on: 22-Mar-2021
