Problem Statement − Use boto3 library in Python to download an object from S3 at a given local path/default path with overwrite existing file as true. For example, download test.zip from Bucket_1/testfolder of S3.
Step 1 − Import boto3 and botocore exceptions to handle exceptions.
Step 2 − From pathlib, import Path to check filename
Step 3 − s3_path, localpath and overwrite_existing_file are the three parameters in the function download_object_from_s3
Step 4 − Validate the s3_path is passed in AWS format as s3://bucket_name/key. By default, localpath = None and overwrite_existing_file = True. User can pass these values as well to download in a given local path
Step 5 − Create an AWS session using boto3 library.
Step 6 − Create an AWS resource for S3.
Step 7 − Split the S3 path and perform operations to separate the root bucket name and the object path to download.
Step 8 − Check whether overwrite_existing_file set as False and the file already exists in a given local path; in that case don’t do any operation.
Step 9 − Else (if any of these conditions are not true), download the object. If localpath is given, download there; else download into default path.
Step 10 − Handle the exception based on response code to validate whether the file is downloaded or not.
Step 11 − Handle the generic exception if something went wrong while downloading the file.
Use the following code to download a file from AWS S3 −
import boto3 from botocore.exceptions import ClientError from pathlib import Path def download_object_from_s3(s3path, localPath=None, overwrite_existing_file=True): if 's3://' not in s3path: print('Given path is not a valid s3 path.') raise Exception('Given path is not a valid s3 path.') session = boto3.session.Session() s3_resource = session.resource('s3') s3_tokens = s3path.split('/') bucket_name = s3_tokens object_path = "" filename = s3_tokens[len(s3_tokens) - 1] print('Filename: ' + filename) if len(s3_tokens) > 4: for tokn in range(3, len(s3_tokens) - 1): object_path += s3_tokens[tokn] + "/" object_path += filename else: object_path += filename print('object: ' + object_path) try: if not overwrite_existing_file and Path.is_file(filename): pass else: if localPath is None: s3_resource.meta.client.download_file(bucket_name, object_path, filename) else: s3_resource.meta.client.download_file(bucket_name, object_path, localPath + '/' + filename) print('Filename: ' + filename) return filename except ClientError as error: if error.response['Error']['Code'] == '404': print(s3path + " File not found: ") raise Exception(s3path + " File not found: ") except Exception as error: print("Unexpected error in download_object function of s3 helper: " + error.__str__()) raise Exception("Unexpected error in download_object function of s3 helper: " + error.__str__()) #Download into default localpath print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip")) #Download into given path print(download_object_from_s3("s3://Bucket_1/testfolder/test.zip","C://AWS")) #File doesn’t exist in S3 print(download_object_from_s3("s3://Bucket_1/testfolder/abc.zip"))
#Download into default localpath Filename: test.zip object: testfolder/test.zip Filename: test.zip #Download into given path Filename: test.zip object: testfolder/test.zip Filename: test.zip #File doesn’t exist in S3 Filename: abc.zip object: testfolder/abc.zip s3://Bucket_1/testfolder/abc.zip File not found: botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
Note: The default path to download is the directory where this function is written. In the same directory, file will be downloaded if local path is not provided.
For example, if this function is written into S3_class and this class is present at C://AWS/src/S3_class, then file test.zip will be downloaded into C://AWS/src/test.zip