How to use Boto3 to get a specified version of a table definition from AWS Glue Data Catalog?
Problem Statement − Use the boto3 library in Python to retrieve a specified version of a table's definition from a database in the AWS Glue Data Catalog.
Example − Retrieve version 2 of the definition of the table ‘security’ in the database ‘QA-test’.
Approach/Algorithm to solve this problem
Step 1 − Import boto3 and botocore exceptions to handle exceptions.
Step 2 − database_name, table_name and version_id are the mandatory parameters. The function fetches the definition of the given table for the specified version.
Step 3 − Create an AWS session using the boto3 library. Make sure region_name is set in the default profile. If it is not, pass region_name explicitly while creating the session.
Step 4 − Create an AWS client for glue.
Step 5 − Now call the get_table_version function, passing database_name as DatabaseName, table_name as TableName and version_id as VersionId. Note that VersionId is a string, so a numeric version must be passed in quotes (for example '2', not 2).
Step 6 − It returns the definition of a given table for a specified version.
Step 7 − Handle the generic exception if something goes wrong while retrieving the table version.
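The note in Step 5 about VersionId being a string can be sketched as a small helper. This is only an illustration under stated assumptions: fetch_table_version and StubGlueClient are hypothetical names introduced here, not part of boto3, and the stub stands in for a real Glue client so the sketch runs without AWS access.

```python
def fetch_table_version(glue_client, database_name, table_name, version_id):
    """Call get_table_version, coercing version_id to the string Glue expects.

    fetch_table_version is a hypothetical helper name, not a boto3 API.
    """
    return glue_client.get_table_version(
        DatabaseName=database_name,
        TableName=table_name,
        VersionId=str(version_id),  # an integer such as 2 becomes '2'
    )


class StubGlueClient:
    """Hypothetical stand-in for a Glue client; it only echoes its arguments."""

    def get_table_version(self, DatabaseName, TableName, VersionId):
        return {
            "TableVersion": {
                "Table": {"Name": TableName, "DatabaseName": DatabaseName},
                "VersionId": VersionId,
            }
        }


response = fetch_table_version(StubGlueClient(), "QA-test", "security", 2)
print(response["TableVersion"]["VersionId"])
```

With a real client, the stub would be replaced by something like boto3.session.Session(region_name="us-east-1").client("glue"), where the region value is only an example for the case in Step 3 where the default profile has no region set.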
Example
Use the following code to retrieve the table definition for a specified version −
import boto3
from botocore.exceptions import ClientError


def retrieves_table_version_details(database_name, table_name, version_id):
    # Create a session; the region is taken from the default profile
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # VersionId must be a string, for example '2'
        response = glue_client.get_table_version(DatabaseName=database_name, TableName=table_name, VersionId=version_id)
        return response
    except ClientError as e:
        raise Exception("boto3 client error in retrieves_table_version_details: " + str(e))
    except Exception as e:
        raise Exception("Unexpected error in retrieves_table_version_details: " + str(e))


print(retrieves_table_version_details('QA-test', 'security', '2'))
Output
{'TableVersion': {'Table': {'Name': 'security', 'DatabaseName': 'QAtest', 'Owner': 'owner', 'CreateTime': datetime.datetime(2020, 9, 10, 22, 27, 24, tzinfo=tzlocal()), 'UpdateTime': datetime.datetime(2021, 3, 1, 11, 43, 49, tzinfo=tzlocal()), 'LastAccessTime': datetime.datetime(2020, 9, 10, 22, 27, 24, tzinfo=tzlocal()), 'Retention': 0, 'StorageDescriptor': {'Columns': [{'Name': 'assettypecode', 'Type': 'string'}, {'Name': 'industrysector', 'Type': 'varchar'}, {'Name': 'securitycode', 'Type': 'char'}, {'Name': 'contractsize', 'Type': 'string'}, {'Name': 'conversionperiodenddate', 'Type': 'string'}, {'Name': 'conversionperiodstartdate', 'Type': 'string'}, {'Name': 'expirationdate', 'Type': 'string'}, {'Name': 'issuercountrycode', 'Type': 'string'}, {'Name': 'issuercountrydesc', 'Type': 'string'}, {'Name': 'originalissuedate', 'Type': 'string'}, {'Name': 'securitynamelong', 'Type': 'string'}, {'Name': 'issueshortname', 'Type': 'string'}, {'Name': 'gicssector', 'Type': 'string'}, {'Name': 'maturitydate', 'Type': 'string'}, {'Name': 'optioncode', 'Type': 'string'}, {'Name': 'optiontypename', 'Type': 'string'}, {'Name': 'paramount', 'Type': 'string'}, {'Name': 'priceindex', 'Type': 'string'}, {'Name': 'countrycoderisk', 'Type': 'string'}, {'Name': 'countrydescrisk', 'Type': 'string'}, {'Name': 'countrycode', 'Type': 'string'}], 'Location': 's3://test/security/', 'InputFormat': 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat', 'OutputFormat': 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat', 'Compressed': False, 'NumberOfBuckets': -1, 'SerdeInfo': {'SerializationLibrary': 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe', 'Parameters': {'serialization.format': '1'}}, 'BucketColumns': [], 'SortColumns': [], 'Parameters': {'CrawlerSchemaDeserializerVersion': '1.0', 'CrawlerSchemaSerializerVersion': '1.0', 'UPDATED_BY_CRAWLER': 'security', 'averageRecordSize': '181', 'classification': 'parquet', 'compressionType': 'none', 
'objectCount': '5', 'recordCount': '154800', 'sizeKey': '20337230', 'typeOfData': 'file'}, 'StoredAsSubDirectories': False}, 'PartitionKeys': [], 'TableType': 'EXTERNAL_TABLE', 'Parameters': {'CrawlerSchemaDeserializerVersion': '1.0', 'CrawlerSchemaSerializerVersion': '1.0', 'UPDATED_BY_CRAWLER': 'security', 'averageRecordSize': '181', 'classification': 'parquet', 'compressionType': 'none', 'objectCount': '5', 'recordCount': '154800', 'sizeKey': '20337230', 'typeOfData': 'file'}, 'CreatedBy': 'arn:aws:sts::*********:assumed-role/glue-role/AWS-Crawler'}, 'VersionId': '2'}, 'ResponseMetadata': {'RequestId': '431db171- *******************0', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Mon, 01 Mar 2021 06:15:30 GMT', 'content-type': 'application/x-amzjson-1.1', 'content-length': '3916', 'connection': 'keep-alive', 'xamzn-requestid': '431db171-*****************0'}, 'RetryAttempts': 0}}
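The response is a nested dictionary, so the column definitions usually need to be pulled out before they are useful. A minimal sketch, assuming the response has the shape shown in the output above; list_columns is a hypothetical helper name and sample_response is a hand-trimmed copy of that output, not a real API result:

```python
def list_columns(response):
    """Return (name, type) pairs from a get_table_version style response."""
    table = response["TableVersion"]["Table"]
    # StorageDescriptor and Columns are read defensively in case they are absent
    columns = table.get("StorageDescriptor", {}).get("Columns", [])
    return [(col["Name"], col.get("Type")) for col in columns]


# Hand-trimmed sample in the shape of the output above (first three columns only)
sample_response = {
    "TableVersion": {
        "Table": {
            "Name": "security",
            "StorageDescriptor": {
                "Columns": [
                    {"Name": "assettypecode", "Type": "string"},
                    {"Name": "industrysector", "Type": "varchar"},
                    {"Name": "securitycode", "Type": "char"},
                ]
            },
        },
        "VersionId": "2",
    }
}

for name, col_type in list_columns(sample_response):
    print(f"{name}: {col_type}")
```

The same helper could be applied to the value returned by retrieves_table_version_details.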
