How to use Boto3 to get the details of all the databases from the AWS Glue Data Catalog?

The AWS Glue Data Catalog stores metadata for databases, tables, and partitions. Using Boto3, Python's AWS SDK, you can retrieve details of all databases in your Glue Data Catalog with the get_databases() method.

Prerequisites

Before using this code, ensure you have:

AWS credentials configured (via AWS CLI, environment variables, or IAM roles)
IAM permissions that allow reading Glue databases (the glue:GetDatabases action)
Boto3 library installed: pip install boto3
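
The examples in this article call only get_databases(), which maps to the glue:GetDatabases IAM action. As a minimal sketch, a read-only policy for that action could be built as shown below. The variable names are illustrative, and the "*" resource is an assumption made for brevity; in practice you would normally narrow it to specific catalog and database ARNs.

```python
import json

# Minimal read-only policy sketch for listing Glue databases.
# Assumption: "Resource": "*" is used for brevity; scope it down in practice.
glue_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["glue:GetDatabases"],
            "Resource": "*",
        }
    ],
}

# Serialize to the JSON form that IAM expects
policy_json = json.dumps(glue_read_policy, indent=2)
print(policy_json)
```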

Basic Implementation

Here's how to retrieve database definitions from the AWS Glue Data Catalog with a single call. Note that one call may return only the first page of results:

import boto3
from botocore.exceptions import ClientError

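def get_all_databases():
    # Region and credentials come from your default AWS configuration
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # Returns one page of results; a 'NextToken' key signals more pages
        response = glue_client.get_databases()
        return response
    except ClientError as e:
        # 'from e' keeps the original traceback attached to the new exception
        raise Exception("boto3 client error in get_all_databases: " + str(e)) from e
    except Exception as e:
        raise Exception("Unexpected error in get_all_databases: " + str(e)) from e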

# Execute the function
result = get_all_databases()
print(result)

Sample Output

{
    'DatabaseList': [
        {
            'Name': 'QA-test', 
            'CreateTime': datetime.datetime(2020, 11, 18, 14, 24, 46, tzinfo=tzlocal())
        },
        {
            'Name': 'custdb', 
            'CreateTime': datetime.datetime(2020, 8, 31, 20, 30, 9, tzinfo=tzlocal())
        },
        {
            'Name': 'default', 
            'Description': 'Default Hive database',
            'LocationUri': 'hdfs://ip-example.ec2.internal:8020/user/hive/warehouse', 
            'CreateTime': datetime.datetime(2018, 5, 25, 16, 4, 54, tzinfo=tzlocal())
        }
    ],
    'NextToken': 'eyJsYXN0RXZhbHVhdGVkS2V5...',
    'ResponseMetadata': {
        'RequestId': 'fa0a2069-example-a0617',
        'HTTPStatusCode': 200,
        'RetryAttempts': 0
    }
}
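
The response is a plain Python dictionary, so the database names and the pagination marker can be read directly. The sketch below works on a hand-written dictionary shaped like the sample output, not on a live response, and the helper name summarize_page is illustrative.

```python
def summarize_page(response):
    """Return the database names on this page and whether more pages exist."""
    names = [db["Name"] for db in response.get("DatabaseList", [])]
    # A non-empty 'NextToken' means the catalog holds more than this page
    has_more = bool(response.get("NextToken"))
    return names, has_more


# Stand-in for a get_databases() response, trimmed from the sample output
sample_response = {
    "DatabaseList": [{"Name": "QA-test"}, {"Name": "custdb"}, {"Name": "default"}],
    "NextToken": "eyJsYXN0RXZhbHVhdGVkS2V5...",
}

names, has_more = summarize_page(sample_response)
print(names)
print(has_more)
```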

Enhanced Version with Pagination

A single get_databases() call may return only part of the list. For accounts with many databases, pass each NextToken back to the API until no token is returned:

import boto3
from botocore.exceptions import ClientError

def get_all_databases_paginated():
    session = boto3.session.Session()
    glue_client = session.client('glue')
    
    all_databases = []
    next_token = None
    
    try:
        while True:
            if next_token:
                response = glue_client.get_databases(NextToken=next_token)
            else:
                response = glue_client.get_databases()
            
            all_databases.extend(response['DatabaseList'])
            
            if 'NextToken' not in response:
                break
            next_token = response['NextToken']
        
        return {'DatabaseList': all_databases, 'Count': len(all_databases)}
        
    except ClientError as e:
        # 'from e' keeps the original traceback attached to the new exception
        raise Exception(f"AWS Glue error: {e}") from e
    except Exception as e:
        raise Exception(f"Unexpected error: {e}") from e

# Get all databases with pagination
result = get_all_databases_paginated()
print(f"Found {result['Count']} databases")
for db in result['DatabaseList']:
    print(f"- {db['Name']}: {db.get('Description', 'No description')}")
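
Boto3 also provides a built-in paginator for get_databases, which follows NextToken for you. The following is a minimal sketch, assuming the Glue client is created elsewhere (for example with boto3.client('glue')) and passed in; the function name iter_databases is illustrative.

```python
def iter_databases(glue_client):
    """Yield every database dict, letting the paginator follow NextToken."""
    paginator = glue_client.get_paginator("get_databases")
    for page in paginator.paginate():
        # Each page has the same shape as a single get_databases() response
        yield from page.get("DatabaseList", [])
```

With credentials configured, looping over iter_databases(boto3.client('glue')) and printing db['Name'] would list each database. Taking the client as a parameter also makes the function easy to exercise with a stub object in tests.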

Key Response Fields

Field	Description	Always Present?
`Name`	Database name	Yes
`Description`	Database description	No
`LocationUri`	Physical location URI	No
`CreateTime`	Creation timestamp	Usually (not marked as required by the API)
`Parameters`	Key-value parameters	No
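
Because several of these fields are optional, code that reads a database entry should avoid indexing them directly. The sketch below flattens one entry into a row with defaults. The helper name to_row and the chosen defaults are assumptions for illustration, and the entry is hand-written rather than taken from a live response.

```python
from datetime import datetime, timezone


def to_row(db):
    """Flatten one DatabaseList entry, tolerating missing optional fields."""
    created = db.get("CreateTime")
    return {
        "name": db["Name"],  # Name is required, so direct indexing is safe
        "description": db.get("Description", ""),
        "location": db.get("LocationUri"),
        "created": created.isoformat() if created else None,
        "parameters": db.get("Parameters", {}),
    }


# Hand-written entry modelled on the sample output, not a live response
entry = {
    "Name": "custdb",
    "CreateTime": datetime(2020, 8, 31, 20, 30, 9, tzinfo=timezone.utc),
}
row = to_row(entry)
print(row)
```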

Conclusion

Use Boto3's get_databases() method to retrieve database metadata from the AWS Glue Data Catalog. Follow NextToken to page through catalogs with many databases, and handle ClientError explicitly in production code.

Ashish Anand

Updated on: 2026-03-25T18:18:32+05:30
