
Amazon Q Business - Data Sources Connectors
A data source connector is a mechanism for combining and synchronizing data from different data sources into a single container index. Amazon Q Business provides multiple data source connectors to help you create generative AI solutions with minimal configuration.
This chapter provides an overview of data source connector features, their configuration, and information specific to individual data source connectors.
Data Sources Connectors Concepts
To understand how data source connectors are configured, you need to understand some of the terminology associated with them.
- Source and endpoint metadata: The data source configuration information is found in the Source section of the console. If you use the API, you specify this information in the configuration parameter of the CreateDataSource operation. The required configuration information differs from one data source to another.
- Authorization: Amazon Q Business connectors index access control list (ACL) information, which can include the user email address, the group name for a local group, and the group name for a federated group.
- Authentication: Amazon Q Business uses an AWS Secrets Manager secret, containing the data source access credentials you provide, to authenticate access to your data source.
- Virtual private cloud: Amazon Q Business can connect to data sources or databases that reside in an Amazon Virtual Private Cloud (VPC). You can configure Amazon VPC with either the console or the Amazon Q Business API.
- Web proxy: Amazon Q Business can connect to your data source instance through a web proxy for all supported data sources. To use a web proxy, you must provide the host name and port number.
- IAM role: Data source connectors require an IAM role that grants Amazon Q Business the permissions it needs to authenticate to and access your data source.
- Identity crawler: Amazon Q Business has an identity crawling feature that enables it to crawl ACL information at the document level from supported data sources.
- Sync scope: Amazon Q Business has a sync scope feature that lets you customize the content crawled and indexed by your data source connector.
- Sync mode: Controls which content is synced with your index when your data source content changes.
- Sync run schedule: Amazon Q Business has a sync run schedule feature that lets you periodically sync your data source with your retriever on a custom schedule.
- Field mappings: Map Amazon Q Business index fields to data source document attributes.
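As a rough illustration of how several of these concepts map onto an API request, the sketch below assembles keyword arguments for the CreateDataSource operation. The parameter names follow the boto3 `qbusiness` client as understood here and should be checked against the current API reference. The helper function, all IDs and ARNs, and the contents of `configuration` are hypothetical placeholders, and nothing is sent to AWS.

```python
from typing import Any, Optional


def build_create_data_source_request(
    application_id: str,
    index_id: str,
    display_name: str,
    configuration: dict[str, Any],
    role_arn: str,
    sync_schedule: Optional[str] = None,
    subnet_ids: Optional[list[str]] = None,
    security_group_ids: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for a CreateDataSource call.

    This hypothetical helper only builds a dictionary; it makes no AWS call.
    """
    request: dict[str, Any] = {
        "applicationId": application_id,
        "indexId": index_id,
        "displayName": display_name,
        # Connector-specific settings such as source metadata, sync mode,
        # and field mappings go here; the schema depends on the connector.
        "configuration": configuration,
        # IAM role that the data source connector uses.
        "roleArn": role_arn,
    }
    if sync_schedule is not None:
        request["syncSchedule"] = sync_schedule
    # VPC settings are added only when both lists are supplied.
    if subnet_ids and security_group_ids:
        request["vpcConfiguration"] = {
            "subnetIds": subnet_ids,
            "securityGroupIds": security_group_ids,
        }
    return request


# Hypothetical values, for illustration only.
request = build_create_data_source_request(
    application_id="example-application-id",
    index_id="example-index-id",
    display_name="example-data-source",
    configuration={"type": "S3"},  # placeholder, not a complete configuration
    role_arn="arn:aws:iam::111122223333:role/example-connector-role",
)
print(sorted(request))
```

With valid credentials and real resources, a dictionary like this could be passed to a boto3 `qbusiness` client as `client.create_data_source(**request)`.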
What is a document?
When you connect Amazon Q Business to a data source, what gets treated as a single 'document' depends on the type of connection you're using.
The following table outlines what each connector crawls as a document.
Data source connector | Supports crawling | Document definition |
---|---|---|
Adobe Experience Manager (Cloud and Server) | | |
Alfresco (Cloud and Server) | | |
Amazon FSx (Windows) | Files | Each file is considered a single document. |
Amazon S3 | Objects | Each object is considered a single document. Any object-name.metadata.json file and access control list (ACL) file is considered metadata for the object it is associated with and not treated as a separate document. |
Amazon Q Business Web Crawler | | |
Amazon WorkDocs | | |
Box | | |
Confluence (Cloud and Server) | | |
Database data sources | | |
Dropbox | | |
Drupal | | |
GitHub (Cloud and Server) | | |
Gmail | | |
Google Drive | | |
Jira | | |
Microsoft Exchange | | |
Microsoft OneDrive | | |
Microsoft SharePoint (Online and Server) | | |
Microsoft Teams | | |
Microsoft Yammer | | |
Quip | | |
Salesforce | | |
ServiceNow | | |
Slack | | |
Zendesk | | |
Configuration Best Practices
The following list describes best practices for setting up and configuring your Amazon Q Business data source connector:
- Each document in an index must be unique. Ensure there are no duplicate documents within a data source, or across any data sources, that you plan to connect to an Amazon Q Business retriever.
- When changing authentication type or credentials, update the IAM role to access the correct AWS Secrets Manager secret ID.
- For your own security, update your credentials and secrets regularly. Grant only the access that is needed, and don't reuse credentials or secrets across different data sources.
- IAM roles used for data retrievers cannot be used for data sources. If you are unsure about the role's usage, create a new IAM role to prevent errors.
- When using AWS KMS keys in your application, ensure that the IAM role for your application environment has the necessary permissions to describe, encrypt, and decrypt data using the key.
- Amazon Q Business enhances security by using Secrets Manager to verify the endpoint information used to access on-premises or server data sources. This prevents the "confused deputy" problem, where users without direct access might gain access indirectly through a proxy. When the endpoint changes, a new secret is needed in Secrets Manager to reflect the updated information.
- Most data sources support regular expression patterns, referred to as filters, that include or exclude content.
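The last item can be illustrated with a minimal sketch of inclusion and exclusion filters using Python's `re` module. The function name, the sample paths, and the patterns are hypothetical. The sketch assumes that an exclusion match takes precedence over an inclusion match and that an empty inclusion list admits everything; confirm the exact behavior for your connector.

```python
import re


def should_index(path: str, inclusion: list[str], exclusion: list[str]) -> bool:
    """Return True if `path` passes the inclusion and exclusion filters.

    Assumed semantics for this sketch: an exclusion match always wins,
    and an empty inclusion list admits every path.
    """
    if any(re.search(pattern, path) for pattern in exclusion):
        return False
    if not inclusion:
        return True
    return any(re.search(pattern, path) for pattern in inclusion)


# Hypothetical sample data.
paths = [
    "reports/2024/summary.pdf",
    "reports/2024/draft-summary.pdf",
    "images/logo.png",
]
inclusion_patterns = [r"\.pdf$"]  # only PDF files
exclusion_patterns = [r"draft"]   # skip anything marked as a draft

indexed = [p for p in paths if should_index(p, inclusion_patterns, exclusion_patterns)]
print(indexed)  # ['reports/2024/summary.pdf']
```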
Understanding User Store
Amazon Q Business has a User Store feature that ensures users see only chat responses generated from documents they have access to within the application. In other words, responses are limited to the data each user is authorized to view.
How the User Store works
The following points describe how the Amazon Q Business User Store works:
- In Amazon Q Business, each document in a data source has access control list (ACL) information attached to it as metadata.
- The ACLs contain information about which users and groups have access to a document.
- Connectors crawl and use the ACL information from your data source.
- Re-sync your data source to capture ACL changes and ensure correct user access.
- Amazon Q Business crawls user and group information from each data source and maps it internally.
- User and group information is stored in the User Store, where it is matched against document access details.
- If you delete a group in the User Store and later re-create it with the same name but different group members, document ACLs that contain this group may be impacted.
- Delete the old user from the User Store if a new user has the same email address. Amazon Q Business will verify user attributes and deny access if there are discrepancies.
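The matching of users and groups against document ACLs can be pictured with a small in-memory model. This is only a conceptual sketch under simplifying assumptions, not how Amazon Q Business implements the User Store; the class names, fields, and sample users are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    # ACL metadata attached to the document.
    allowed_users: set[str] = field(default_factory=set)
    allowed_groups: set[str] = field(default_factory=set)


@dataclass
class UserStore:
    # Maps a user's email address to the groups that user belongs to.
    groups_by_user: dict[str, set[str]] = field(default_factory=dict)

    def add_user(self, email: str, groups: set[str]) -> None:
        self.groups_by_user[email] = set(groups)

    def can_access(self, email: str, doc: Document) -> bool:
        if email not in self.groups_by_user:
            return False  # users unknown to the store see nothing
        if email in doc.allowed_users:
            return True
        # Access is granted if the user shares at least one group with the ACL.
        return bool(self.groups_by_user[email] & doc.allowed_groups)


def visible_documents(store: UserStore, email: str, docs: list[Document]) -> list[str]:
    """Return the IDs of the documents the user is allowed to see."""
    return [d.doc_id for d in docs if store.can_access(email, d)]


# Hypothetical sample data.
store = UserStore()
store.add_user("alice@example.com", {"engineering"})
store.add_user("bob@example.com", {"sales"})

docs = [
    Document("design-doc", allowed_groups={"engineering"}),
    Document("sales-plan", allowed_users={"alice@example.com"}, allowed_groups={"sales"}),
    Document("handbook", allowed_groups={"engineering", "sales"}),
]

print(visible_documents(store, "alice@example.com", docs))  # ['design-doc', 'sales-plan', 'handbook']
print(visible_documents(store, "bob@example.com", docs))    # ['sales-plan', 'handbook']
print(visible_documents(store, "carol@example.com", docs))  # []
```

Because group membership is looked up by group name in this model, re-creating a group under the same name with different members would silently change who can see documents whose ACLs name that group, which mirrors the caution in the list above.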
Using Amazon VPC
Amazon Q Business can connect to your virtual private cloud (VPC) to index content stored there. To enable this, you provide the security information that Amazon Q Business needs to access your VPC. Amazon Q Business can then communicate securely with the data source inside your VPC.
Troubleshooting Data Source Connectors
This section describes how to troubleshoot common issues with Amazon Q Business data source connectors.
- My documents were not indexed: Amazon Q Business has a two-step process for indexing data. Errors can occur at either the data source level or at the document level. Data source errors are reported in the console, while document level errors are reported in Amazon CloudWatch Logs. This helps you identify and fix any issues that prevent documents from being indexed.
- My synchronization job failed: Amazon Q Business synchronization jobs can fail due to configuration errors in the index or the data source. These errors are usually related to insufficient IAM permissions for Amazon Q Business to access the resources it needs. The error message in the Sync run history section of the data source details page provides details about the missing permissions. The following are some of the error messages that you might receive:
  - Failed to create log group for job. Please make sure that the IAM role provided has sufficient permissions.
  - Failed to access Amazon S3 file prefix (bucket name) while trying to crawl your metadata files. Please make sure the IAM role (ARN) provided has sufficient permissions.
  - The provided IAM role (ARN) could not be assumed. Please make sure Amazon Q Business is a trusted entity that is allowed to assume the role.
- My synchronization job is incomplete: To troubleshoot an incomplete synchronization job, start with your CloudWatch logs.
  - From the details column, choose View details in CloudWatch.
  - Review the error messages to see what caused the document to fail.
- My synchronization job succeeded but there are no indexed documents: Possible reasons and checks include the following:
  - Some documents failed to synchronize. Check the CloudWatch DocumentsSubmittedForIndexingFailed metric to see whether any documents failed, and check your CloudWatch logs for details.
  - For an Amazon S3 data source, you might have given Amazon Q Business the wrong bucket name or prefix. Make sure that the S3 bucket that Amazon Q Business is using is the bucket that contains the documents to index.
  - The documents failed to be indexed in an earlier job and haven't changed since. When re-indexing such a document, Amazon Q Business won't index it unless you've changed the document or its associated metadata file.
- I am running into file format issues while syncing my data source: If you run into file format issues while adding files to your data source or syncing your data source, make sure that your document types are supported by Amazon Q Business.
- I am getting an AccessDenied When Using SSL Certificate File error message: If you are getting an "access denied" error when using an SSL certificate with your data source, check if the IAM role has the necessary permissions to access the certificate file. If the certificate is encrypted with an AWS KMS key, ensure that your IAM role also has permissions to decrypt the certificate using the AWS KMS key.
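For the "could not be assumed" error listed above, the usual cause is a missing or incorrect trust policy on the IAM role. The sketch below builds an example trust policy document that names Amazon Q Business as a trusted entity. The service principal `qbusiness.amazonaws.com`, `sts:AssumeRole`, and the `aws:SourceAccount` and `aws:SourceArn` condition keys are standard IAM elements; the helper function, account ID, and application ARN are hypothetical, and the exact conditions your role needs should be taken from the IAM role documentation for your connector.

```python
import json


def build_trust_policy(account_id: str, application_arn: str) -> dict:
    """Build a trust policy that lets Amazon Q Business assume a role.

    The conditions restrict use of the role to one account and one
    application, which helps guard against the confused deputy problem.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "qbusiness.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {"aws:SourceArn": application_arn},
                },
            }
        ],
    }


# Hypothetical account ID and application ARN, for illustration only.
policy = build_trust_policy(
    account_id="111122223333",
    application_arn="arn:aws:qbusiness:us-east-1:111122223333:application/example-app-id",
)
print(json.dumps(policy, indent=2))
```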