data_sources
Creates, updates, deletes, or gets a data_source resource, or lists data_sources in a region.
Overview
| Name | data_sources |
| Type | Resource |
| Description | Definition of AWS::Bedrock::DataSource Resource Type |
| Id | awscc.bedrock.data_sources |
Fields
- get (all properties)
- list (identifiers only)
| Name | Datatype | Description |
|---|---|---|
| data_source_configuration | object | Specifies a raw data source location to ingest. |
| data_source_id | string | Identifier for a resource. |
| description | string | Description of the Resource. |
| knowledge_base_id | string | The unique identifier of the knowledge base to which to add the data source. |
| data_source_status | string | The status of a data source. |
| name | string | The name of the data source. |
| server_side_encryption_configuration | object | Contains details about the server-side encryption for the data source. |
| vector_ingestion_configuration | object | Details about how to chunk the documents in the data source. A chunk refers to an excerpt from a data source that is returned when the knowledge base that it belongs to is queried. |
| data_deletion_policy | string | The deletion policy for the data source. |
| created_at | string | The time at which the data source was created. |
| updated_at | string | The time at which the data source was last updated. |
| failure_reasons | array | The details of the failure reasons related to the data source. |
| region | string | AWS region. |
| Name | Datatype | Description |
|---|---|---|
| data_source_id | string | Identifier for a resource. |
| knowledge_base_id | string | The unique identifier of the knowledge base to which to add the data source. |
| region | string | AWS region. |
For more information, see AWS::Bedrock::DataSource.
Methods
| Name | Resource | Accessible by | Required Params |
|---|---|---|---|
| create_resource | data_sources | INSERT | DataSourceConfiguration, Name, KnowledgeBaseId, region |
| delete_resource | data_sources | DELETE | Identifier, region |
| update_resource | data_sources | UPDATE | Identifier, PatchDocument, region |
| list_resources | data_sources_list_only | SELECT | region |
| get_resource | data_sources | SELECT | Identifier, region |
SELECT examples
- get (all properties)
- list (identifiers only)
Gets all properties from an individual data_source.
SELECT
region,
data_source_configuration,
data_source_id,
description,
knowledge_base_id,
data_source_status,
name,
server_side_encryption_configuration,
vector_ingestion_configuration,
data_deletion_policy,
created_at,
updated_at,
failure_reasons
FROM awscc.bedrock.data_sources
WHERE
region = '{{ region }}' AND
Identifier = '{{ knowledge_base_id }}|{{ data_source_id }}';
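The `Identifier` in the query above joins `knowledge_base_id` and `data_source_id` with a pipe character. As a minimal sketch, the value can be composed and split in Python as follows. The helper names `make_identifier` and `split_identifier` and the sample IDs are hypothetical and are not part of StackQL.

```python
def make_identifier(knowledge_base_id: str, data_source_id: str) -> str:
    """Join the two key fields with '|', matching the WHERE clause above."""
    for part in (knowledge_base_id, data_source_id):
        if not part or "|" in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return f"{knowledge_base_id}|{data_source_id}"


def split_identifier(identifier: str) -> tuple:
    """Split a composite identifier into (knowledge_base_id, data_source_id)."""
    knowledge_base_id, sep, data_source_id = identifier.partition("|")
    if not sep or not knowledge_base_id or not data_source_id or "|" in data_source_id:
        raise ValueError(f"expected '<knowledge_base_id>|<data_source_id>', got {identifier!r}")
    return knowledge_base_id, data_source_id


# Sample IDs are placeholders, not real resources.
identifier = make_identifier("KB12345678", "DS12345678")
print(identifier)
```

The resulting string is what replaces the `{{ knowledge_base_id }}|{{ data_source_id }}` template in the get, update, and delete queries.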
Lists all data_sources in a region.
SELECT
region,
knowledge_base_id,
data_source_id
FROM awscc.bedrock.data_sources_list_only
WHERE
region = '{{ region }}';
INSERT example
Use the following StackQL query and manifest file to create a new data_source resource, using stack-deploy.
- Required Properties
- All Properties
- Manifest
/*+ create */
INSERT INTO awscc.bedrock.data_sources (
DataSourceConfiguration,
KnowledgeBaseId,
Name,
region
)
SELECT
'{{ data_source_configuration }}',
'{{ knowledge_base_id }}',
'{{ name }}',
'{{ region }}'
RETURNING
ErrorCode,
EventTime,
Identifier,
Operation,
OperationStatus,
RequestToken,
ResourceModel,
RetryAfter,
StatusMessage,
TypeName
;
/*+ create */
INSERT INTO awscc.bedrock.data_sources (
DataSourceConfiguration,
Description,
KnowledgeBaseId,
Name,
ServerSideEncryptionConfiguration,
VectorIngestionConfiguration,
DataDeletionPolicy,
region
)
SELECT
'{{ data_source_configuration }}',
'{{ description }}',
'{{ knowledge_base_id }}',
'{{ name }}',
'{{ server_side_encryption_configuration }}',
'{{ vector_ingestion_configuration }}',
'{{ data_deletion_policy }}',
'{{ region }}'
RETURNING
ErrorCode,
EventTime,
Identifier,
Operation,
OperationStatus,
RequestToken,
ResourceModel,
RetryAfter,
StatusMessage,
TypeName
;
version: 1
name: stack name
description: stack description
providers:
  - aws
globals:
  - name: region
    value: '{{ vars.AWS_REGION }}'
resources:
  - name: data_source
    props:
      - name: data_source_configuration
        value:
          type: '{{ type }}'
          s3_configuration:
            bucket_arn: '{{ bucket_arn }}'
            inclusion_prefixes:
              - '{{ inclusion_prefixes[0] }}'
            bucket_owner_account_id: '{{ bucket_owner_account_id }}'
          confluence_configuration:
            source_configuration:
              host_url: '{{ host_url }}'
              host_type: '{{ host_type }}'
              auth_type: '{{ auth_type }}'
              credentials_secret_arn: '{{ credentials_secret_arn }}'
            crawler_configuration:
              filter_configuration:
                type: '{{ type }}'
                pattern_object_filter:
                  filters:
                    - object_type: '{{ object_type }}'
                      inclusion_filters:
                        - '{{ inclusion_filters[0] }}'
                      exclusion_filters: null
          salesforce_configuration:
            source_configuration:
              host_url: '{{ host_url }}'
              auth_type: '{{ auth_type }}'
              credentials_secret_arn: '{{ credentials_secret_arn }}'
            crawler_configuration:
              filter_configuration: null
          share_point_configuration:
            source_configuration:
              site_urls:
                - '{{ site_urls[0] }}'
              host_type: '{{ host_type }}'
              auth_type: '{{ auth_type }}'
              credentials_secret_arn: '{{ credentials_secret_arn }}'
              tenant_id: '{{ tenant_id }}'
              domain: '{{ domain }}'
            crawler_configuration:
              filter_configuration: null
          web_configuration:
            source_configuration:
              url_configuration:
                seed_urls:
                  - url: '{{ url }}'
            crawler_configuration:
              crawler_limits:
                rate_limit: '{{ rate_limit }}'
                max_pages: '{{ max_pages }}'
              inclusion_filters: null
              exclusion_filters: null
              scope: '{{ scope }}'
              user_agent: '{{ user_agent }}'
              user_agent_header: '{{ user_agent_header }}'
      - name: description
        value: '{{ description }}'
      - name: knowledge_base_id
        value: '{{ knowledge_base_id }}'
      - name: name
        value: '{{ name }}'
      - name: server_side_encryption_configuration
        value:
          kms_key_arn: '{{ kms_key_arn }}'
      - name: vector_ingestion_configuration
        value:
          chunking_configuration:
            chunking_strategy: '{{ chunking_strategy }}'
            fixed_size_chunking_configuration:
              max_tokens: '{{ max_tokens }}'
              overlap_percentage: '{{ overlap_percentage }}'
            hierarchical_chunking_configuration:
              level_configurations:
                - max_tokens: '{{ max_tokens }}'
              overlap_tokens: '{{ overlap_tokens }}'
            semantic_chunking_configuration:
              breakpoint_percentile_threshold: '{{ breakpoint_percentile_threshold }}'
              buffer_size: '{{ buffer_size }}'
              max_tokens: '{{ max_tokens }}'
          custom_transformation_configuration:
            intermediate_storage:
              s3_location:
                uri: '{{ uri }}'
            transformations:
              - step_to_apply: '{{ step_to_apply }}'
                transformation_function:
                  transformation_lambda_configuration:
                    lambda_arn: '{{ lambda_arn }}'
          parsing_configuration:
            parsing_strategy: '{{ parsing_strategy }}'
            bedrock_foundation_model_configuration:
              model_arn: '{{ model_arn }}'
              parsing_prompt:
                parsing_prompt_text: '{{ parsing_prompt_text }}'
              parsing_modality: '{{ parsing_modality }}'
            bedrock_data_automation_configuration:
              parsing_modality: null
          context_enrichment_configuration:
            type: '{{ type }}'
            bedrock_foundation_model_configuration:
              enrichment_strategy_configuration:
                method: '{{ method }}'
              model_arn: null
      - name: data_deletion_policy
        value: '{{ data_deletion_policy }}'
UPDATE example
Use the following StackQL query and manifest file to update a data_source resource, using stack-deploy.
/*+ update */
UPDATE awscc.bedrock.data_sources
SET PatchDocument = string('{{ {
"Description": description,
"Name": name,
"ServerSideEncryptionConfiguration": server_side_encryption_configuration,
"DataDeletionPolicy": data_deletion_policy
} | generate_patch_document }}')
WHERE
region = '{{ region }}' AND
Identifier = '{{ knowledge_base_id }}|{{ data_source_id }}'
RETURNING
ErrorCode,
EventTime,
Identifier,
Operation,
OperationStatus,
RequestToken,
ResourceModel,
RetryAfter,
StatusMessage,
TypeName
;
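The `PatchDocument` parameter carries a JSON Patch (RFC 6902) document, which is the update format Cloud Control API accepts. The Python sketch below shows one plausible shape for such a document. It assumes each supplied property becomes an `add` operation on a top-level path; the function name `build_patch_document` and the sample values are hypothetical, and the actual output of the `generate_patch_document` filter may differ.

```python
import json


def build_patch_document(properties: dict) -> str:
    """Return a JSON Patch string with one 'add' operation per non-None property."""
    operations = [
        {"op": "add", "path": f"/{key}", "value": value}
        for key, value in properties.items()
        if value is not None  # properties left unset are not patched
    ]
    return json.dumps(operations)


# Sample values are placeholders for illustration only.
patch = build_patch_document({
    "Description": "Product manuals",
    "Name": "manuals",
    "ServerSideEncryptionConfiguration": None,
    "DataDeletionPolicy": "RETAIN",
})
print(patch)
```

Only the mutable properties named in the UPDATE statement above appear in the patch; a property whose value is `None` is skipped rather than overwritten.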
DELETE example
/*+ delete */
DELETE FROM awscc.bedrock.data_sources
WHERE
Identifier = '{{ knowledge_base_id }}|{{ data_source_id }}' AND
region = '{{ region }}'
RETURNING
ErrorCode,
EventTime,
Identifier,
Operation,
OperationStatus,
RequestToken,
ResourceModel,
RetryAfter,
StatusMessage,
TypeName
;
Additional Parameters
Mutable resources in the Cloud Control provider support additional optional parameters which can be supplied with INSERT, UPDATE, or DELETE operations. These include:
| Parameter | Description |
|---|---|
| ClientToken | A unique identifier to ensure the idempotency of the resource request. This allows the provider to accurately distinguish between retries and new requests. A client token is valid for 36 hours once used. After that, a resource request with the same client token is treated as a new request. If you do not specify a client token, one is generated for inclusion in the request. |
| RoleArn | The ARN of the IAM role used to perform this resource operation. The role specified must have the permissions required for this operation. If you do not specify a role, a temporary session is created using your AWS user credentials. |
| TypeVersionId | For private resource types, the type version to use in this resource operation. If you do not specify a resource version, the default version is used. |
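A client token gives idempotency only when the same value is reused for retries of one logical request and a fresh value is used for each new request. A minimal Python sketch, assuming a random UUID is an acceptable token value (the function name `new_client_token` is illustrative):

```python
import uuid


def new_client_token() -> str:
    """Return a random UUID string to use as ClientToken for one logical request."""
    return str(uuid.uuid4())


# Generate once per logical request and keep it for any retries of that request.
token = new_client_token()
retry_token = token  # a retry reuses the same token
next_request_token = new_client_token()  # a new request gets a new token
```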
Permissions
To operate on the data_sources resource, the following permissions are required:
- Create: bedrock:CreateDataSource, bedrock:GetDataSource, bedrock:GetKnowledgeBase, kms:GenerateDataKey
- Read: bedrock:GetDataSource
- Update: bedrock:GetDataSource, bedrock:UpdateDataSource, kms:GenerateDataKey
- Delete: bedrock:GetDataSource, bedrock:DeleteDataSource
- List: bedrock:ListDataSources
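As a sketch of how the actions listed above could be combined into an IAM policy document, the Python below groups them by operation and emits a single `Allow` statement. The grouping follows the order of the lists above, the name `build_policy` is hypothetical, and `"Resource": "*"` is a placeholder assumption that should be narrowed to specific ARNs in practice.

```python
import json

# Actions per operation, taken from the permission lists above.
ACTIONS = {
    "Create": [
        "bedrock:CreateDataSource",
        "bedrock:GetDataSource",
        "bedrock:GetKnowledgeBase",
        "kms:GenerateDataKey",
    ],
    "Read": ["bedrock:GetDataSource"],
    "Update": [
        "bedrock:GetDataSource",
        "bedrock:UpdateDataSource",
        "kms:GenerateDataKey",
    ],
    "Delete": ["bedrock:GetDataSource", "bedrock:DeleteDataSource"],
    "List": ["bedrock:ListDataSources"],
}


def build_policy(operations: list) -> dict:
    """Build an IAM policy allowing the de-duplicated actions for the given operations."""
    actions = sorted({action for op in operations for action in ACTIONS[op]})
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Resource "*" is a placeholder; restrict it for real use.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


# Example: a read-only policy covering SELECT on both resources.
print(json.dumps(build_policy(["Read", "List"]), indent=2))
```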