Quickly build a document processing agent with AWS Bedrock agents

dashapetr/document-agent

Building an Amazon Bedrock Agent for Document Processing

Welcome! In this guide we build a document assistant using Amazon Bedrock Agents.

Agent Architecture Diagram

What Document Processing Agent Can Do

Basic tasks:

  1. Summarize: given a document, return its summary
  2. Retrieve info: question-answer based on a document
  3. Analyze data: simple quantitative data analysis
  4. Generate plot: based on the document data
  5. Make changes: adjust content formats

Alter dates

Answer questions

Advanced tasks:

  1. AWS services: integrate with various services, e.g. Textract, Translate, etc.
  2. Complex pipelines: accomplish multi-step changes and tasks
  3. Knowledge bases: RAG-based search across documents

Complex task

Query KB

What is a Bedrock Agent?

Agents for Amazon Bedrock help you accelerate generative artificial intelligence (AI) application development by orchestrating multistep tasks. Agents extend foundation models (FMs) to understand user requests, break down complex tasks into multiple steps, carry on a conversation to collect additional information, and take actions, such as making API calls, to fulfill the request.

Let's build!

1️⃣ Step 1: Create Bedrock agent

Prerequisites: we will use Amazon Nova Lite v1 as the model. Make sure that you have access to it in your account.

  1. Go to Bedrock console, select Agents in the left navigation panel, then click on the Create Agent button
  2. Provide agent-for-document-processing as Agent name, provide a description (optional). Click on the Create button.

Create agent

  3. You can now open the Agent builder, the place where you can access and edit the overall configuration of an agent. Select Amazon Nova Lite v1 as the model (Pricing). [Note: if you face any issues with the model, try changing it to a Claude family model.]

Paste the following text as Instructions for the Agent:

You are a document processing agent skilled at extracting key information from documents, translating content, summarizing text, and manipulating data formats. 
Your tasks include finding key points in documents, locating documents in Amazon S3 and querying them, altering date formats in Excel files, summarizing long documents, 
parsing PDFs with Amazon Textract and saving results to Amazon DynamoDB, and translating documents to required languages. 
Use your capabilities to assist users with efficiently processing and analyzing document data.
  4. In Additional settings, select Enabled for Code Interpreter

Edit agent

Leave all the rest as default. Then, choose Save and Exit to update the configuration of the agent. In the test chat, click Prepare to update the agent.

Now, you can test the agent! 🎉

👉 Attach sample-company-report.docx (can be found inside example-documents) and ask:

what are the next crucial action items?

👉 Attach sales_data.xlsx (can be found inside example-documents) to the Code Interpreter and ask:

alter sales dates to american format: instead of using YYYY-MM-DD, use YYYY-DD-MM, output updated file
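For reference, the transformation requested in this prompt is small. Here is a minimal sketch, assuming the dates are stored as text in YYYY-MM-DD form. The function name is hypothetical, and the code that the Code Interpreter actually generates may look different.

```python
from datetime import datetime


def to_year_day_month(date_text: str) -> str:
    """Convert 'YYYY-MM-DD' to 'YYYY-DD-MM'.

    Parsing first (instead of swapping substrings) rejects invalid dates.
    """
    parsed = datetime.strptime(date_text, "%Y-%m-%d")
    return parsed.strftime("%Y-%d-%m")


print(to_year_day_month("2024-03-15"))  # 2024-15-03
```

You can use it to spot-check a few rows of the file the agent returns.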

2️⃣ Step 2: Add action group

Prepare testing files in S3 bucket

  1. Go to S3 console and click on the Create bucket button
  2. Give your bucket a unique name, for example bucket-for-documents-NUMBER. Click Create bucket.
  3. Select your created bucket, click on the Upload button.
  4. You can drag and drop invoice.pdf and contrato-servicios.pdf files (can be found inside example-documents) and then click on the Upload button
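If you prefer scripting over the console, the upload can be sketched with boto3. This is a minimal sketch, assuming the bucket already exists and your AWS credentials are configured. The helper name `upload_documents` is hypothetical; `upload_file` is the standard boto3 S3 client call.

```python
from pathlib import Path


def upload_documents(s3_client, bucket: str, paths: list[str]) -> list[str]:
    """Upload each local file to the bucket root and return the S3 URIs."""
    uris = []
    for path in paths:
        key = Path(path).name  # object key = file name, no prefix
        s3_client.upload_file(path, bucket, key)
        uris.append(f"s3://{bucket}/{key}")
    return uris


# Usage (requires boto3 and AWS credentials; change the bucket name):
# import boto3
# upload_documents(
#     boto3.client("s3"),
#     "bucket-for-documents-NUMBER",
#     ["example-documents/invoice.pdf",
#      "example-documents/contrato-servicios.pdf"],
# )
```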

Prepare data in DynamoDB table

  1. Navigate to the DynamoDB console and click on Create table.
  2. Enter invoices-parsed as Table name and doc-name as Partition key. Click on the Create table button.
  3. Once the table is created, we don't need to add values. Agent will add values there.
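The same table can be created from code. A minimal sketch, assuming a String partition key and on-demand billing (both are assumptions, adjust them to match what you pick in the console). The helper name `table_definition` is hypothetical.

```python
def table_definition(table_name: str, partition_key: str = "doc-name") -> dict:
    """Build keyword arguments for the DynamoDB create_table call."""
    return {
        "TableName": table_name,
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"}  # S = String
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


# Usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("dynamodb").create_table(**table_definition("invoices-parsed"))
```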

Let's create one more table:

  1. Go to Tables and click on Create table.
  2. Enter foreign-docs as Table name and doc-name as Partition key. Click on the Create table button.
  3. Click on the created table, then click on the Actions button and select Create item from the drop-down list.
  4. Paste contrato-servicios as doc-name
  5. Click on the Add new attribute button and select String. Paste document_path as Attribute name and s3://bucket-for-documents-NUMBER/contrato-servicios.pdf as value.
  6. Click on the Add new attribute button and select String. Paste language as Attribute name and es as value.
  7. Finally, click on the Create item button.

create item
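The item created in the steps above can also be written with boto3. A minimal sketch, assuming the low-level DynamoDB client, where every attribute value carries a type tag (`S` for String). The helper name `foreign_doc_item` is hypothetical.

```python
def foreign_doc_item(bucket: str) -> dict:
    """Build the foreign-docs item in DynamoDB's typed attribute format."""
    return {
        "doc-name": {"S": "contrato-servicios"},
        "document_path": {"S": f"s3://{bucket}/contrato-servicios.pdf"},
        "language": {"S": "es"},
    }


# Usage (requires boto3 and AWS credentials; change the bucket name):
# import boto3
# boto3.client("dynamodb").put_item(
#     TableName="foreign-docs",
#     Item=foreign_doc_item("bucket-for-documents-NUMBER"),
# )
```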

Prepare Lambda function

The Lambda function manages the logic required for complex actions. Its code contains a set of APIs that the Bedrock agent will call. The function then formats the response and sends it back to the agent.
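To make the request and response shapes concrete, here is a minimal handler skeleton for an action group defined with function details. It is a sketch, not the repository's lambda_function.py: the helper names are hypothetical and the actual work is replaced by a placeholder string. The agent passes parameters as a list of name/type/value entries and expects the result under `functionResponse.responseBody.TEXT.body`.

```python
def parameters_to_dict(event: dict) -> dict:
    """Flatten the agent's parameter list into {name: value}."""
    return {p["name"]: p["value"] for p in event.get("parameters", [])}


def build_response(event: dict, body_text: str) -> dict:
    """Wrap a text result in the structure the agent expects back."""
    return {
        "messageVersion": event.get("messageVersion", "1.0"),
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {
                "responseBody": {"TEXT": {"body": body_text}}
            },
        },
    }


def lambda_handler(event, context):
    params = parameters_to_dict(event)
    # Placeholder: a real handler would dispatch on event["function"]
    # and call Textract, Translate, S3, or DynamoDB here.
    result = f"Called {event['function']} with {sorted(params)}"
    return build_response(event, result)
```

If the agent reports an error while processing the Lambda response, comparing your output against this structure is a good first check.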

  1. Navigate to the Lambda Console and click on Create function button.
  2. Paste document-agent-action-group-lambda as a function name and choose Python 3.11 as a runtime
  3. Click on the Create function button at the bottom of the page

Update permissions:

  1. Once the function is created, click on the Configuration Tab in the same page and Choose Permissions from the left side panel
  2. Click on Add permissions button in Resource-based policy statement section to provide the permission to invoke lambda functions from Bedrock
  3. Select AWS service, Other as Service, provide any valid Statement Id. Provide bedrock.amazonaws.com as Principal, your agent ARN as Source ARN, and Action as lambda:InvokeFunction. Click Save.

update permissions

  4. Under the Execution role, click on the role link. Under Permission policies, click on Add permissions button, select Attach policies.
  5. Search for TranslateReadOnly, then click on Add permissions. Now Lambda can call Translate.
  6. Under Permission policies, click on Add permissions button, select Create inline policy.
  7. Click on JSON, then paste JSON from the lambda-policy/DocumentLambdaServicePolicy.json.
  8. Click Next, name the policy DocumentLambdaServicePolicy and click on Create policy. Now your Lambda has required access to Textract, S3, DynamoDB.
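The resource-based policy statement that lets Bedrock invoke the function can also be added with boto3's `add_permission`. A minimal sketch: the helper name and the statement ID are hypothetical, and the agent ARN in the usage comment is a placeholder to replace with your own.

```python
def bedrock_invoke_permission(function_name: str, agent_arn: str) -> dict:
    """Build keyword arguments for the Lambda add_permission call."""
    return {
        "FunctionName": function_name,
        "StatementId": "allow-bedrock-agent",  # any valid statement ID works
        "Action": "lambda:InvokeFunction",
        "Principal": "bedrock.amazonaws.com",
        "SourceArn": agent_arn,  # restricts invocation to this agent
    }


# Usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("lambda").add_permission(
#     **bedrock_invoke_permission(
#         "document-agent-action-group-lambda",
#         "arn:aws:bedrock:REGION:ACCOUNT_ID:agent/AGENT_ID",  # placeholder
#     )
# )
```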

Adjust the timeout:

  1. Choose General configuration from the left side panel
  2. Click on the Edit button
  3. Modify Timeout by increasing it to 3 minutes 30 seconds to make sure your Lambda won't fail while waiting for document parsing.
  4. Click on the Save button
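The Lambda API expresses the timeout in seconds, so 3 minutes 30 seconds becomes 210. A minimal sketch of the same change with boto3, where the helper name `lambda_timeout_update` is hypothetical:

```python
from datetime import timedelta

# 3 min 30 s, converted to the whole seconds the Lambda API expects
TIMEOUT_SECONDS = int(timedelta(minutes=3, seconds=30).total_seconds())


def lambda_timeout_update(function_name: str) -> dict:
    """Build keyword arguments for update_function_configuration."""
    return {"FunctionName": function_name, "Timeout": TIMEOUT_SECONDS}


# Usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("lambda").update_function_configuration(
#     **lambda_timeout_update("document-agent-action-group-lambda")
# )
```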

Add code:

  1. Now you can go back to the Lambda function, click on the Code tab in the same page
  2. Copy the code from lambda_function.py and use it to replace the code in the code editor.
  3. Click on the Deploy button.

Test your Lambda:

Since our Lambda contains a set of APIs, you may want to create several test events to test each API.

  1. Click on the Test tab near the top of the page.
  2. Fill in Event name: extract-text
  3. Paste the code from lambda-payloads/extract-text.json in Event JSON window. DON'T FORGET TO CHANGE S3 BUCKET NAME! This will be a test event for the extract_text_from_pdf API that matches how the Agent will send a request.
  4. Click on Save and then Test to execute the Lambda function. You should see the results of the successful function invocation.

Lambda test results

  5. Click on Create new event button and repeat steps 2-4 to add one more test event (you can find JSON payload in the lambda-payloads/translate-test.json)

Create action group

An action group is a toolbox that defines actions the agent can help the user perform.

One agent can have up to 20 action groups - see Bedrock quotas.

To create Action group: in the Agent builder choose Add in the Action groups section.

Use extract-and-translate-action-group as Action group name.

Use the following description:

The action group contains tools that parse PDFs with Amazon Textract and save results to Amazon DynamoDB, and translate documents to required languages.

Add action group

In the Action group type section, select Define with function details.

In the Action group invocation section, choose Select an existing Lambda function and pick document-agent-action-group-lambda as the Lambda function.

invocation action group

We will add 2 action group functions.

Click on the Add action group function, then select JSON Editor and paste the following:

{
  "name": "extract_text_from_pdf",
  "description": "Parse a PDF with Amazon Textract and save results to DynamoDB",
  "parameters": {
    "s3_path": {
      "description": "The path to the PDF file",
      "required": "True",
      "type": "String"
    },
    "table_name": {
      "description": "The DynamoDB table name to save results to",
      "required": "True",
      "type": "String"
    }
  },
  "requireConfirmation": "DISABLED"
}

json function action group

Next, click on the Add action group function, then select JSON Editor and paste the following:

{
  "name": "translate_document",
  "description": "Retrieve a PDF document by its name, check its language, translate if needed, save and update DynamoDB.",
  "parameters": {
    "document_name": {
      "description": "PDF document name",
      "required": "True",
      "type": "String"
    },
    "table_name": {
      "description": "The DynamoDB table name to save results to",
      "required": "True",
      "type": "String"
    }
  },
  "requireConfirmation": "DISABLED"
}
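If you later want to create the action group from code instead of the console, the two definitions above need a small conversion. As far as we know, the `bedrock-agent` API expects `required` as a boolean and `type` in lowercase, while the JSON editor shows them as the strings "True" and "String". The sketch below makes that assumption explicit; the helper name `to_api_function` is hypothetical, and the IDs in the usage comment are placeholders.

```python
def to_api_function(console_def: dict) -> dict:
    """Convert a JSON-editor function definition to the API's shape."""
    parameters = {
        name: {
            "description": spec["description"],
            "type": spec["type"].lower(),  # "String" -> "string"
            "required": str(spec["required"]).lower() == "true",  # "True" -> True
        }
        for name, spec in console_def["parameters"].items()
    }
    return {
        "name": console_def["name"],
        "description": console_def["description"],
        "parameters": parameters,
        "requireConfirmation": console_def.get("requireConfirmation", "DISABLED"),
    }


# Usage (requires boto3 and AWS credentials; IDs and ARN are placeholders):
# import boto3
# boto3.client("bedrock-agent").create_agent_action_group(
#     agentId="AGENT_ID",
#     agentVersion="DRAFT",
#     actionGroupName="extract-and-translate-action-group",
#     actionGroupExecutor={"lambda": "LAMBDA_FUNCTION_ARN"},
#     functionSchema={"functions": [to_api_function(console_definition)]},
# )
```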

Click on Save and exit to exit from Action group editing.

Next, click on Save and exit to exit from the Agent builder.

In the test chat, click Prepare to update the agent.

Now, you can test the agent! 🎉

👉 Go to the chat with agent and ask:

translate the contrato-servicios document, you can find it in foreign-docs table

👉 Go to the chat with agent and ask (DON'T FORGET TO CHANGE THE BUCKET NAME!):

extract text from file s3://bucket-for-documents-NUMBER/invoice.pdf and update the invoices-parsed table
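The same prompts can be sent outside the console through the `bedrock-agent-runtime` client. A minimal sketch, assuming you pass in the client yourself: `invoke_agent` streams the answer as events, and the text arrives in `chunk` events as bytes. The helper name `ask_agent` is hypothetical, and the agent ID in the usage comment is a placeholder.

```python
import uuid


def ask_agent(runtime_client, agent_id, agent_alias_id, prompt, session_id=None) -> str:
    """Send one prompt to the agent and return the concatenated answer."""
    response = runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id or str(uuid.uuid4()),  # reuse to keep context
        inputText=prompt,
    )
    parts = []
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:  # other event types (e.g. traces) carry no answer text
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)


# Usage (requires boto3 and AWS credentials; "TSTALIASID" is the test alias
# that points at the working draft, AGENT_ID is a placeholder):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# print(ask_agent(client, "AGENT_ID", "TSTALIASID",
#                 "translate the contrato-servicios document, "
#                 "you can find it in foreign-docs table"))
```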

3️⃣ Step 3: Add knowledge base

Prepare Knowledge base files in S3 bucket

  1. Go to S3 console and click on the Create bucket button
  2. Give your bucket a unique name, for example knowledge-base-NUMBER. Click Create bucket.
  3. Select your created bucket, click on the Upload button.
  4. You can drag and drop how-many-days-to-pay.docx and po-creation.docx files (can be found inside knowledge-base-docs) and then click on the Upload button

Create Knowledge base

  1. Go to Bedrock console, choose Builder tools -> Knowledge Bases from the navigation pane.
  2. Click on Create and select Knowledge base with vector store
  3. Specify knowledge-base-internal-docs as name, select Amazon S3 as data source, click Next.

KB source

  4. Add 2 data sources: these should be the .docx files that you uploaded to the knowledge-base-NUMBER S3 bucket. Specify the S3 URI location for each of them and leave the rest as default. Click Next

KB data source

  5. Specify any embedding model (make sure you have enabled access to it). Select Quick create a new vector store and Amazon OpenSearch Serverless, click Next.
  6. After you review and create your knowledge base, make sure to Sync data sources and select a model to test.
  7. In the chat, write: days to pay for US to make sure your Knowledge base works.

KB test
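You can run the same check from code with the `retrieve` call of the `bedrock-agent-runtime` client, which returns the matching passages without generating an answer. A minimal sketch: the helper name `top_passages` is hypothetical, and the knowledge base ID in the usage comment is a placeholder.

```python
def top_passages(runtime_client, knowledge_base_id: str, query: str, limit: int = 3):
    """Return (score, text) pairs for the best-matching passages."""
    response = runtime_client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": limit}
        },
    )
    return [
        (result.get("score"), result["content"]["text"])
        for result in response["retrievalResults"]
    ]


# Usage (requires boto3 and AWS credentials; the ID is a placeholder):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# for score, text in top_passages(client, "KNOWLEDGE_BASE_ID", "days to pay for US"):
#     print(score, text[:80])
```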

Add Knowledge base to the agent

  1. Go back to your agent, open Agent builder. Under the Knowledge bases, click Add
  2. Select knowledge-base-internal-docs and provide the following as Instructions:
Search in knowledge base information that is specific to a company or certain country
  3. Click Add. You can add up to 2 Knowledge bases per agent.
  4. In the test chat, click Prepare to update the agent.

Now, you can test the agent! 🎉

👉 Go to the chat with agent and ask:

purchase order number generation

Possible errors

1. Incorrectly formatted overridden prompt:

Incorrect prompt

To resolve: enable Code Interpreter in the agent's Additional settings

2. Access denied while invoking Lambda function:

Access denied

To resolve: add a resource-based policy statement on the Lambda

3. Error processing the Lambda response:

lambda response

To resolve: Check Lambda output format
