Summary
Request to add Amazon Bedrock as a supported LLM provider for both the indexing (tree generation) and retrieval phases.
Motivation
Many enterprise users operate within AWS environments and would benefit from using Amazon Bedrock for:
- Data residency: Keep document processing within AWS regions
- Unified billing: Consolidate LLM costs under existing AWS accounts
- Model choice: Access to Claude (Anthropic), Amazon Nova, Llama, and other models via a single API
- Enterprise compliance: Leverage existing AWS security controls and IAM policies
Current State
- Upstream PageIndex only supports OpenAI for tree generation
- A community fork adds Bedrock support for queries only
- PR #43 ("Feat: Add multi-provider LLM support (OpenAI + Gemini)") introduces a multi-provider abstraction layer that could be extended
Proposed Implementation
Extend the LLMProvider abstraction (from PR #43) to support Amazon Bedrock:
```python
import asyncio

import boto3


class BedrockProvider(LLMProvider):
    def __init__(self, model: str, region: str = "us-east-1"):
        self.client = boto3.client("bedrock-runtime", region_name=region)
        self.model = model

    def call(self, prompt: str) -> str:
        response = self.client.converse(
            modelId=self.model,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={"temperature": 0, "maxTokens": 4096},
        )
        return response["output"]["message"]["content"][0]["text"]

    def call_with_finish_reason(self, prompt: str, chat_history=None) -> tuple:
        # Same request as call(), but also surface the stop reason.
        messages = list(chat_history or [])
        messages.append({"role": "user", "content": [{"text": prompt}]})
        response = self.client.converse(
            modelId=self.model,
            messages=messages,
            inferenceConfig={"temperature": 0, "maxTokens": 4096},
        )
        # Map Bedrock stop reasons to the existing OpenAI-style format:
        # end_turn -> stop, max_tokens -> length
        finish_reason = {"end_turn": "stop", "max_tokens": "length"}.get(
            response["stopReason"], response["stopReason"]
        )
        return response["output"]["message"]["content"][0]["text"], finish_reason

    async def call_async(self, prompt: str) -> str:
        # boto3 is synchronous, so run the blocking call in an executor
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.call, prompt)
```
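For reference, a minimal call-site sketch; the direct instantiation below is illustrative only, since the actual construction path would go through PR #43's provider abstraction:

```python
# Hypothetical wiring; how providers are actually constructed depends on PR #43.
provider = BedrockProvider(model="us.anthropic.claude-sonnet-4-20250514-v1:0")
print(provider.call("List the top-level sections of this document."))
```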
Suggested Bedrock Models
| Model | Model ID | Use Case |
|---|---|---|
| Claude Sonnet 4 | us.anthropic.claude-sonnet-4-20250514-v1:0 | High accuracy |
| Claude Haiku | us.anthropic.claude-haiku-4-5-20251001-v1:0 | Cost-efficient |
| Amazon Nova Pro | us.amazon.nova-pro-v1:0 | AWS-native |
| Amazon Nova Lite | us.amazon.nova-lite-v1:0 | Fast/cheap |
Usage Example
```bash
# With Bedrock
python run_pageindex.py --pdf_path doc.pdf --provider bedrock --model us.anthropic.claude-sonnet-4-20250514-v1:0

# With Bedrock (environment-based)
export PAGEINDEX_PROVIDER=bedrock
export AWS_REGION=us-east-1
python run_pageindex.py --pdf_path doc.pdf
```
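The environment-based path could resolve the provider at startup with something like the sketch below; `PAGEINDEX_MODEL` and the default model are assumptions, not existing options:

```python
import os

# Sketch only: PAGEINDEX_MODEL and the fallback model ID are hypothetical.
if os.environ.get("PAGEINDEX_PROVIDER") == "bedrock":
    provider = BedrockProvider(
        model=os.environ.get("PAGEINDEX_MODEL", "us.amazon.nova-lite-v1:0"),
        region=os.environ.get("AWS_REGION", "us-east-1"),
    )
```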
Key Implementation Considerations
- Dependencies: Add `boto3` to requirements.txt
- Authentication: Support IAM roles, the shared credentials file, and environment variables (see the sketch after this list)
- Stop reason mapping: Bedrock uses `end_turn`/`max_tokens` vs OpenAI's `stop`/`length`
- Message format: The Bedrock Converse API uses a `{"content": [{"text": "..."}]}` structure
- Async support: boto3 is synchronous; wrap calls with `loop.run_in_executor()`
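On the authentication point, a minimal sketch relying on boto3's standard credential chain (environment variables, then the shared credentials file, then an attached IAM role), so all three mechanisms work without extra code; `make_bedrock_client` is an illustrative helper name:

```python
import os

import boto3


def make_bedrock_client(region: str | None = None):
    # Region resolution: explicit argument, then AWS_REGION, then a default.
    region = region or os.environ.get("AWS_REGION", "us-east-1")
    # The default session resolves credentials from env vars,
    # ~/.aws/credentials, or an IAM role automatically.
    session = boto3.Session()
    return session.client("bedrock-runtime", region_name=region)
```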
Related
- Issue #90: Support custom models
- Issue #27: Ollama support
- PR #43: Multi-provider LLM support (OpenAI + Gemini)
- PR #37: Cost tracking and Azure OpenAI support
Happy to contribute a PR if this feature is welcome!