TerraVision AI Backend

This is the LLM backend that TerraVision uses to refine diagrams with AI. The infrastructure is an AWS serverless stack defined in Terraform; API Gateway exposes a REST API that streams chat responses powered by AWS Bedrock (Claude).

Architecture

[Architecture diagram: architecture.dot.png]

Overview

This project deploys a scalable, serverless backend for AI-powered chat applications. It proxies requests to AWS Bedrock's Claude models and streams responses back to clients in real time, with built-in rate limiting and cost monitoring.
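
As a rough illustration of the core mechanism, the sketch below shows the Bedrock streaming call in Python with boto3. The deployed handler is Node.js, and the model ID and payload shape here are assumptions for illustration, not taken from this repository.

import json
import boto3

# Bedrock runtime client in the deployment region
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Anthropic messages payload, as accepted by Claude models on Bedrock
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Your prompt here"}],
})

# invoke_model_with_response_stream yields chunks as the model generates them
response = client.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
    body=body,
)

for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk.get("type") == "content_block_delta":
        print(chunk["delta"].get("text", ""), end="", flush=True)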

AWS Services

Service     | Purpose
AWS Bedrock | LLM inference (Claude model)
Lambda      | Serverless request handler (Node.js 20.x)
API Gateway | REST API endpoint with streaming
DynamoDB    | Per-client rate limiting
CloudWatch  | Logging, metrics, and alarms
IAM         | Role-based access control

Project Structure

.
├── main.tf                # Terraform block and provider
├── backend.tf             # S3 remote state backend
├── variables.tf           # Input variables
├── lambda.tf              # Lambda function, IAM, permissions
├── api_gateway.tf         # API Gateway resources
├── cloudwatch.tf          # Log group and metric alarms
├── dynamodb.tf            # DynamoDB rate limiting table
├── outputs.tf             # Output values
├── terraform.tfvars       # Variable values (model ID)
├── aws-sso.sh             # AWS SSO credential helper
├── testbackend.py         # Python test client
├── test_payload.json      # Sample request payload
├── architecture.dot.png   # Architecture diagram
├── lambda/
│   ├── index.mjs          # Lambda handler (Node.js)
│   ├── package.json       # Node.js dependencies
│   └── package-lock.json  # Dependency lock
└── LICENSE                # MIT License

Prerequisites

  • Terraform >= 1.10 (for native S3 state locking)
  • AWS CLI configured with appropriate credentials
  • AWS account with Bedrock model access enabled (see the check below)
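
To verify Bedrock access before deploying, a quick check such as the following lists the foundation models visible to your account (a Python/boto3 sketch; the region is an assumption):

import boto3

# List the foundation models this account can see in the target region
bedrock = boto3.client("bedrock", region_name="us-east-1")
for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"])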

Configuration

Variable             | Default    | Description
aws_region           | us-east-1  | AWS deployment region
project_name         | llm-app    | Resource naming prefix
bedrock_model_id     | (required) | Bedrock model ID to invoke
rate_limit_per_hour  | 100        | Max requests per client per hour
cost_alert_threshold | 50         | Cost alert threshold (USD)

The required bedrock_model_id is set in terraform.tfvars.

Deployment

# Export AWS SSO credentials
source ./aws-sso.sh

# Create the S3 bucket for remote state (first time only)
aws s3api create-bucket --bucket llm-app-terraform-state --region us-east-1
aws s3api put-bucket-versioning --bucket llm-app-terraform-state \
  --versioning-configuration Status=Enabled

# Initialize Terraform
terraform init

# Review planned changes
terraform plan

# Deploy infrastructure
terraform apply

Usage

API Request

Send a POST request to the /chat endpoint (the base URL is the api_endpoint value from terraform output):

{
  "messages": [
    {"role": "user", "content": "Your prompt here"}
  ],
  "max_tokens": 1000
}
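
A minimal streaming client might look like the sketch below. The base URL is the api_endpoint Terraform output (the value shown is hypothetical), and the exact chunk framing depends on the handler:

import requests

# Hypothetical endpoint; use the api_endpoint value from `terraform output`
API_ENDPOINT = "https://example.execute-api.us-east-1.amazonaws.com/prod"

payload = {
    "messages": [{"role": "user", "content": "Your prompt here"}],
    "max_tokens": 1000,
}

# stream=True reads the response incrementally instead of buffering it
with requests.post(f"{API_ENDPOINT}/chat", json=payload, stream=True) as resp:
    resp.raise_for_status()
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)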

Testing

python testbackend.py

Outputs

Output               | Description
api_endpoint         | REST API Gateway URL
function_url         | Direct Lambda Function URL
api_id               | API Gateway REST API ID
lambda_function_name | Lambda function name
dynamodb_table_name  | Usage tracking table name

Features

  • Response Streaming - Real-time streamed responses for low-latency chat UX
  • Rate Limiting - Per-client request throttling via source IP tracking with automatic TTL cleanup (see the sketch after this list)
  • Cost Monitoring - CloudWatch alarms for cost and error thresholds
  • Dual Access - Both API Gateway and direct Lambda Function URL endpoints
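
The rate-limiting pattern is sketched below in Python for illustration (the actual handler is Node.js, and the table and attribute names here are assumptions): each source IP gets a counter item for the current hour, incremented atomically with a condition on the limit and expired automatically via DynamoDB TTL.

import time
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")
TABLE = "llm-app-rate-limits"  # assumed table name
LIMIT = 100                    # requests per client per hour

def allow_request(source_ip: str) -> bool:
    hour = int(time.time()) // 3600
    key = {"pk": {"S": f"{source_ip}#{hour}"}}
    try:
        # Atomically increment the counter, rejecting once the limit is hit;
        # the TTL attribute lets DynamoDB delete stale counters automatically.
        dynamodb.update_item(
            TableName=TABLE,
            Key=key,
            UpdateExpression="ADD #c :one SET #t = if_not_exists(#t, :ttl)",
            ConditionExpression="attribute_not_exists(#c) OR #c < :limit",
            ExpressionAttributeNames={"#c": "request_count", "#t": "ttl"},
            ExpressionAttributeValues={
                ":one": {"N": "1"},
                ":limit": {"N": str(LIMIT)},
                ":ttl": {"N": str((hour + 2) * 3600)},  # expire after ~2h
            },
        )
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False
        raise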

License

MIT License - see LICENSE for details.
