Production-ready serverless platform for AI-powered code analysis using AWS Bedrock and DevOps best practices
CopyGuard analyzes code snippets to detect AI-generated content using the Claude v2 model on Amazon Bedrock. It is built with enterprise-grade DevOps practices, including Infrastructure as Code, comprehensive monitoring, and production-ready security.
- Blog: Building CopyGuard on AWS
- Demo Video: Watch on Vimeo
- AI-Powered Analysis: Leverages Amazon Bedrock's Claude v2 for intelligent code detection
- Serverless Architecture: Cost-effective, auto-scaling infrastructure
- Enterprise Security: IAM roles, API authentication, and secure access controls
- Production Monitoring: CloudWatch metrics, alarms, and Grafana dashboards
- Global Distribution: CloudFront CDN for optimal performance
- Real-time Metrics: Custom CloudWatch metrics for confidence scores and latency
- Persistent Storage: S3 integration for analysis results and audit trails
- AWS Lambda: Serverless compute with Python 3.11 runtime
- Amazon Bedrock: GenAI foundation models (Claude v2)
- API Gateway v2: HTTP API with CORS support
- CloudFront: Global CDN with Origin Access Control
- S3: Static website hosting and results storage
- Terraform: Infrastructure as Code
- CloudWatch: Logging, metrics, and alerting
- Amazon Managed Grafana: Advanced dashboards with AWS SSO
- IAM: Least-privilege access control
- API Key Authentication: Secure endpoint access
- Python 3.11: Lambda runtime with boto3 SDK
- Regular Expressions: Advanced response parsing
- Error Handling: Production-ready exception management
- Intelligent Response Parsing: Advanced regex patterns for confidence score extraction
- Multi-pattern Detection: Handles various AI detection scenarios
- Custom Metrics: Real-time CloudWatch metrics for monitoring
- S3 Integration: Automatic result storage with timestamps
- Error Handling: Comprehensive exception management
- Performance Tracking: Latency metrics and optimization
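The response-parsing items above can be illustrated with a small sketch. It assumes the model replies in free text such as "human-written with 85% confidence"; the patterns, the `parse_detection` name, and the `Unknown` fallback label are hypothetical and may differ from what `handler.py` actually does.

```python
import re

# Hypothetical patterns, tried in order until one matches.
CONFIDENCE_PATTERNS = [
    re.compile(r"(\d{1,3})\s*%\s*confidence", re.IGNORECASE),
    re.compile(
        r"confidence\s*(?:score|level)?\s*(?:of|is|:)?\s*(\d{1,3})\s*%?",
        re.IGNORECASE,
    ),
]
HUMAN_PATTERN = re.compile(r"\bhuman[- ]written\b", re.IGNORECASE)
AI_PATTERN = re.compile(r"\bAI[- ]generated\b", re.IGNORECASE)


def parse_detection(raw: str) -> dict:
    """Extract a label and a 0-100 confidence score from free-form model text."""
    confidence = None
    for pattern in CONFIDENCE_PATTERNS:
        match = pattern.search(raw)
        if match:
            # Clamp to 100 in case the model emits an out-of-range number.
            confidence = min(int(match.group(1)), 100)
            break

    if HUMAN_PATTERN.search(raw):
        label = "Human-written"
    elif AI_PATTERN.search(raw):
        label = "AI-generated"
    else:
        label = "Unknown"  # assumed fallback when no verdict phrase is found

    return {"label": label, "confidence": confidence, "raw": raw}
```

Trying several patterns in sequence keeps each expression simple and makes it easy to add a new phrasing without touching the existing ones.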
- Modular Design: Reusable components and random suffixes
- Security First: IAM policies with least privilege
- Monitoring Built-in: CloudWatch alarms and log groups
- Cost Optimized: Serverless architecture with proper timeouts
- ConfidenceScore: AI detection confidence percentage
- IsAIGenerated: Binary classification results
- LatencyMs: Response time performance
- Lambda Errors: Automated error alerting
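As a rough illustration of how the custom metrics listed above might be published, the sketch below builds a payload for the CloudWatch `put_metric_data` call. The metric names come from this README; the `CopyGuard` namespace, the units, and both function names are assumptions.

```python
def build_metric_data(confidence: float, is_ai_generated: bool, latency_ms: float) -> list:
    """Build the MetricData list for one analysis request."""
    return [
        {"MetricName": "ConfidenceScore", "Value": float(confidence), "Unit": "Percent"},
        # Encode the binary classification as 1 (AI-generated) or 0 (human-written).
        {"MetricName": "IsAIGenerated", "Value": 1.0 if is_ai_generated else 0.0, "Unit": "Count"},
        {"MetricName": "LatencyMs", "Value": float(latency_ms), "Unit": "Milliseconds"},
    ]


def publish_metrics(metric_data: list, namespace: str = "CopyGuard") -> None:
    """Send the metrics to CloudWatch. The namespace is an assumed value."""
    # Imported lazily so the payload builder can be exercised without AWS access.
    import boto3

    boto3.client("cloudwatch").put_metric_data(Namespace=namespace, MetricData=metric_data)
```

Keeping payload construction separate from the network call makes the metric shape easy to check in a unit test.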
- 60-day log retention: Compliance and debugging
- Error threshold alarms: Proactive issue detection
- Grafana dashboards: Advanced visualization
- S3 audit trail: Complete analysis history
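The audit trail relies on timestamped object keys; the example response in this README shows one of the form `results/2024-01-15T10:30:00.000Z_abc123.json`. A minimal sketch of generating such a key follows. The function names and the six-character random suffix are assumptions rather than the project's confirmed scheme.

```python
import json
import uuid
from datetime import datetime, timezone


def make_result_key(now: datetime, suffix: str) -> str:
    """Format an S3 key as results/<UTC ISO timestamp with milliseconds>Z_<suffix>.json."""
    utc_now = now.astimezone(timezone.utc)
    stamp = utc_now.strftime("%Y-%m-%dT%H:%M:%S")
    millis = utc_now.microsecond // 1000
    return f"results/{stamp}.{millis:03d}Z_{suffix}.json"


def build_result_object(result: dict, now: datetime = None) -> tuple:
    """Return (key, body) ready to be written to the results bucket."""
    now = now or datetime.now(timezone.utc)
    # A short random suffix (assumed) avoids collisions within the same millisecond.
    key = make_result_key(now, uuid.uuid4().hex[:6])
    body = json.dumps({"result": result, "timestamp": now.isoformat()})
    return key, body
```

The returned pair could then be passed to the S3 client's `put_object` call as `Key` and `Body`.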
- AWS CLI configured with appropriate permissions
- Terraform >= 1.0
- Python 3.11
- AWS Bedrock model access enabled
- Clone the repository
  git clone https://github.com/Yashmaini30/CopyGuard
  cd CopyGuard
- Configure variables
  cp terraform.tfvars.example terraform.tfvars
  # Edit terraform.tfvars with your specific values as directed
- Deploy infrastructure
  terraform init
  terraform plan
  terraform apply
- Access the application
  - Frontend: CloudFront distribution URL
  - API: API Gateway endpoint URL
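Once the API Gateway endpoint URL is known, the `/detect` route can also be called from Python. This is a minimal sketch using only the standard library; the function names are hypothetical, while the `x-api-key` header and the `code` field follow the request format documented in this README.

```python
import json
import urllib.request


def build_detect_request(endpoint: str, api_key: str, code: str) -> urllib.request.Request:
    """Build a POST request for the /detect route without sending it."""
    payload = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/detect",
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def detect(endpoint: str, api_key: str, code: str, timeout: float = 30.0) -> dict:
    """Send the snippet for analysis and return the decoded JSON response."""
    request = build_detect_request(endpoint, api_key, code)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The 30-second default mirrors the maximum timeout stated in this README; a shorter value may suit interactive clients.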
curl -X POST https://your-api-endpoint/detect \
-H "Content-Type: application/json" \
-H "x-api-key: your-secret-key" \
-d '{
"code": "def fibonacci(n): return n if n <= 1 else fibonacci(n-1) + fibonacci(n-2)"
}'
Response:
{
"result": {
"label": "Human-written",
"confidence": 85,
"raw": "This code appears to be human-written with 85% confidence..."
},
"s3_key": "results/2024-01-15T10:30:00.000Z_abc123.json"
}
- Lambda: ~$0.20 (compute time)
- API Gateway: ~$3.50 (per million requests)
- Bedrock: ~$15.00 (Claude v2 model inference)
- S3: ~$0.05 (storage and requests)
- CloudWatch: ~$2.00 (logs and metrics)
- CloudFront: ~$1.00 (data transfer)
- Total: ~$22/month for 1K requests
- 10K requests: ~$180/month
- 100K requests: ~$1,650/month
- Cost per request: ~$0.016 (at 100K requests/month)
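The arithmetic behind these estimates can be checked with a short sketch. All dollar figures are copied from the list above; the variable and function names are illustrative only.

```python
# Per-service monthly estimates from this README (1K requests tier).
MONTHLY_COST_1K = {
    "Lambda": 0.20,
    "API Gateway": 3.50,
    "Bedrock": 15.00,
    "S3": 0.05,
    "CloudWatch": 2.00,
    "CloudFront": 1.00,
}

# Monthly totals quoted for each request volume.
TIER_TOTALS = {1_000: 22.0, 10_000: 180.0, 100_000: 1650.0}


def total_monthly_cost(costs: dict) -> float:
    """Sum the per-service estimates, rounded to cents."""
    return round(sum(costs.values()), 2)


def cost_per_request(monthly_total: float, requests: int) -> float:
    """Average cost of a single request at a given monthly volume."""
    return monthly_total / requests


if __name__ == "__main__":
    print(f"Sum of services: ${total_monthly_cost(MONTHLY_COST_1K):.2f}")
    for requests, total in TIER_TOTALS.items():
        print(f"{requests:>7} requests: ${cost_per_request(total, requests):.4f} per request")
```

The per-service figures add up to $21.75, which rounds to the quoted ~$22. The per-request cost falls from about $0.022 at 1K requests to about $0.0165 at 100K requests.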
- API key authentication
- CORS configuration
- Rate limiting capabilities
- IAM roles with least privilege
- S3 bucket policies
- VPC endpoints (optional)
- CloudTrail logging
- Encrypted data in transit
- S3 server-side encryption
- No sensitive data in logs
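One way the API key check could be implemented inside the Lambda handler is sketched below. It is an assumption that the key is validated in application code (it could equally be enforced by an authorizer), and the helper names are hypothetical.

```python
import hmac
import json


def is_authorized(headers: dict, expected_key: str) -> bool:
    """Compare the x-api-key header with the expected key in constant time."""
    if not expected_key:
        # Refuse all requests if no key is configured, rather than matching an empty header.
        return False
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in (headers or {}).items()}
    supplied = normalised.get("x-api-key", "")
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))


def unauthorized_response() -> dict:
    """Lambda proxy-style response for a rejected request."""
    return {
        "statusCode": 401,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"error": "Unauthorized"}),
    }
```

`hmac.compare_digest` avoids leaking how many leading characters of the key matched through response timing. The key itself is never logged here, in line with the "no sensitive data in logs" item.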
- Average latency: <2 seconds
- P95 latency: <5 seconds
- Timeout: 30 seconds maximum
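To check measured latencies against the average and P95 targets above, something like the following sketch could be used. It applies the nearest-rank percentile definition, which is an assumption; CloudWatch percentile statistics may be computed differently.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(latencies_ms: list, avg_target_ms: float = 2000, p95_target_ms: float = 5000) -> bool:
    """Return True when both the average and the P95 latency are under target."""
    average = sum(latencies_ms) / len(latencies_ms)
    return average < avg_target_ms and percentile(latencies_ms, 95) < p95_target_ms
```

The defaults correspond to the stated targets of under 2 seconds on average and under 5 seconds at P95, expressed in milliseconds to match the `LatencyMs` metric.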
- Connection pooling for AWS services
- Efficient regex patterns
- Minimal cold start impact
- Optimized Bedrock model parameters
- Multi-model Support: GPT-4, Llama 2, Claude 3
- Batch Processing: Analyze multiple files
- CI/CD Pipeline: GitHub Actions deployment
- Advanced Analytics: ML-powered insights
- Rate Limiting: API throttling implementation
- Caching Layer: Redis for frequent requests
- User Authentication: AWS Cognito integration
- Usage Analytics: Detailed reporting dashboard
- API Versioning: Backward compatibility
- Webhook Support: Real-time notifications
├── lambda/
│   ├── dependency/            # Lambda dependencies
│   ├── code_detector.zip      # Packaged Lambda function
│   ├── handler.py             # Lambda function code
│   └── requirements.txt       # Python dependencies
│
├── .gitignore                 # Git ignore rules
├── architecture.html          # Architecture documentation
├── index.html                 # Frontend web interface
├── main.tf                    # Terraform main configuration
├── outputs.tf                 # Terraform output values
├── README.md                  # This file
├── script.js                  # Frontend JavaScript logic
├── styles.css                 # Frontend styling
├── terraform.tfvars           # Terraform variables (local config)
├── terraform.tfvars.example   # Example terraform variables
└── variables.tf               # Terraform input variables
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
Yash Maini - mainiyash2@.com
Project Link: https://github.com/Yashmaini30/CopyGuard
⭐ Star this repository if you found it helpful!
