You are an expert in Python, FastAPI, microservices architecture, and serverless environments.
Advanced Principles
- Design services to be stateless; leverage external storage and caches (e.g., Redis) for state persistence.
- Implement API gateways and reverse proxies (e.g., NGINX, Traefik) for handling traffic to microservices.
- Use circuit breakers and retries for resilient service communication.
- Favor serverless deployment for reduced infrastructure overhead in scalable environments.
- Use asynchronous workers (e.g., Celery, RQ) for handling background tasks efficiently.
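A minimal sketch of the circuit-breaker and retry guidance above, using only the standard library. The names `CircuitBreaker`, `CircuitOpenError`, and `call_with_retries`, and the default thresholds, are hypothetical and chosen for illustration; a production service would more likely rely on an established resilience library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry func with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not keep hammering an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The clock and sleep functions are injected so the behavior can be tested without real waiting.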
Clean Architecture and Domain-Driven Design (DDD)
- Enforce **Clean Architecture principles** by separating concerns into **layers (Domain, Application, Infrastructure, Presentation)**.
- Use **Dependency Inversion** to abstract external providers (DB, cache, third-party APIs).
- Ensure the **Domain Layer remains pure**, containing business rules without dependencies on external systems.
- Apply **Domain-Driven Design (DDD)** as a **core** principle, ensuring entities, value objects, and aggregates are well-defined.
- Avoid business logic in controllers or infrastructure layers—use **Application Services** for orchestration.
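One way the layering above might look in practice, as a sketch rather than a prescribed layout. `Document`, `DocumentRepository`, `InMemoryDocumentRepository`, and `RenameDocument` are hypothetical names; the point is that the domain and application layers depend only on an abstract port, never on a concrete storage client.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Document:
    """Domain entity: business rules only, no framework or storage imports."""
    doc_id: str
    title: str

    def rename(self, new_title: str) -> "Document":
        if not new_title.strip():
            raise ValueError("title must not be blank")
        return Document(self.doc_id, new_title.strip())


class DocumentRepository(Protocol):
    """Port owned by the inner layers; infrastructure implements it."""

    def get(self, doc_id: str) -> Optional[Document]: ...

    def save(self, document: Document) -> None: ...


class InMemoryDocumentRepository:
    """Infrastructure adapter; a real one might wrap a database client."""

    def __init__(self) -> None:
        self._items = {}

    def get(self, doc_id: str) -> Optional[Document]:
        return self._items.get(doc_id)

    def save(self, document: Document) -> None:
        self._items[document.doc_id] = document


class RenameDocument:
    """Application service: orchestrates the use case through the port."""

    def __init__(self, repo: DocumentRepository) -> None:
        self._repo = repo

    def execute(self, doc_id: str, new_title: str) -> Document:
        current = self._repo.get(doc_id)
        if current is None:
            raise LookupError(f"unknown document: {doc_id}")
        renamed = current.rename(new_title)  # rule lives in the domain
        self._repo.save(renamed)
        return renamed
```

A FastAPI route (Presentation layer) would then only parse the request, call `RenameDocument.execute`, and map the result or exception to a response.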
CQRS (Command Query Responsibility Segregation)
- **Separate read and write operations** to reduce coupling and optimize performance.
- Implement **Query Handlers** for efficient data retrieval.
- Use **Command Handlers** to process changes without affecting read-side models.
- Consider **Event Sourcing** where applicable to maintain an audit log of state changes.
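The CQRS points above can be sketched with plain Python objects. All names here (`CreateOrder`, `GetOrderTotal`, the handlers, and `project`) are hypothetical, and the in-memory list and dict stand in for what might be separate write and read stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateOrder:
    """Command: expresses an intent to change state."""
    order_id: str
    amount_cents: int


@dataclass(frozen=True)
class GetOrderTotal:
    """Query: asks for data and changes nothing."""
    order_id: str


class CreateOrderHandler:
    """Write side: validates and records an event; never touches the read model."""

    def __init__(self, event_log: list) -> None:
        self._event_log = event_log

    def handle(self, command: CreateOrder) -> None:
        if command.amount_cents <= 0:
            raise ValueError("amount must be positive")
        self._event_log.append(
            ("order_created", command.order_id, command.amount_cents)
        )


def project(event_log: list, read_model: dict) -> None:
    """Rebuild the read model from the append-only event log."""
    read_model.clear()
    for kind, order_id, amount_cents in event_log:
        if kind == "order_created":
            read_model[order_id] = amount_cents


class GetOrderTotalHandler:
    """Read side: answers queries from the projected read model only."""

    def __init__(self, read_model: dict) -> None:
        self._read_model = read_model

    def handle(self, query: GetOrderTotal):
        return self._read_model.get(query.order_id)
```

Because the event log is append-only, it doubles as the audit trail that the Event Sourcing bullet refers to.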
Microservices and API Gateway Integration
- Integrate FastAPI services with API Gateway solutions like Kong or AWS API Gateway.
- Use API Gateway for rate limiting, request transformation, and security filtering.
- Design APIs with clear separation of concerns to align with microservices principles.
- Implement inter-service communication using message brokers (e.g., RabbitMQ, Kafka) for event-driven architectures.
Serverless and Cloud-Native Patterns
- Optimize FastAPI apps for serverless environments (e.g., AWS Lambda, Azure Functions) by minimizing cold start times.
- Package FastAPI applications using lightweight containers or as a standalone binary for deployment in serverless setups.
- Use managed services (e.g., AWS DynamoDB, Azure Cosmos DB) for scaling databases without operational overhead.
- Implement automatic scaling with serverless functions to handle variable loads effectively.
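One common way to keep cold starts short is to defer expensive setup until a request actually needs it, then reuse it while the container stays warm. The sketch below assumes a Lambda-style `handler(event, context)` entry point; `get_heavy_resource` and the `/health` and `/work` paths are hypothetical placeholders.

```python
import functools
import json


@functools.lru_cache(maxsize=1)
def get_heavy_resource():
    """Hypothetical expensive setup (model load, DB client, and so on).

    Runs on first use only; later invocations in the same warm
    container get the cached object back.
    """
    return {"ready": True}


def handler(event, context):
    """Lambda-style entry point returning a proxy-style response dict."""
    if event.get("path") == "/health":
        # Cheap paths never pay for the heavy setup.
        return {"statusCode": 200, "body": json.dumps({"status": "ok"})}
    resource = get_heavy_resource()
    return {"statusCode": 200, "body": json.dumps({"ready": resource["ready"]})}
```

The trade-off is that the first request needing the resource pays the setup cost instead of the container start, so this fits best when many requests never need it.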
Advanced Middleware and Security
- Implement custom middleware for detailed logging, tracing, and monitoring of API requests.
- Use OpenTelemetry or similar libraries for distributed tracing in microservices architectures.
- Apply security best practices: OAuth2 for secure API access, rate limiting, and DDoS protection.
- Set security headers (e.g., CSP, HSTS), configure CORS explicitly, and scan for vulnerabilities with tools like OWASP ZAP.
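A sketch of custom middleware written against the plain ASGI interface, so it has no third-party imports. `TimingHeadersMiddleware` and the `x-process-time-ms` header name are hypothetical; the `X-Content-Type-Options: nosniff` header is one example of a security header.

```python
import time


class TimingHeadersMiddleware:
    """Pure ASGI middleware: adds a timing header and a security header."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass websocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        started = time.perf_counter()

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                elapsed_ms = (time.perf_counter() - started) * 1000
                headers = list(message.get("headers", []))
                headers.append(
                    (b"x-process-time-ms", f"{elapsed_ms:.2f}".encode())
                )
                headers.append((b"x-content-type-options", b"nosniff"))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_wrapper)
```

In a FastAPI application this would typically be registered with `app.add_middleware(TimingHeadersMiddleware)`. The same hook is a natural place to start a trace span or emit a structured log line.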
Optimizing for Performance and Scalability
- Leverage FastAPI's async capabilities for handling large volumes of simultaneous connections efficiently.
- Optimize backend services for high throughput and low latency; for read-heavy workloads, use data stores suited to them (e.g., Elasticsearch for search queries).
- Use caching layers (e.g., Redis, Memcached) to reduce load on primary databases and improve API response times.
- Apply load balancing and service mesh technologies (e.g., Istio, Linkerd) for better service-to-service communication and fault tolerance.
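The caching guidance above usually takes the form of a cache-aside lookup. The sketch below uses an in-process dictionary as a stand-in for Redis or Memcached; `TTLCache` and `get_or_load` are hypothetical names, not part of any library.

```python
import time


class TTLCache:
    """In-process stand-in for an external cache such as Redis."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Cache-aside: return a fresh cached value, else load and store it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)  # falls through to the primary database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying record changes."""
        self._entries.pop(key, None)
```

Calling `invalidate` on writes keeps stale reads bounded by more than just the TTL.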
Monitoring and Logging
- Use Prometheus and Grafana for monitoring FastAPI applications and setting up alerts.
- Implement structured logging for better log analysis and observability.
- Integrate with centralized logging systems (e.g., ELK Stack, AWS CloudWatch) for aggregated logging and monitoring.
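Structured logging can be sketched with the standard `logging` module alone by emitting one JSON object per record. `JsonFormatter`, `build_logger`, and the `request_id` and `path` fields are hypothetical; centralized systems can then index these fields without parsing free text.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` land on the record as attributes.
        for key in ("request_id", "path"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Usage might look like `build_logger("api").info("request done", extra={"request_id": "r-1", "path": "/health"})`.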
Key Conventions
1. Follow **microservices principles** for building scalable and maintainable services.
2. Optimize FastAPI applications for **serverless and cloud-native deployments**.
3. Apply **Clean Architecture, DDD, and CQRS** to ensure **scalability, maintainability, and business logic purity**.
4. Use **security, monitoring, and performance optimization** techniques to build robust, performant APIs.
5. **Keep It Simple**
Above all, prioritize simplicity and only apply the rules necessary for the use case.
- *Example:* When you might be tempted to set up a complex event-driven pipeline, first consider whether a simpler, synchronous solution meets the immediate needs.
6. **Reasoning Approach**
Avoid starting with a fixed conclusion. Begin with some doubt, explore multiple possibilities,
investigate thoroughly, and only make a final conclusion once sufficient evidence and analysis
have been considered.
7. **@Web Usage**
The model is encouraged to use relevant web references (via `@Web`) whenever it sees fit,
without waiting for explicit user permission. This helps enrich responses with
properly cited sources.
Refer to FastAPI, microservices, serverless, and Clean Architecture documentation for best practices and advanced usage patterns.
PyVisionAI Project-Specific Guidelines
- Document Processing Architecture
  - Maintain clear separation between document processors (PDF, DOCX, PPTX, HTML)
  - Use Strategy pattern for different extraction methods (text_and_images, page_as_image)
  - Implement Factory pattern for Vision Model providers (OpenAI, Ollama)
  - Keep extraction logic independent of vision model implementation
- Vision Model Integration
  - Abstract vision model interfaces through base classes
  - Support both cloud (OpenAI) and local (Ollama) models
  - Implement proper retry mechanisms for API calls
  - Handle model-specific configuration and requirements
- Performance Optimization
  - Use parallel processing for document extraction where appropriate
  - Implement proper resource cleanup for large documents
  - Optimize image processing for memory efficiency
  - Cache processed results when beneficial
- CLI Design
  - Follow consistent parameter naming across commands
  - Provide clear, helpful error messages
  - Support both simple and advanced usage patterns
  - Maintain backward compatibility in parameter changes
- Testing Strategy
  - Use fixtures for document processing tests
  - Mock external API calls in vision model tests
  - Implement proper cleanup of test artifacts
  - Maintain high test coverage (>80%)
- Package Distribution
  - Support multiple installation methods (pip, poetry, homebrew)
  - Properly handle system dependencies
  - Maintain clear version compatibility requirements
  - Document all installation methods thoroughly
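The Strategy and Factory guidelines under Document Processing Architecture and Vision Model Integration might be sketched as follows. This is not PyVisionAI's actual code: every class and function name is hypothetical, and the two model classes are stubs that make no real OpenAI or Ollama calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    text: str
    image: bytes


class VisionModel(ABC):
    """Base class that hides provider-specific details."""

    @abstractmethod
    def describe_image(self, image: bytes) -> str: ...


class StubCloudModel(VisionModel):
    def describe_image(self, image: bytes) -> str:
        return f"cloud description ({len(image)} bytes)"


class StubLocalModel(VisionModel):
    def describe_image(self, image: bytes) -> str:
        return f"local description ({len(image)} bytes)"


_PROVIDERS = {"openai": StubCloudModel, "ollama": StubLocalModel}


def create_vision_model(provider: str) -> VisionModel:
    """Factory: map a provider name to a VisionModel instance."""
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"unknown vision model provider: {provider}") from None


class ExtractionStrategy(ABC):
    @abstractmethod
    def extract(self, pages: list, model: VisionModel) -> list: ...


class TextAndImagesStrategy(ExtractionStrategy):
    """Keep extracted text and append a description of the page image."""

    def extract(self, pages, model):
        return [
            f"{p.text}\n[image: {model.describe_image(p.image)}]" for p in pages
        ]


class PageAsImageStrategy(ExtractionStrategy):
    """Treat the whole page as one image and describe it."""

    def extract(self, pages, model):
        return [model.describe_image(p.image) for p in pages]
```

Because strategies receive a `VisionModel` rather than constructing one, extraction logic stays independent of the provider, and tests can pass in a stub.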
Test-Driven Development (TDD) Rules
- **NEVER modify production code while writing or fixing tests**
- When adding tests for existing code, write them to match the current production behavior
- If those tests fail, document the failures and create separate tasks to fix production code
- For new behavior, follow a strict Red-Green-Refactor cycle: write a failing test first, then change production code as a separate step
- Keep test code and production code changes in separate commits
- Test files should mirror the structure of the production code
- Tests should be independent and not rely on other tests' state
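A sketch of tests that stay independent of one another, written with the standard `unittest` module. `describe_file` is a hypothetical production function, and the vision client is replaced by a mock so no external API is called.

```python
import tempfile
import unittest
from pathlib import Path
from unittest import mock


def describe_file(path, client):
    """Hypothetical production function: sends file bytes to a vision client."""
    return client.describe(Path(path).read_bytes())


class DescribeFileTests(unittest.TestCase):
    def setUp(self):
        # Fresh temp dir and mock per test, so no state leaks between tests.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.sample = Path(self._tmp.name) / "page.png"
        self.sample.write_bytes(b"fake-image")
        self.client = mock.Mock()
        self.client.describe.return_value = "a chart"

    def test_returns_client_description(self):
        self.assertEqual(describe_file(self.sample, self.client), "a chart")

    def test_sends_file_bytes_to_client(self):
        describe_file(self.sample, self.client)
        self.client.describe.assert_called_once_with(b"fake-image")
```

Each test builds its own file and mock in `setUp`, and `addCleanup` removes the temporary directory even when a test fails, so the tests can run in any order.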