Commit 56f5e43

committed: missed new files
1 parent 3760498 commit 56f5e43

5 files changed: +561 -0 lines changed

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@

# OpenTelemetry Instrumentation

This Lambda function is instrumented with OpenTelemetry to trace performance and identify heavy operations.

## What's Instrumented

The following operations are traced:

### 1. **Initialization (Cold Start)**
- `judge_initialization` - Total time to initialize the Judge
- `judge_init` - Judge class initialization
- `load_settings` - Loading settings
- `get_llm` - Loading the LLM model
- `import_google_genai` - Importing the Google GenAI library
- `instantiate_google_llm` - Creating the LLM instance
- `create_evaluator` - Creating the evaluator
- `get_embed_model` - Loading the embedding model
- `import_google_genai_embedding` - Importing the embedding library
- `instantiate_google_embedding` - Creating the embedding instance
- `instantiate_evaluator` - Creating the evaluator instance
- `evaluator_init` - Evaluator initialization
- `wrap_llm` - Wrapping the LLM for Ragas
- `wrap_embedder` - Wrapping the embedder for Ragas
- `init_metrics` - Initializing evaluation metrics
- `init_response_relevancy`
- `init_context_precision`
- `init_faithfulness`

### 2. **Request Processing**
- `lambda_handler` - Main handler execution
- `process_record_{idx}` - Processing each SQS record
- `judge_evaluate` - High-level evaluation call
- `judge_evaluate_method` - Evaluation logic
- `condense_query` - Query condensation (if messages are present)
- `messages_to_chathistory` - Converting messages to chat history
- `llm_acomplete` - LLM completion call
- `evaluator_evaluate` - Running the evaluation
- `evaluator_evaluate_method` - Evaluation logic
- `create_sample` - Creating the evaluation sample
- `ragas_evaluate` - Running Ragas metrics
- `process_scores` - Processing results
- `add_langfuse_scores` - Sending scores to Langfuse

## Configuration

Set environment variables to control trace export:

### Console Output (Development)

```bash
export OTEL_TRACES_EXPORTER=console
export OTEL_SERVICE_NAME=chatbot-evaluate
```

### OTLP Exporter (Production)

For AWS X-Ray, Jaeger, or other OTLP-compatible backends:

```bash
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4317
export OTEL_SERVICE_NAME=chatbot-evaluate
export OTEL_SERVICE_VERSION=0.1.0
```

### Disable Tracing

```bash
export OTEL_TRACES_EXPORTER=none
```

## Viewing Traces

### Console Output

When using the `console` exporter, traces appear in CloudWatch Logs. Look for JSON output like:

```json
{
  "name": "judge_initialization",
  "context": {...},
  "kind": "SpanKind.INTERNAL",
  "parent_id": null,
  "start_time": "...",
  "end_time": "...",
  "attributes": {...}
}
```

### AWS X-Ray Integration

To send traces to AWS X-Ray, you can use the AWS Distro for OpenTelemetry (ADOT) Lambda layer:

1. Add the ADOT Lambda layer to your function
2. Set environment variables:

```bash
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
AWS_LAMBDA_EXEC_WRAPPER=/opt/otel-instrument
```

### Other Backends

For Jaeger, Zipkin, Honeycomb, etc., set the appropriate OTLP endpoint.

## Analyzing Performance

### Key Metrics to Monitor

1. **Cold Start Time** - `judge_initialization` span
   - Check which component takes longest:
     - LLM model loading (`get_llm`)
     - Embedding model loading (`get_embed_model`)
     - Metric initialization (`init_metrics`)

2. **Request Processing** - `lambda_handler` span
   - Per-record processing time
   - LLM calls (`llm_acomplete`, `ragas_evaluate`)
   - External service calls (Langfuse)

3. **Library Import Time** - Look for `import_*` spans
   - `import_google_genai`
   - `import_google_genai_embedding`

### Example Analysis

If you see high cold start times:

- Check the `import_google_genai` and `import_google_genai_embedding` spans - these imports can be slow
- Check `instantiate_google_llm` and `instantiate_google_embedding` - model initialization may be heavy
- Check `init_metrics` - Ragas metric initialization can take time

If you see high request processing times:

- Check `llm_acomplete` - LLM API calls
- Check `ragas_evaluate` - evaluation can make multiple LLM calls
- Check `add_langfuse_scores` - network calls to an external service

## Span Attributes

Each span includes contextual attributes:

- **lambda_handler**: `event.records_count`
- **process_record**: `trace_id`, `has_messages`, `contexts_count`
- **judge_evaluate_method**: `trace_id`, `has_messages`, `contexts_count`
- **evaluator_evaluate_method**: `contexts_count`, `query_length`, `response_length`
- **get_llm**: `provider`, `model_id`
- **get_embed_model**: `provider`, `model_id`

Use these attributes to correlate performance with request characteristics.

## Troubleshooting

### No traces appearing

1. Check that `OTEL_TRACES_EXPORTER` is set correctly
2. Check CloudWatch Logs for any OpenTelemetry errors
3. Verify that spans are being created (add debug logging)

### High overhead

1. Consider sampling in production (configure the TracerProvider with a sampler)
2. Use `BatchSpanProcessor` (already configured) instead of `SimpleSpanProcessor`
3. Reduce instrumentation granularity if needed

### Missing spans

1. Ensure all code paths create spans
2. Check for exceptions that might prevent span completion
3. Verify async operations are properly traced
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
================================================================================
OpenTelemetry Instrumentation - Summary
================================================================================

✅ COMPLETED: Your AWS Lambda function is now fully instrumented with OpenTelemetry

📊 WHAT WAS INSTRUMENTED:

1. Cold Start Initialization (CRITICAL for Lambda performance)
   - Judge class instantiation
   - LLM model loading (Google GenAI)
   - Embedding model loading
   - Ragas evaluation metrics initialization
   - Library imports (often the slowest part!)

2. Request Processing
   - SQS record handling
   - Query evaluation
   - LLM API calls
   - Ragas metric execution
   - Langfuse score reporting

πŸ“ FILES MODIFIED:
24+
25+
src/lambda_function.py ✏️ Added OpenTelemetry setup and handler tracing
26+
src/modules/judge.py ✏️ Added Judge class and evaluation tracing
27+
src/modules/models.py ✏️ Added model loading tracing
28+
src/modules/evaluator.py ✏️ Added Evaluator class tracing
29+
30+
πŸ“ FILES CREATED:
31+
32+
src/otel_config.py ✨ OpenTelemetry configuration
33+
OTEL_INSTRUMENTATION.md ✨ Comprehensive documentation
34+
QUICKSTART_OTEL.md ✨ Quick start guide
35+
test_otel.py ✨ Test script
36+
OTEL_SUMMARY.txt ✨ This file
37+
38+
🎯 KEY TRACES TO MONITOR:

Cold Start Bottlenecks:
├─ import_google_genai            ⏱️ Library import time
├─ import_google_genai_embedding  ⏱️ Embedding library import
├─ instantiate_google_llm         ⏱️ LLM initialization
├─ instantiate_google_embedding   ⏱️ Embedding initialization
└─ init_metrics                   ⏱️ Ragas metrics setup

Request Processing Bottlenecks:
├─ llm_acomplete        ⏱️ LLM API calls
├─ ragas_evaluate       ⏱️ Evaluation execution
└─ add_langfuse_scores  ⏱️ External service calls

🚀 HOW TO USE:

1. Local Testing:
   $ export OTEL_TRACES_EXPORTER=console
   $ python3 test_otel.py

2. Lambda Deployment:
   Set environment variables:
   - OTEL_TRACES_EXPORTER=console
   - OTEL_SERVICE_NAME=chatbot-evaluate

3. AWS X-Ray Integration:
   - Add the ADOT Lambda Layer
   - Set OTEL_TRACES_EXPORTER=otlp
   - Set OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317

📈 EXPECTED INSIGHTS:

You will now be able to see:
✓ Exact time spent in each initialization step
✓ Which library imports are slowest
✓ LLM API call latencies
✓ Evaluation metric execution time
✓ Total cold start vs warm start performance

💡 OPTIMIZATION RECOMMENDATIONS:

Based on traces, you can:
• Use Lambda Provisioned Concurrency for critical workloads
• Move slow imports to Lambda Layers
• Cache model instances if reinitializing
• Increase Lambda memory allocation (more CPU = faster init)
• Lazy-load heavy libraries only when needed

📚 DOCUMENTATION:

- QUICKSTART_OTEL.md      → Quick start and examples
- OTEL_INSTRUMENTATION.md → Comprehensive documentation
- test_otel.py            → Working test example

================================================================================