[Enhancement] Add metadata tracking system with latency percentiles and node costs #59
Summary
This PR introduces a comprehensive metadata tracking system for SyGra that automatically captures execution metrics, token usage, costs, and performance data across all LLM calls and workflow executions. The system provides detailed latency statistics (including percentiles), per-node cost tracking, and multi-level metrics aggregation, requiring zero changes to existing code.
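As a rough illustration of the latency statistics mentioned above, the sketch below computes the same keys that appear in the metadata file (min, max, mean, median, std_dev, p50, p95, p99) from a list of request latencies. It is not taken from the PR: the function names are hypothetical, and the choice of linear-interpolation percentiles and sample standard deviation is an assumption, since the description does not say which definitions SyGra uses.

```python
import math
import statistics


def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list (pct in 0..100)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    if lower == upper:
        return sorted_values[lower]
    weight = rank - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def latency_statistics(latencies):
    """Summarize latencies (seconds) under the keys shown in the metadata file."""
    if not latencies:
        raise ValueError("need at least one latency")
    values = sorted(latencies)
    return {
        "min": values[0],
        "max": values[-1],
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation, which needs two or more points.
        "std_dev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
    }


stats = latency_statistics([2.0, 3.0, 4.0, 5.0, 6.0])
```

With this definition p50 always equals the median, which matches the sample metadata file, where both are reported as 3.150.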
Explain the features implemented:
1. Centralized Metadata Collection System
   - MetadataCollector for tracking all execution metrics
2. Latency Statistics
3. Per-Node Cost Tracking
   - calculate_cost() method
   - total_cost_usd and average_cost_per_execution per node
4. Cost Tracking with LangChain Community Integration (langchain-community)
5. Automatic Tracking Infrastructure
   - @track_model_request decorator for custom model wrappers
   - MetadataTrackingCallback for LangChain agent LLM calls
   - BaseNode for consistent tracking across all node types
6. Comprehensive Metrics Tracking
7. Timestamp Synchronization
   - output_2025-10-30_18-19-07.json -> metadata_..._2025-10-30_18-19-07.json
8. Toggle Support
   - --disable_metadata CLI flag
   - collector.set_enabled(False)
9. Supported Models
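Items 5 and 8 above describe decorator-based tracking and a toggle. The sketch below shows one way such a pairing could work in general: a decorator times each call and reports it to a collector, and the collector skips recording when disabled. Everything here is hypothetical (`SketchCollector`, `track_request`, `fake_model_call`); it only mirrors the shape of `collector.set_enabled(False)` from the description and is not SyGra's `MetadataCollector` or `@track_model_request`.

```python
import functools
import time


class SketchCollector:
    """Hypothetical stand-in for a metadata collector; not SyGra's class."""

    def __init__(self):
        self.enabled = True
        self.records = []

    def set_enabled(self, enabled):
        self.enabled = enabled

    def record(self, name, latency_seconds, success):
        # Recording is a no-op while tracking is disabled.
        if self.enabled:
            self.records.append(
                {"name": name, "latency_seconds": latency_seconds, "success": success}
            )


collector = SketchCollector()


def track_request(func):
    """Hypothetical decorator: time each call and report it to the collector."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not collector.enabled:
            # Skip timing entirely when tracking is switched off.
            return func(*args, **kwargs)
        start = time.perf_counter()
        success = False
        try:
            result = func(*args, **kwargs)
            success = True
            return result
        finally:
            # Runs on both success and failure, so failed calls are counted too.
            collector.record(func.__name__, time.perf_counter() - start, success)

    return wrapper


@track_request
def fake_model_call(prompt):
    # Placeholder for a real model request.
    return prompt.upper()


fake_model_call("hello")        # recorded
collector.set_enabled(False)
fake_model_call("ignored")      # not recorded while tracking is disabled
collector.set_enabled(True)
```

Because the wrapper leaves the decorated function's signature and return value untouched, callers need no changes, which is the property the summary describes as requiring zero changes to existing code.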
How to Test the feature
Test 1: Library Usage with Latency Statistics
Expected Result:
test/output_YYYY-MM-DD_HH-MM-SS.json
test/metadata/metadata_test_metadata_YYYY-MM-DD_HH-MM-SS.json
Test 2: CLI Usage
Expected Result:
tasks/examples/glaive_code_assistant/metadata/
Expected Result:
Screenshots (if applicable)
Metadata File Structure
{
  "metadata_version": "1.0.0",
  "generated_at": "2025-11-05T21:57:10.123456",
  "execution": {
    "task_name": "tasks.examples.glaive_code_assistant",
    "timing": {
      "start_time": "2025-11-05T21:57:07.899389",
      "end_time": "2025-11-05T21:57:10.657968",
      "duration_seconds": 2.759
    },
    "environment": {
      "python_version": "3.11.12",
      "sygra_version": "1.0.0"
    },
    "git": {
      "commit_hash": "139a535...",
      "branch": "scratch/metadata",
      "is_dirty": false
    }
  },
  "aggregate_statistics": {
    "tokens": {
      "total_prompt_tokens": 440,
      "total_completion_tokens": 920,
      "total_tokens": 1360
    },
    "cost": {
      "total_cost_usd": 0.00062,
      "average_cost_per_record": 0.000062
    },
    "requests": {
      "total_requests": 20,
      "total_failures": 0,
      "failure_rate": 0.0
    }
  },
  "models": {
    "gpt-4o-mini": {
      "model_type": "OpenAI",
      "performance": {
        "average_latency_seconds": 3.203,
        "tokens_per_second": 21.23,
        "latency_statistics": {
          "min": 2.105,
          "max": 4.821,
          "mean": 3.203,
          "median": 3.150,
          "std_dev": 0.652,
          "p50": 3.150,
          "p95": 4.512,
          "p99": 4.759
        }
      },
      "cost": {
        "total_cost_usd": 0.00062,
        "average_cost_per_request": 0.000031
      }
    }
  },
  "nodes": {
    "summarizer": {
      "node_name": "summarizer",
      "node_type": "llm",
      "model_name": "gpt-4o-mini",
      "total_executions": 10,
      "latency_statistics": {
        "min": 2.105,
        "max": 4.821,
        "mean": 3.203,
        "median": 3.150,
        "std_dev": 0.652,
        "p50": 3.150,
        "p95": 4.512,
        "p99": 4.759
      },
      "cost": {
        "total_cost_usd": 0.00031,
        "average_cost_per_execution": 0.000031
      },
      "token_statistics": {
        "total_prompt_tokens": 220,
        "total_completion_tokens": 460,
        "total_tokens": 680
      }
    }
  }
}
Checklist
Breaking Changes
None. This is a purely additive feature with full backward compatibility. The feature works automatically with all existing code.