This project provides a FastAPI application that serves as a tracing and monitoring solution for Large Language Models (LLMs) on Google Cloud Platform (GCP). It integrates with OpenTelemetry to collect and export telemetry data, enabling users to monitor LLM performance and behavior effectively. The focus is on tracing LLM requests, responses, and their associated metadata. One FastAPI endpoint showcases basic agentic functionality with LangChain and Vertex AI; the other endpoint uses decorators to trace nested function calls. There is also the option to trace custom LLM APIs using OpenLLMetry.
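To illustrate the decorator approach to tracing nested function calls, here is a minimal standard-library sketch. It is not the project's actual implementation (which records OpenTelemetry spans); the `traced` decorator, the `trace_log` list, and both example functions are hypothetical names introduced for illustration only.

```python
import functools
import time

_depth = 0      # current nesting level of traced calls
trace_log = []  # collected (event, function, depth, ...) tuples

def traced(func):
    """Record entry/exit events and call depth for nested calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global _depth
        trace_log.append(("enter", func.__name__, _depth))
        _depth += 1
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            _depth -= 1
            trace_log.append(("exit", func.__name__, _depth, elapsed))
    return wrapper

@traced
def inner(x):
    return x * 2

@traced
def outer(x):
    # the nested call to inner() is logged one depth level deeper
    return inner(x) + 1

result = outer(3)  # trace_log now holds enter/exit events with depths
```

In a real OpenTelemetry setup the same shape applies, but each wrapper would open a span (e.g. via a tracer's context manager) instead of appending to a list, so nesting shows up as parent/child spans in Cloud Trace.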
The implementation dedicated to OpenLLMetry lives under src/root.py and src/services/, where you can find examples of integrating OpenLLMetry with GCP.
This project is licensed under the MIT License - see the LICENSE file for details.