Contributor

@Venkat2811 Venkat2811 commented Jun 19, 2025

Pull Request Description

  • metrics server support for gateway plugins
  • prometheus service annotation & service monitor option
  • sample dashboard

Related Issues

As discussed in #1112 (comment)

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @Venkat2811, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of the AIBrix gateway plugins by integrating a new Prometheus metrics server directly into the plugin. It includes the necessary Kubernetes configurations for Prometheus to scrape these metrics and provides a new Grafana dashboard for visualizing critical performance indicators. These changes aim to provide deeper insights into the gateway's behavior and performance, especially concerning its routing algorithms.

Highlights

  • Metrics Server Integration: A dedicated HTTP metrics server, utilizing the Prometheus client library, has been embedded within the gateway plugin. This server exposes a /metrics endpoint, enabling the collection of internal plugin metrics.
  • Kubernetes Observability Configuration: The Kubernetes Service and Deployment configurations for the gateway-plugins have been updated to expose the new metrics port (8080). Additionally, Prometheus ServiceMonitor resources have been added to automatically discover and scrape metrics from the gateway plugin.
  • Grafana Dashboard for Gateway Plugins: A new Grafana dashboard has been introduced, providing comprehensive visualizations for key performance indicators related to the gateway plugin's routing algorithms. This includes metrics such as VTC (Virtual Token Counter) bucket size, its rate of change, and Time-to-First-Token (TTFT) latency by model, facilitating performance analysis and tuning.
  • Enhanced Development Workflow: New Makefile targets (dev-install-in-kind, dev-uninstall-from-kind, dev-port-forward, dev-stop-port-forward) have been added. These targets streamline the setup, teardown, and local access to a Kind Kubernetes cluster, including the deployment of Prometheus and Grafana for a complete observability stack.
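
To make the first highlight concrete, here is a minimal sketch of what a /metrics endpoint returns when Prometheus scrapes it. It is not the PR's implementation: the PR uses the Prometheus client library, while this sketch renders the text exposition format by hand with the Python standard library only. The metric name `vtc_bucket_size`, the `model` label, and the helper names are hypothetical, chosen for illustration; port 8080 is the metrics port mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical gauge values keyed by (metric name, label pairs).
# The real plugin's metric names and labels may differ.
GAUGES = {
    ("vtc_bucket_size", (("model", "demo-model"),)): 42.0,
}


def render_metrics(gauges):
    """Render gauges in the Prometheus text exposition format.

    Label values are written as-is; a real exporter must also escape
    backslashes, double quotes, and newlines in them.
    """
    lines = []
    seen = set()
    for (name, labels), value in sorted(gauges.items()):
        if name not in seen:
            # One TYPE line per metric family, before its first sample.
            lines.append(f"# TYPE {name} gauge")
            seen.add(name)
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        selector = f"{{{label_text}}}" if label_text else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current gauge values on GET /metrics, 404 elsewhere."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(GAUGES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_metrics_server(port=8080):
    """Build (but do not start) an HTTP server exposing /metrics."""
    return HTTPServer(("", port), MetricsHandler)
```

Calling `make_metrics_server().serve_forever()` would start the server, after which `curl localhost:8080/metrics` should return the rendered gauges. In practice a client library handles the rendering, escaping, and concurrency that this sketch leaves out.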

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces metrics server support for gateway plugins, a Grafana dashboard, and enhances the development workflow with new Makefile targets for Kind cluster setup. Key points include updating copyright years, improving error messages, and enhancing logging consistency.

@Venkat2811 Venkat2811 force-pushed the gw_plugin_ext_metric branch from 1012118 to 6ef1a9f Compare June 21, 2025 22:40
@Venkat2811 Venkat2811 marked this pull request as ready for review June 21, 2025 23:44
@Jeffwan Jeffwan force-pushed the gw_plugin_ext_metric branch from 6ef1a9f to c7978aa Compare June 23, 2025 01:47
@Jeffwan
Collaborator

Jeffwan commented Jun 23, 2025

The change looks good to me. @varungup90 Please take another look at the metric server for plugin.

Signed-off-by: Venkat Raman <vraman2811@gmail.com>
… observability/monitor

Signed-off-by: Venkat Raman <vraman2811@gmail.com>
Signed-off-by: Venkat Raman <vraman2811@gmail.com>
@Jeffwan Jeffwan force-pushed the gw_plugin_ext_metric branch from c7978aa to c94e700 Compare June 24, 2025 00:30
@varungup90 varungup90 merged commit 80dcc99 into vllm-project:main Jun 24, 2025
14 checks passed
ModiCodeCraftsman pushed a commit to ModiCodeCraftsman/aibrix that referenced this pull request Jun 25, 2025
…lm-project#1211)

Signed-off-by: Venkat Raman <vraman2811@gmail.com>
Signed-off-by: Modi Tamam <modi.tamam@gmail.com>
Yaegaki1Erika pushed a commit to Yaegaki1Erika/aibrix that referenced this pull request Jul 23, 2025