Support denvr endpoints with Litellm. #2085

Open
wants to merge 7 commits into
base: main
from

Conversation

srinarayan-srikanthan
Collaborator

Description

Adds support for remote inference through a denvr endpoint for ChatQnA, along with README updates.

Issues

#2084

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

N/A

Tests

Describe the tests that you ran to verify your changes.

Ubuntu added 3 commits June 20, 2025 02:52
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Copilot AI review requested due to automatic review settings June 20, 2025 03:43

github-actions bot commented Jun 20, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None

Copilot

This comment was marked as outdated.

pre-commit-ci bot and others added 3 commits June 20, 2025 03:44
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Contributor

Copilot AI left a comment

Pull Request Overview

Adds support for deploying ChatQnA with remote denvr inference endpoints and updates the streaming response parser for multi-chunk JSON outputs.

  • Introduce a new compose_remote.yaml workflow and environment variable instructions in the Xeon CPU Docker README.
  • Update align_generator in chatqna.py to split and process multiple JSON chunks per line.

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| ChatQnA/docker_compose/intel/cpu/xeon/README.md | Added remote endpoint deployment steps and updated compose table |
| ChatQnA/chatqna.py | Refactored align_generator to handle multi-chunk streaming JSON |

Comments suppressed due to low confidence (3)

ChatQnA/docker_compose/intel/cpu/xeon/README.md:78

  • [nitpick] Clarify whether REMOTE_ENDPOINT should include the /v1/chat/completions path or just the base URL to avoid confusion.
**Note**: Set REMOTE_ENDPOINT variable value to "https://api.inference.denvrdata.com" when the remote endpoint to access is "https://api.inference.denvrdata.com/v1/chat/completions"
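
One way to read the note above is that REMOTE_ENDPOINT holds only the base URL and the `/v1/chat/completions` path is appended later. The sketch below illustrates that reading. `build_chat_url` is a hypothetical helper and is not part of this PR; how the ChatQnA service actually joins the path is an assumption here.

```python
import os

# Path taken from the README note quoted above.
CHAT_COMPLETIONS_PATH = "/v1/chat/completions"


def build_chat_url(base_url: str) -> str:
    """Hypothetical helper: return the full chat completions URL.

    Accepts either the base URL or a URL that already ends with the
    chat completions path, so both spellings produce the same result.
    """
    base = base_url.rstrip("/")
    if base.endswith(CHAT_COMPLETIONS_PATH):
        return base
    return base + CHAT_COMPLETIONS_PATH


# The default value mirrors the example in the README note.
remote_endpoint = os.environ.get(
    "REMOTE_ENDPOINT", "https://api.inference.denvrdata.com"
)
print(build_chat_url(remote_endpoint))
```

Tolerating both forms is one possible design choice; the README could instead state that only the base URL is accepted.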

ChatQnA/chatqna.py:178

  • [nitpick] The outer variable line is reused for the inner loop below, which can reduce readability; consider renaming the loop variable to chunk or similar.
        chunks = [chunk.strip() for chunk in line.split("\n\n") if chunk.strip()]
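
The renaming suggested above could look roughly like the following sketch. Only the list comprehension comes from the PR; `split_sse_chunks` and the sample input are hypothetical names and data added for illustration.

```python
from typing import List


def split_sse_chunks(line: str) -> List[str]:
    """Split one streamed line into its non-empty, trimmed chunks.

    Chunks are assumed to be separated by a blank line ("\n\n"),
    as in the comprehension quoted from chatqna.py.
    """
    return [chunk.strip() for chunk in line.split("\n\n") if chunk.strip()]


# Hypothetical input: two events delivered in a single line.
line = 'data: {"a": 1}\n\ndata: {"a": 2}\n\n'

# The inner loop uses `chunk`, so the outer `line` is never shadowed.
for chunk in split_sse_chunks(line):
    print(chunk)
```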

ChatQnA/chatqna.py:191

  • The previous finish_reason check was removed, which may cause tokens to be emitted after the stream should end; consider re-adding or documenting this behavior change.
                elif "content" in json_data["choices"][0]["delta"]:
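
A minimal sketch of how a finish_reason check could be kept alongside the content check is shown below. It assumes OpenAI-style streaming chunks (`choices[0].delta.content` and `choices[0].finish_reason`) and a `data:` prefix on each chunk. `iter_content` is a hypothetical function and does not reproduce the PR's align_generator.

```python
import json
from typing import Iterable, Iterator


def iter_content(chunks: Iterable[str]) -> Iterator[str]:
    """Yield content tokens and stop once a finish_reason is reported."""
    for chunk in chunks:
        # Drop the assumed "data:" prefix, if present.
        payload = chunk[len("data:"):].strip() if chunk.startswith("data:") else chunk
        if payload == "[DONE]":
            break
        try:
            json_data = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip chunks that are not valid JSON
        choices = json_data.get("choices") or []
        if not choices:
            continue
        choice = choices[0]
        content = (choice.get("delta") or {}).get("content")
        if content:
            yield content
        # A non-null finish_reason marks the end of the stream,
        # so nothing after this chunk is emitted.
        if choice.get("finish_reason") is not None:
            break


# Hypothetical stream: the third chunk arrives after the stream has finished.
sample = [
    'data: {"choices": [{"delta": {"content": "Hi"}, "finish_reason": null}]}',
    'data: {"choices": [{"delta": {"content": "!"}, "finish_reason": "stop"}]}',
    'data: {"choices": [{"delta": {"content": "late"}, "finish_reason": null}]}',
]
print(list(iter_content(sample)))
```

Whether the final chunk's content should be emitted before stopping depends on the endpoint's behavior; this sketch emits it and then stops.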

Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
1 participant