[TRTLLM Python backend] Fix the output format for client side batching in dynamic batch #1718

Merged
1 commit merged into deepjavalibrary:master on Apr 3, 2024

Conversation

@sindhuvahinis commented Apr 2, 2024
Contributor

Description

If the user sends a client-side batch (a list of inputs), the result is a JSON array. If the user sends a single input, we send back a JSON object.

Examples:
Request: "inputs":["The new movie that got Oscar this year"]
Response:

[
  {
    "generated_text":"The new movie that got Oscar this year is The Shape of Water."
  }
]

Request: "inputs":"The new movie that got Oscar this year"
Response:

{
  "generated_text":"The new movie that got Oscar this year is The Shape of Water."
}
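
The rule above can be illustrated with a minimal sketch. This is not the code from this PR: `format_output` and its arguments are hypothetical names chosen for illustration, and the sketch only assumes that the handler knows whether `inputs` arrived as a list or as a single string.

```python
import json
from typing import List, Union


def format_output(inputs: Union[str, List[str]], generated: List[str]) -> str:
    """Hypothetical helper (not the djl-serving implementation).

    Serializes generated texts so that the response shape mirrors the
    request shape: list in -> JSON array out, string in -> JSON object out.
    """
    results = [{"generated_text": text} for text in generated]

    # Client-side batch: the caller sent a list, so always answer with an
    # array, even when the list holds a single prompt.
    if isinstance(inputs, list):
        return json.dumps(results)

    # Single input: exactly one result is expected; return it unwrapped.
    if len(results) != 1:
        raise ValueError("expected exactly one result for a single input")
    return json.dumps(results[0])


if __name__ == "__main__":
    prompt = "The new movie that got Oscar this year"
    completion = prompt + " is The Shape of Water."
    print(format_output([prompt], [completion]))  # JSON array
    print(format_output(prompt, [completion]))    # JSON object
```

Keying the decision on the type of `inputs` rather than on the number of results keeps a one-element client-side batch distinguishable from a single input.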

@sindhuvahinis requested review from zachgk, frankfliu and a team as code owners April 2, 2024 00:28
@lanking520
Contributor

Do we want to support client-side batching? Maybe server-side batching alone is good enough.

@sindhuvahinis
Contributor Author

> do we want to support client side batching? Maybe just server side is good enough

We already support it. This PR only addresses the output format. Our CI also tests client-side batching, and today's CI would fail because the format is wrong.

@sindhuvahinis merged commit 0644638 into deepjavalibrary:master Apr 3, 2024
8 checks passed
sindhuvahinis added a commit to sindhuvahinis/djl-serving that referenced this pull request Apr 4, 2024
@sindhuvahinis deleted the tests branch April 4, 2024 17:06
3 participants