
Client-side prompt token count is inaccurate #75

Closed
@sjmonson

Description


The client-side tokenization in guidellm fails to account for the extra tokens added by the server's chat prompt template. There are two possible workarounds:

  1. Enable usage metrics in each request and let the server tell us how many prompt tokens there are.
  2. Use the /completions endpoint rather than /chat/completions, since the chat template is not applied on the /completions endpoint.
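For option 1, a minimal sketch of what this could look like, assuming an OpenAI-compatible server that returns a `usage` object in its `/v1/chat/completions` responses (the helper name `server_prompt_tokens` and the sample payload are illustrative, not part of guidellm):

```python
# Sketch: trust the server-reported prompt token count from the "usage"
# field of an OpenAI-compatible chat completions response, rather than
# tokenizing the prompt client-side (which misses chat-template tokens).

def server_prompt_tokens(response: dict) -> int:
    """Return the prompt token count as reported by the server."""
    return response["usage"]["prompt_tokens"]

# Example response fragment; the shape follows the OpenAI API schema.
sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "usage": {"prompt_tokens": 27, "completion_tokens": 2, "total_tokens": 29},
}

print(server_prompt_tokens(sample_response))
```

Because the server tokenizes the fully templated prompt, this count includes the special/role tokens the chat template inserts, which client-side tokenization of the raw messages would miss.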
