bug: target_qps hardcoded to 10.0 in Offline mode instead of None #219

@arekay-nv

Description

Problem

In src/inference_endpoint/config/runtime_settings.py:142-148, target_qps falls back to a hardcoded 10.0 for Offline (max_throughput) mode instead of None:

# TODO: target_qps should be None in Offline mode but using 10.0 as fallback
# to avoid breaking changes
target_qps = config.settings.target_qps or 10.0

In Offline/max_throughput mode, target_qps is semantically irrelevant — all queries are issued at t=0 as a burst. Having it default to 10.0 is misleading and can affect any downstream logic that reads this field.

Expected Behavior

target_qps should be None when the load pattern is max_throughput. The fallback workaround should be removed and callers that depend on this field should be updated to handle None.
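A minimal sketch of what the resolved value could look like, assuming the load pattern is available as an enum when the runtime settings are built. `LoadPattern`, its `POISSON` member, and `resolve_target_qps` are hypothetical names for illustration and are not taken from the repository; raising on a missing rate for rate-driven patterns is also an assumption, not confirmed project behavior.

```python
from enum import Enum
from typing import Optional


class LoadPattern(Enum):
    # Hypothetical enum; the real project may name or structure this differently.
    MAX_THROUGHPUT = "max_throughput"
    POISSON = "poisson"


def resolve_target_qps(
    load_pattern: LoadPattern, configured_qps: Optional[float]
) -> Optional[float]:
    """Return the effective target_qps for the given load pattern."""
    if load_pattern is LoadPattern.MAX_THROUGHPUT:
        # Offline mode issues every query at t=0, so a rate has no meaning.
        return None
    if configured_qps is None:
        # Assumption: rate-driven patterns require an explicit value
        # rather than silently falling back to 10.0.
        raise ValueError(f"target_qps is required for {load_pattern.value}")
    return configured_qps
```

With this shape, `resolve_target_qps(LoadPattern.MAX_THROUGHPUT, 25.0)` returns `None` even if a value was configured, which makes the "irrelevant in Offline mode" rule explicit instead of implied.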

Files to Modify

  • src/inference_endpoint/config/runtime_settings.py
  • Any callers that read runtime_settings.target_qps without a None check
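For the callers, a None check might look like the following sketch. `inter_arrival_seconds` is a hypothetical helper, not an existing function in the codebase; it only illustrates treating `None` as "no pacing" instead of dividing by a placeholder rate.

```python
from typing import Optional


def inter_arrival_seconds(target_qps: Optional[float]) -> float:
    """Delay between consecutive queries; 0.0 means issue everything as a burst."""
    if target_qps is None:
        # Offline / max_throughput: no pacing, all queries go out at t=0.
        return 0.0
    if target_qps <= 0:
        raise ValueError("target_qps must be positive when set")
    return 1.0 / target_qps
```

Callers that previously read `runtime_settings.target_qps` directly would need a branch of this kind wherever the value feeds arithmetic or logging.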

Metadata

Assignees

No one assigned

Labels

  • area: config-cli (Config schema, CLI commands, YAML)
  • priority: P1 (High, must address this cycle)
  • type: bug (Something isn't working)
