GUIDELLM__MAX_CONCURRENCY is off by 1

The number of concurrent requests in throughput mode is always one less than `Settings.max_concurrency`.

**Example**

```sh
export GUIDELLM__MAX_CONCURRENCY="2"

guidellm --target http://localhost:8000/v1 \
         --model meta-llama/Llama-3.2-3B \
         --data-type emulated \
         --data prompt_tokens=512,generated_tokens=2048 \
         --rate-type throughput \
         --max-seconds 300
```

*Observe on the server side that the number of requests in the queue is 1, not the expected 2.*
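
**Expected behavior (sketch)**

For comparison, here is a minimal sketch of what a concurrency cap is expected to do. It is not GuideLLM's scheduler code; `run_with_limit`, `fake_request`, and `peak_concurrency` are hypothetical names used only for illustration, and the HTTP call is replaced by a short sleep. With a limit of 2, the peak number of in-flight requests should be 2, not 1.

```python
import asyncio


async def run_with_limit(max_concurrency: int, total_requests: int) -> int:
    """Run fake requests under a semaphore and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
            in_flight -= 1

    # Submit everything at once; the semaphore alone bounds concurrency.
    await asyncio.gather(*(fake_request() for _ in range(total_requests)))
    return peak


def peak_concurrency(max_concurrency: int, total_requests: int = 10) -> int:
    """Synchronous wrapper around run_with_limit (hypothetical helper)."""
    return asyncio.run(run_with_limit(max_concurrency, total_requests))


if __name__ == "__main__":
    print(peak_concurrency(2))  # prints 2
```

If the actual scheduler were bounded this way, the server should see 2 simultaneous requests for `GUIDELLM__MAX_CONCURRENCY="2"`.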


GUIDELLM__MAX_CONCURRENCY is off by 1 #70