Closed
Description
Hello,
I am trying to integrate guidellm
into a benchmark suite. There we run different load tests based on user concurrencies. We define a user concurrency as a number of "users" that each send requests one after another: send request -> wait for response -> send next request.
I first assumed that's what "constant" and "rate" do, but far more requests are sent, since they are issued per second. Is there a way to customize the "user concurrency"? I assume that concurrency == the synchronous type. But it would be great if I could do something like
guidellm --target "http://localhost:8080/v1" --model "meta-llama/Meta-Llama-3.1-8B-Instruct" --data-type emulated --data "prompt_tokens=550,generated_tokens=250" --max-seconds 60 --rate-type concurrent --rate 1 --rate 2 --rate 10 --rate 50 --output-path r.json
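For reference, the closed-loop behaviour I mean (each simulated user sends a request, waits for the response, then sends the next, for a fixed duration) could be sketched roughly like this. This is not guidellm code; `send_request` is a hypothetical stand-in for a real call to the OpenAI-compatible endpoint:

```python
import asyncio
import time

async def send_request():
    # Placeholder for an actual HTTP call to the /v1 endpoint;
    # here we only simulate some response latency.
    await asyncio.sleep(0.01)

async def user_loop(stop_at, counter):
    # One closed-loop "user": send -> wait for response -> send next,
    # until the benchmark duration is over.
    while time.monotonic() < stop_at:
        await send_request()
        counter["requests"] += 1

async def run_concurrency(users, duration):
    # Run `users` independent closed loops in parallel; the number of
    # in-flight requests never exceeds `users`.
    counter = {"requests": 0}
    stop_at = time.monotonic() + duration
    await asyncio.gather(*(user_loop(stop_at, counter) for _ in range(users)))
    return counter["requests"]

if __name__ == "__main__":
    total = asyncio.run(run_concurrency(users=4, duration=0.2))
    print(f"total requests: {total}")
```

With a rate-per-second mode the request count grows regardless of response times; with this closed-loop mode the throughput is bounded by the number of users times the per-request latency.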