Performance discrepancy between submission config and performance-only config #199

When running the submission config (accuracy + performance), accuracy samples are issued immediately after the performance samples, so they end up batched with the last few performance samples. This reduces measured performance for the accuracy+performance config compared with the performance-only config, especially under high concurrency. The config files used for testing and the resulting reports are attached. Note that the missing tokens/s report and the "failed" samples are separate issues.
Proposal: issue accuracy samples only after the results of all performance samples have been received.
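
A minimal sketch of the proposed ordering, not the project's actual code: `run_submission`, `issue`, and the sample names are hypothetical, and the real load generator may well not use `asyncio`. The only point is the barrier between the two phases, so that no accuracy sample can share a batch with a pending performance sample.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical type: a coroutine that sends one sample and returns its response.
IssueFn = Callable[[str], Awaitable[str]]


async def run_submission(
    issue: IssueFn,
    perf_samples: Sequence[str],
    acc_samples: Sequence[str],
):
    # Phase 1: issue every performance sample concurrently.
    # gather() returns only once every performance response has arrived,
    # so it doubles as the barrier between the two phases.
    perf_results = await asyncio.gather(*(issue(s) for s in perf_samples))

    # Phase 2: accuracy samples are issued only after the barrier.
    acc_results = await asyncio.gather(*(issue(s) for s in acc_samples))
    return perf_results, acc_results


def demo() -> list:
    """Run both phases against a fake endpoint and return the event log."""
    events = []

    async def fake_issue(sample: str) -> str:
        events.append(f"start:{sample}")
        # Pretend performance samples take longer than accuracy samples.
        await asyncio.sleep(0.01 if sample.startswith("perf") else 0)
        events.append(f"done:{sample}")
        return sample.upper()

    asyncio.run(run_submission(fake_issue, ["perf0", "perf1"], ["acc0"]))
    return events


if __name__ == "__main__":
    print(demo())
```

In the event log from `demo()`, every `done:perf*` entry appears before the first `start:acc*` entry, which is the property the proposal asks for. The cost is a short idle gap between phases while the last performance responses drain.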

[offline_llama3_1b_cnn_full.yaml](https://github.com/user-attachments/files/26222135/offline_llama3_1b_cnn_full.yaml)
[report_1b_full.txt](https://github.com/user-attachments/files/26222136/report_1b_full.txt)
[offline_llama3_1b_cnn_perf.yaml](https://github.com/user-attachments/files/26222134/offline_llama3_1b_cnn_perf.yaml)
[report_1b_perf.txt](https://github.com/user-attachments/files/26222137/report_1b_perf.txt)
