Performance discrepancy between submission config and performance-only config #199

@tianmu-li

Description

When running the submission config (accuracy + performance), accuracy samples are issued immediately after the performance samples, so they get batched together with the last few performance samples. This reduces measured performance with the accuracy + performance config, especially under high concurrency. I am attaching the config files used for testing and the resulting reports. Note that the missing tokens/s report and the "failed" samples are separate issues.
Proposal: issue accuracy samples only after the results from all performance samples have been received.
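
To make the proposal concrete, here is a minimal sketch of the intended ordering, not the project's actual code. The names `run_phases`, `FakeEndpoint`, and `drain_between_phases` are hypothetical and exist only for illustration. The fake endpoint only records whether performance and accuracy samples were ever in flight at the same time, which stands in for the server batching them together.

```python
import asyncio


class FakeEndpoint:
    """Hypothetical stand-in for the inference server (illustration only)."""

    def __init__(self):
        self.in_flight = set()
        # Becomes True if perf and accuracy samples were ever in flight together.
        self.mixed = False

    async def issue(self, sample):
        self.in_flight.add(sample)
        kinds = {kind for kind, _ in self.in_flight}
        if len(kinds) > 1:
            self.mixed = True
        await asyncio.sleep(0.01)  # simulated processing time
        self.in_flight.discard(sample)
        return sample


async def run_phases(issue, perf_samples, acc_samples, drain_between_phases=True):
    """Issue performance samples, then accuracy samples.

    With drain_between_phases=True, accuracy samples are issued only after
    every performance result has been received (the proposed behavior).
    """
    perf_tasks = [asyncio.create_task(issue(s)) for s in perf_samples]
    if drain_between_phases:
        # Barrier: wait for all performance results before the accuracy phase.
        await asyncio.gather(*perf_tasks)
    acc_tasks = [asyncio.create_task(issue(s)) for s in acc_samples]
    perf_results = await asyncio.gather(*perf_tasks)
    acc_results = await asyncio.gather(*acc_tasks)
    return perf_results, acc_results


def demo(drain):
    """Run both phases against a fresh fake endpoint; return whether phases overlapped."""
    endpoint = FakeEndpoint()
    perf = [("perf", i) for i in range(4)]
    acc = [("acc", i) for i in range(2)]
    asyncio.run(run_phases(endpoint.issue, perf, acc, drain))
    return endpoint.mixed
```

Calling `demo(False)` reproduces the overlap described above, while `demo(True)` keeps the two phases separate. The performance timing window would presumably also need to close at the barrier, so that accuracy samples do not count toward the performance measurement.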

offline_llama3_1b_cnn_full.yaml
report_1b_full.txt
offline_llama3_1b_cnn_perf.yaml
report_1b_perf.txt


Labels: area: config-cli (Config schema, CLI commands, YAML); priority: P1 (High, must address this cycle); type: bug (Something isn't working)
