rfc: new benchmark tool #9893

Open
@stas00

This issue is to collect notes and ideas on creating a new benchmarking tool.

This is not about the other speed/memory regression project we have been discussing elsewhere.

This is about integration and various comparisons that we need to run in order to give users the best advice on how to deploy transformers in the most efficient way.

Please share your ideas/suggestions/concerns/needs in the comments, and I will compile them here.

  • important: not part of examples - the goal is performance and integration tooling, not user-facing scripts, which have totally different needs and priorities
  • the cmd line has to continue working the same way months later, so that old benchmarks can be re-run - it is ok to change the interface as long as a back-compat option lets the old benchmarks still be re-validated and compared against
  • ideally work with any transformers model - a single tool to rule them all
  • minimal amount of arguments - just the important ones
  • ability to directly generate markdown table entries and json files that contain not just the outcome but also the key variables being tested
  • the report should also include critical hardware/software params in a compact form and allow these to be merged from multiple recordings - i.e. if the hw/sw are the same, they can be merged into a single report. will need to figure out how to record hardware nuances
    • e.g. the same DDP test with 2 gpus connected w/ NVLink gives dramatically different results than the same 2 gpus w/o NVLink.
    • not sure how to record CPU capacity, free RAM, etc., since all of these impact the outcome
  • crucial to be able to truncate the dataset
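
To make the reporting ideas above more concrete, here is a rough sketch of what a benchmark record could look like. It is not an existing transformers API: `BenchmarkRecord`, `collect_env`, `group_by_env`, `to_markdown`, `to_json` and `truncate` are all hypothetical names, and the environment fields are only what the standard library can report. Things like NVLink presence would have to be passed in by hand via `extra` until we figure out how to detect them.

```python
import json
import platform
from dataclasses import asdict, dataclass
from itertools import islice


@dataclass
class BenchmarkRecord:
    """One benchmark run (hypothetical schema, not part of transformers)."""
    variables: dict  # key variables being tested, e.g. {"fp16": True}
    metrics: dict    # measured outcomes, e.g. {"samples_per_sec": ...}
    env: dict        # hardware/software params the run depended on


def collect_env(extra=None):
    """Gather a compact sw/hw description from the standard library.

    `extra` carries anything that cannot be auto-detected here
    (e.g. {"gpus": 2, "nvlink": True}); those keys are assumptions.
    """
    env = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    env.update(extra or {})
    return env


def truncate(samples, max_samples):
    """Keep only the first `max_samples` items of any iterable dataset."""
    return list(islice(samples, max_samples))


def group_by_env(records):
    """Group records whose hw/sw description is identical, so that each
    group can be merged into a single report."""
    groups = {}
    for rec in records:
        key = json.dumps(rec.env, sort_keys=True)  # order-independent key
        groups.setdefault(key, []).append(rec)
    return groups


def to_markdown(records):
    """Render records as a markdown table: tested variables, then metrics."""
    var_cols = sorted({k for r in records for k in r.variables})
    met_cols = sorted({k for r in records for k in r.metrics})
    cols = var_cols + met_cols
    lines = [
        "| " + " | ".join(cols) + " |",
        "| " + " | ".join("---" for _ in cols) + " |",
    ]
    for r in records:
        cells = [str(r.variables.get(c, "")) for c in var_cols]
        cells += [str(r.metrics.get(c, "")) for c in met_cols]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def to_json(records):
    """Serialize full records (variables, metrics and env) for later merging."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)
```

Grouping on the full `env` dict means two runs on the same 2 gpus, with and without NVLink, end up in separate reports as long as that difference is recorded. The json output keeps the tested variables and the environment next to the outcome, so tables can be regenerated later from old recordings.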

Labels

Benchmarks, WIP
