Allow max_workers to be passed in after evaluator is created #107

danmcp · 2024-08-23T23:37:20Z

This is necessary because serving_gpus might not be available when the evaluator is constructed. A typical flow might be:

- Create evaluator
- Launch server # This is when serving_gpus is probably calculated since it will depend on the model being served
- generate
- judge

The new logic allows max_workers and serving_gpus to be passed in when the generate and judge steps are called.

Once all the known callers (eval and training) have been updated, we can remove the two attributes from the constructors and simplify the logic.
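For readers following along, the flow above can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `SketchEvaluator`, its `generate` and `judge` methods, the returned dictionaries, and the placeholder server URL are all hypothetical names chosen for the example. It only shows the idea that values passed at call time take precedence over constructor defaults.

```python
from typing import Any, Dict, Optional, Union

MaxWorkers = Union[int, str]  # an integer, or the string "auto"


class SketchEvaluator:
    """Hypothetical evaluator used only for illustration (not the real class)."""

    def __init__(
        self,
        model_name: str,
        max_workers: MaxWorkers = "auto",
        serving_gpus: Optional[int] = None,
    ) -> None:
        self.model_name = model_name
        # Constructor values are kept only as fallbacks for later calls.
        self.max_workers = max_workers
        self.serving_gpus = serving_gpus

    def _settings(
        self, max_workers: Optional[MaxWorkers], serving_gpus: Optional[int]
    ) -> Dict[str, Any]:
        # A value passed at call time wins over the constructor value.
        return {
            "max_workers": self.max_workers if max_workers is None else max_workers,
            "serving_gpus": self.serving_gpus if serving_gpus is None else serving_gpus,
        }

    def generate(
        self,
        server_url: str,
        max_workers: Optional[MaxWorkers] = None,
        serving_gpus: Optional[int] = None,
    ) -> Dict[str, Any]:
        # A real implementation would query the served model here.
        settings = self._settings(max_workers, serving_gpus)
        return {"step": "generate", "server_url": server_url, **settings}

    def judge(
        self,
        server_url: str,
        max_workers: Optional[MaxWorkers] = None,
        serving_gpus: Optional[int] = None,
    ) -> Dict[str, Any]:
        # A real implementation would call the judge model here.
        settings = self._settings(max_workers, serving_gpus)
        return {"step": "judge", "server_url": server_url, **settings}


evaluator = SketchEvaluator("my-model")   # serving_gpus is not known yet
serving_gpus = 2                          # placeholder for a value computed at server launch
server_url = "http://localhost:8000/v1"   # placeholder URL
generated = evaluator.generate(server_url, max_workers="auto", serving_gpus=serving_gpus)
judged = evaluator.judge(server_url, max_workers=4, serving_gpus=serving_gpus)
```

Once every caller passes the values at call time, the two constructor arguments in a design like this become dead defaults and can be dropped.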

Corresponding CLI change: instructlab/instructlab#2144
This change will need to be released first, before the CLI change can depend on it.

nathan-weinberg

One question but otherwise LGTM - thanks @danmcp!

src/instructlab/eval/mt_bench.py

mergify · 2024-09-13T14:25:33Z

This pull request has merge conflicts that must be resolved before it can be
merged. @danmcp please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

The reason this is necessary is serving_gpus might not be available when the evaluator is constructed. A typical flow might be:

- Create evaluator
- Launch server # This is when serving_gpus is probably calculated
- generate
- judge

The new logic allows for max_workers and serving_gpus to be passed in as the generate and judge steps occur.

Once all the known callers have been updated (eval and training) we can remove the two attrs from the constructors and make the logic a little simpler.

Signed-off-by: Dan McPherson <dmcphers@redhat.com>

This capability takes advantage of the eval library feature that tunes max_workers according to the hardware configuration. To use it, effective gpus and max_workers="auto" need to be passed to eval, which in turn requires gpus and effective gpus to be calculated earlier in the process.

This change also updates training to use get_gpus and max_workers=auto. Training now uses the gpu settings of evaluate when calling evaluate, and defaults to all gpus only if those settings are not specified.

Resolves: #2079

instructlab/eval#107 is now released, enabling this change.

**Checklist:**

- [ ] **Commit Message Formatting**: Commit titles and messages follow guidelines in the [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/#summary).
- [ ] [Changelog](https://github.com/instructlab/instructlab/blob/main/CHANGELOG.md) updated with breaking and/or notable changes for the next minor release.
- [ ] Documentation has been updated, if necessary.
- [x] Unit tests have been added, if necessary.
- [ ] Integration tests have been added, if necessary.

Approved-by: nathan-weinberg
Approved-by: alimaredia
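As a rough illustration of what resolving max_workers="auto" from the hardware configuration might look like, here is a small sketch. The function name `resolve_max_workers`, the `workers_per_gpu` multiplier, and the CPU-count fallback are assumptions made for this example; the actual heuristic used by the eval library is not described in this thread and may differ.

```python
import os
from typing import Optional, Union


def resolve_max_workers(
    max_workers: Union[int, str],
    serving_gpus: Optional[int],
    workers_per_gpu: int = 4,  # assumed multiplier, purely illustrative
) -> int:
    """Turn an explicit count or "auto" into a concrete worker count."""
    if isinstance(max_workers, int):
        if max_workers < 1:
            raise ValueError("max_workers must be at least 1")
        return max_workers
    if max_workers != "auto":
        raise ValueError('max_workers must be a positive integer or "auto"')
    if serving_gpus is not None and serving_gpus > 0:
        # Assumed rule: scale the worker count with the GPUs serving the model.
        return serving_gpus * workers_per_gpu
    # Assumed fallback when no GPU count is known: use the CPU count.
    return os.cpu_count() or 1


explicit = resolve_max_workers(8, serving_gpus=None)       # explicit value is kept
scaled = resolve_max_workers("auto", serving_gpus=2)       # 2 GPUs * 4 workers each
fallback = resolve_max_workers("auto", serving_gpus=None)  # depends on the machine
```

This also shows why the GPU count has to be known before the call: with "auto", the result depends on serving_gpus, so it cannot be fixed at construction time if the server has not been launched yet.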

danmcp mentioned this pull request Aug 23, 2024

Add auto max_workers capability instructlab/instructlab#2144

Merged

5 tasks

nathan-weinberg requested a review from a team September 12, 2024 14:56

nathan-weinberg approved these changes Sep 12, 2024

src/instructlab/eval/mt_bench.py (review thread resolved)

nathan-weinberg requested a review from a team September 12, 2024 15:22

mergify bot added the one-approval label Sep 12, 2024

mergify bot added the needs-rebase label Sep 13, 2024

danmcp force-pushed the automaxworkers branch from 8e54afd to 52328fd Compare September 13, 2024 15:01

mergify bot removed the needs-rebase label Sep 13, 2024

alinaryan approved these changes Sep 23, 2024


mergify bot merged commit 893b6ec into instructlab:main Sep 23, 2024

mergify bot removed the one-approval label Sep 23, 2024

danmcp mentioned this pull request Sep 26, 2024

Remove max_workers and serving_gpus from constructor #140

Merged
