[API] Make dataset attribute optional #85

Merged on Jan 17, 2023 (2 commits)

younesbelkada

What does this PR do?

This PR makes the dataset attribute optional for the PPOTrainer to simplify the API. Currently one needs to create a Dataset class in order to initialize the PPOTrainer. We can make things simpler by making this attribute optional.
However, the PPOTrainer training loop relies heavily on the config.batch_size attribute, so I suggest warning users to set config.batch_size to the correct batch size before running anything.
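
To illustrate the idea, here is a minimal sketch of an optional dataset argument combined with a `batch_size` warning. It is not the code from this PR: `SketchConfig` and `SketchTrainer` are hypothetical stand-ins, and the warning text is an assumption.

```python
import warnings
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SketchConfig:
    # Hypothetical stand-in for the trainer config; only batch_size matters here.
    batch_size: int = 256


class SketchTrainer:
    """Hypothetical trainer showing an optional dataset argument."""

    def __init__(self, config: SketchConfig, dataset: Optional[Sequence] = None):
        self.config = config
        self.dataset = dataset
        if dataset is None:
            # Without a dataset there is nothing to build batches from, so the
            # caller is responsible for feeding batches of config.batch_size.
            warnings.warn(
                "No dataset provided. Make sure config.batch_size matches "
                "the batch size you pass to the training loop.",
                UserWarning,
            )


# Usage: constructing the trainer without a dataset only emits a warning.
trainer = SketchTrainer(SketchConfig(batch_size=4))
```

In this sketch, passing any sequence as `dataset` skips the warning, so existing callers that already build a dataset see no change in behavior.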

Also updated and tested the script in the README.

@lvwerra

- added a warning for the `batch_size` issue
- added tests
HuggingFaceDocBuilderDev commented Jan 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@younesbelkada younesbelkada mentioned this pull request Jan 16, 2023

@lvwerra lvwerra left a comment

LGTM!

@younesbelkada younesbelkada merged commit 7a4780a into huggingface:main Jan 17, 2023