[ray] Fix datasets_modules ImportError with Ray Tune #12749
dynamic_modules_path = os.path.join(datasets.load.init_dynamic_modules(), "__init__.py")
# load dynamic_modules from path
spec = importlib.util.spec_from_file_location("datasets_modules", dynamic_modules_path)
datasets_modules = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = datasets_modules
spec.loader.exec_module(datasets_modules)
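The loading pattern in the diff lines above can be tried in isolation. The following is a minimal sketch, not the PR's code: a temporary directory stands in for the path returned by datasets.load.init_dynamic_modules(), and the helper name load_package_from_path and the module name demo_dynamic_modules are hypothetical.

```python
import importlib.util
import os
import sys
import tempfile


def load_package_from_path(name, package_dir):
    """Import the package whose __init__.py lives in package_dir, under `name`."""
    init_path = os.path.join(package_dir, "__init__.py")
    spec = importlib.util.spec_from_file_location(name, init_path)
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, mirroring the order in the diff.
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module


# Assumption: a throwaway directory replaces the real dynamic modules path.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "__init__.py"), "w") as f:
    f.write("MARKER = 'loaded'\n")

demo_module = load_package_from_path("demo_dynamic_modules", demo_dir)
print(demo_module.MARKER)
```

Because the file is an __init__.py, the resulting module is treated as a package, so later imports of its submodules by name can resolve from the same directory.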
is it possible to use runtime environments here instead? just curious
Not without editing Ray Tune itself as it would need to be added to an Actor option in trial executor. Also it doesn't appear you can actually import a module from path in a runtime env (only pip and conda), unless I missed that in the docs
dynamic_modules_path = os.path.join(datasets.load.init_dynamic_modules(), "__init__.py")
# load dynamic_modules from path
spec = importlib.util.spec_from_file_location("datasets_modules", dynamic_modules_path)
datasets_modules = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = datasets_modules
another comment: should this be moved upstream to datasets eventually?
I'm not sure if it belongs there. The actual import needs to happen somewhere in Tune, and here is the most convenient place
OK that's fine then.
looks great to me :)
Great! @richardliaw, is another approval still required?
@sgugger @LysandreJik could you help take a look :)
Thanks for adding this!
LGTM!
What does this PR do?
This PR fixes an ImportError thrown because datasets_modules is not loaded on Ray Actors when tuning hyperparameters with Ray, fixing the following issues:
huggingface/blog#106
huggingface/transformers#11565
https://discuss.huggingface.co/t/using-hyperparameter-search-in-trainer/785/34
https://discuss.huggingface.co/t/using-hyperparameter-search-in-trainer/785/35
Fixes #11565
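To illustrate why a module that is only loaded on the driver can surface as an ImportError elsewhere, here is a single-process simulation. It is a sketch under assumptions: it uses the standard pickle module rather than Ray's own serialization, deleting the entry from sys.modules stands in for a fresh Ray Actor process, and the names demo_dynamic_modules and Builder are hypothetical.

```python
import importlib.util
import os
import pickle
import sys
import tempfile

MODULE_NAME = "demo_dynamic_modules"  # hypothetical stand-in for datasets_modules

# Create a throwaway package that is NOT on sys.path, like a cached dynamic module.
pkg_dir = tempfile.mkdtemp()
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("class Builder:\n    def __init__(self, name):\n        self.name = name\n")


def load_from_path():
    """Load the package from its file path and register it in sys.modules."""
    init_path = os.path.join(pkg_dir, "__init__.py")
    spec = importlib.util.spec_from_file_location(MODULE_NAME, init_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module


# "Driver" side: the module is loaded, so the instance is pickled by class reference.
payload = pickle.dumps(load_from_path().Builder("squad"))

# Simulate a worker in which nothing has loaded the module yet.
del sys.modules[MODULE_NAME]
try:
    pickle.loads(payload)
    failed_without_module = False
except ImportError:
    failed_without_module = True

# Once the module is loaded from path on the "worker", unpickling succeeds.
load_from_path()
restored = pickle.loads(payload)
print(failed_without_module, restored.name)
```

In this simulation the first unpickling attempt fails because the module name cannot be imported normally, and it succeeds once the module has been registered from its path, which is the effect the PR aims for on Ray Actors.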
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.