Enable Ray Fast Register #2606
Signed-off-by: Jan Fiedler <jan@union.ai>
# fast register data with timestamp mtime=0 will be zipped and uploaded to ray gcs
# zip does not support timestamps before 1980 -> hacky workaround of touching all the files
os.system(f"touch `find {working_dir} -type f`")
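The limitation named in the code comment can be reproduced with the standard library alone. The following is a minimal sketch, assuming Python 3.8+ (for `strict_timestamps`); the file name `script.py` and the helper `try_zip` are made up for illustration, and this is not Ray's own packaging code.

```python
import os
import tempfile
import zipfile


def try_zip(path: str, strict: bool) -> str:
    """Zip a single file next to itself and report what happened."""
    archive = path + ".zip"
    try:
        with zipfile.ZipFile(archive, "w", strict_timestamps=strict) as zf:
            zf.write(path, arcname=os.path.basename(path))
        return "ok"
    except ValueError as exc:
        return f"error: {exc}"


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "script.py")
    with open(source, "w") as fh:
        fh.write("print('hello')\n")

    # Simulate fast register data: mtime=0 is the Unix epoch, well before 1980.
    os.utime(source, (0, 0))

    strict_result = try_zip(source, strict=True)    # default zipfile behaviour
    lenient_result = try_zip(source, strict=False)  # clamps the timestamp to 1980-01-01

print(strict_result)
print(lenient_result)
```

With the default strict handling, `zipfile` refuses the file; with `strict_timestamps=False` it clamps the timestamp instead. Whether the zip step inside Ray can be configured this way is not stated in this thread.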
This depends on the system having `touch` + `find`, which is usually the case.
To be safer, can this use `os.utime` + `os.walk` (or `Path.rglob`) to update the mtimes?
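A minimal sketch of what the suggested `os.walk` + `os.utime` replacement could look like. The function name `touch_tree` and its `timestamp` parameter are hypothetical and not taken from the PR.

```python
import os
import time
from typing import Optional


def touch_tree(working_dir: str, timestamp: Optional[float] = None) -> int:
    """Set atime/mtime of every file found under working_dir; return the count."""
    when = time.time() if timestamp is None else timestamp
    updated = 0
    for root, _dirs, files in os.walk(working_dir):
        for name in files:
            # os.utime needs no external binaries, unlike shelling out to touch/find.
            os.utime(os.path.join(root, name), (when, when))
            updated += 1
    return updated
```

Unlike the `os.system` call, this does not break on file names containing spaces and does not depend on a shell being available.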
Good catch! I can try this out & adjust. Also, this should only stay short term. @pingsutw mentioned there is work in progress that allows us to not set mtime=0 for the fast register tar.gz.
I do not think we will do this right now, @fiedlerNr9, because mtime=0 makes it possible to get consistent hashes for similar tar files; otherwise we get multiple uploads to admin.
I think you should do what @thomasjpfan recommends and use `os.walk`. The `os.system` call will break and will cause random bugs.
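To illustrate why a fixed mtime gives consistent hashes, here is a small sketch that builds a `.tar.gz` in memory with all timestamps zeroed and hashes it. This is an illustration under assumptions, not flytekit's actual fast register packaging code; the names `_normalize` and `deterministic_targz_digest` are hypothetical.

```python
import gzip
import hashlib
import io
import os
import tarfile


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Strip metadata that differs between machines or runs."""
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def deterministic_targz_digest(source_dir: str) -> str:
    """Return the SHA-256 of a tar.gz of source_dir that ignores file timestamps."""
    buf = io.BytesIO()
    # filename="" and mtime=0 keep the gzip header free of variable data.
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w") as tar:
            for root, dirs, files in os.walk(source_dir):
                dirs.sort()  # fixed traversal order
                for name in sorted(files):
                    full = os.path.join(root, name)
                    arcname = os.path.relpath(full, source_dir)
                    tar.add(full, arcname=arcname, filter=_normalize)
    return hashlib.sha256(buf.getvalue()).hexdigest()
```

Two archives of the same file contents then hash identically even if the files were touched in between, which is the property that avoids repeated uploads.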
No need to manually adjust the timestamp for this now. I just catch the fast register data and don't zip it at all.
init_params["runtime_env"]["excludes"] = cfg.excludes_working_dir
# fast register data with timestamp mtime=0 will be zipped and uploaded to ray gcs
# zip does not support timestamps before 1980 -> hacky workaround of touching all the files
Can this comment say explicitly how the data is being moved around? My understanding is:
- Flyte's fast register sends the code to the head node
- Ray then zips the working directory and sends it to the Ray GCS
- Ray workers pull the files from Ray's GCS.
Is this correct?
It's exactly this, with the addition that Ray respects the given `excludes` variable (`list[str]`), which functions as an ignore file.
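The zip step with `excludes` described in this thread can be approximated as follows. This is only a simplified sketch: it uses `fnmatch` globbing as a stand-in for Ray's real ignore-file matching, and the function name `zip_working_dir` is hypothetical.

```python
import fnmatch
import os
import zipfile
from typing import List


def zip_working_dir(working_dir: str, archive_path: str, excludes: List[str]) -> List[str]:
    """Zip working_dir into archive_path, skipping paths that match an exclude pattern."""
    included = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(working_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                rel = os.path.relpath(full, working_dir).replace(os.sep, "/")
                top = rel.split("/", 1)[0]
                # Simplified matching: a pattern excludes a file if it matches the
                # relative path, the file name, or the top-level directory.
                if any(
                    fnmatch.fnmatch(rel, pattern)
                    or fnmatch.fnmatch(name, pattern)
                    or fnmatch.fnmatch(top, pattern)
                    for pattern in excludes
                ):
                    continue
                zf.write(full, arcname=rel)
                included.append(rel)
    return sorted(included)
```

For instance, with `excludes=["data", "*.tar.gz"]` a `data/` directory and any top-level tarball would be left out of the archive that the workers later download. The archive should be written outside `working_dir` so it does not pick itself up.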
qq: could we exclude the file during the fast registration? (.gitignore)
Do we have a case where we want to upload a file to the head node but don't want it on the worker nodes?
> Do we have a case where we want to upload a file to the head node but don't want it on the worker nodes?
No, I don't think so.
What file are you trying to exclude? The fast register `.tar.gz`? Actually, I am thinking it's only the `.tar.gz` file that comes with mtime=0, right? If we exclude that, we don't need to modify any timestamps on files?
f98f88d to 061698f
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #2606 +/- ##
===========================================
+ Coverage 47.17% 78.56% +31.38%
===========================================
Files 230 185 -45
Lines 21322 18881 -2441
Branches 3711 3714 +3
===========================================
+ Hits 10059 14834 +4775
+ Misses 11154 3376 -7778
- Partials 109 671 +562
Okay, no need to manually update timestamps here. I now always put the fast register `.tar.gz` in the `excludes` variable, since there is no need to upload the `.tar.gz` to the Ray GCS.
Works and tested with `pyflyte run` and `register`. Would appreciate another review.
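The approach described in this comment could be sketched roughly as below. The pattern `*.tar.gz` and the helpers `build_runtime_env` and `init_ray` are assumptions for illustration; the thread does not give the actual archive name or the final plugin code. The `runtime_env` keys `working_dir` and `excludes` are the ones discussed above.

```python
from typing import Any, Dict, List, Optional

# Hypothetical glob for the fast register archive; the real file name is not given here.
FAST_REGISTER_ARCHIVE_PATTERN = "*.tar.gz"


def build_runtime_env(working_dir: str, user_excludes: Optional[List[str]] = None) -> Dict[str, Any]:
    """Always exclude the fast register archive, then add user-supplied patterns."""
    excludes = [FAST_REGISTER_ARCHIVE_PATTERN]
    for pattern in user_excludes or []:
        if pattern not in excludes:
            excludes.append(pattern)
    return {"working_dir": working_dir, "excludes": excludes}


def init_ray(working_dir: str, user_excludes: Optional[List[str]] = None):
    """Start Ray with the runtime_env built above (requires Ray to be installed)."""
    import ray  # imported lazily so build_runtime_env works without Ray

    return ray.init(runtime_env=build_runtime_env(working_dir, user_excludes))
```

Because the archive with mtime=0 never enters the zip step, no timestamps have to be modified.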
Just to reiterate, this looks great. Thanks, @fiedlerNr9!
Why are the changes needed?
Today, when launching a Ray job in Flyte, only the head node retrieves the fast register data; it is not applied to the Ray worker nodes. As a result, users have to copy the source code into the container's working dir and rebuild the container image for every change to the Ray user code.
This PR uses Ray's functionality for adding files to the Ray job at runtime by passing a `runtime_env` such as `{"working_dir": "some_path"}`. An `excludes` variable can be passed along with `working_dir`; it functions as a .ignore file. This PR introduces an `excludes_working_dir` field in `RayJobConfig`, which allows control over which files are uploaded to the Ray cluster.
What changes were proposed in this pull request?
- Set the `working_dir` & `excludes` init parameters when running remotely
- `ray.init(runtime_env={"working_dir": "/root", "excludes": ["data"]})` zips all fast register data, takes `excludes` into account and uploads it to the Ray GCS -> Ray worker nodes download the zip from the Ray GCS and unzip it
How was this patch tested?
Setup process
Screenshots
Check all the applicable boxes