Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pythonpath "." before loading modules #2673

Merged
merged 16 commits into from
Aug 20, 2024

Conversation

arbaobao
Copy link
Contributor

@arbaobao arbaobao commented Aug 9, 2024

Tracking issue

Why are the changes needed?

Sometimes User defined Dockfile is missing "ENV PYTHONPATH /root", it will cause execute errors.

What changes were proposed in this pull request?

This change will add pythonpath "." before loading modules

How was this patch tested?

build a docker image with Dockfile missing "ENV PYTHONPATH /root"

Setup process

Screenshots

截圖 2024-08-10 清晨7 38 41

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Copy link

codecov bot commented Aug 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.59%. Comparing base (a8f68d7) to head (4afb6ae).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2673      +/-   ##
==========================================
- Coverage   78.91%   77.59%   -1.33%     
==========================================
  Files         316      278      -38     
  Lines       24965    23394    -1571     
  Branches     4012     4012              
==========================================
- Hits        19702    18152    -1550     
+ Misses       4548     4521      -27     
- Partials      715      721       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also remove the PYTHONPATH from this dockerfile? Just want to make sure it works in the integration tests after we remove the PYTHONPATH.

@@ -376,6 +376,9 @@ def _execute_task(
dynamic_addl_distro,
dynamic_dest_dir,
) as ctx:
import sys

sys.path.append(".")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
sys.path.append(".")
if "." not in sys.path:
sys.path.append(".")

Signed-off-by: Nelson Chen <asd3431090@gmail.com>
@arbaobao
Copy link
Contributor Author

arbaobao commented Aug 13, 2024

@pingsutw I has removed the pythonpath in Dockerfile.dev.

Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know what happens if both the current working directory, as an absolute path, and this new '.' are both in the sys path? does that affect python module loading at all?

@arbaobao
Copy link
Contributor Author

@wild-endeavor I think append "." into sys.path is not going to affect python module loading. Just to make sure that user defined Dockerfile works without pythonpath=/root

Comment on lines 381 to 382
if "." not in sys.path:
sys.path.append(".")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to be explicit here and add the fully path for the current working directory:

Suggested change
if "." not in sys.path:
sys.path.append(".")
sys.path.append(os.getcwd())

Although, I do not think there is harm to include the working directory twice. If you want to make sure to not add the working directory twice:

working_dir = os.getcwd()
if all(os.path.realpath(path) != working_dir for path in sys.path):
    sys.path.append(working_dir)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed. This seems safer than adding .. I'm worried about users who use both .relative imports as well as absolute imports. We've seen weird cases of repeat-importing of modules before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thank you for your advice.

Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add dest_dir to PYTHONPATH as well?

@thomasjpfan
Copy link
Member

Could we add dest_dir to PYTHONPATH as well?

Yea, that would give a better experience. In that case, I would inject it into PYTHONPATH in Popen:

p = subprocess.Popen(cmd)

I'll say do it in another PR, so we can discuss dest_dir independent of this PR.

Signed-off-by: Nelson Chen <asd3431090@gmail.com>
Signed-off-by: Nelson Chen <asd3431090@gmail.com>
@arbaobao
Copy link
Contributor Author

I'll say do it in another PR, so we can discuss dest_dir independent of this PR.

I think i will create another PR to discuss dest_dir.
Some of the test errors are due to this error.

@wild-endeavor
Copy link
Contributor

yeah let's discuss dest dir in another pr... if we ever move to a different model, like one where another worker maybe written in a different language is first in charge of downloading and extracting the package, it may be weird for that worker to be setting the pythonpath.

wild-endeavor
wild-endeavor previously approved these changes Aug 16, 2024
Signed-off-by: Nelson Chen <asd3431090@gmail.com>
@arbaobao
Copy link
Contributor Author

@thomasjpfan @wild-endeavor @pingsutw
I think we can add dest_dir instead of adding "." because if we don't give --destination-dir=/xxx when we executes, it uses the default value "." .

@thomasjpfan
Copy link
Member

I think we can add dest_dir instead of adding "." because if we don't give --destination-dir=/xxx when we executes, it uses the default value "." .

I think it's possible to have dest_dir be different from the working directory.

For example, one can set dest_dir=/home/flytekit/my_dest and working directory is /root. There could be python modules in working_dir=/root baked into the image and then fast registration copies files to dest_dir=/home/flytekit/my_dest. In this case, adding /home/flytekit/my_dest to sys.path is not enough. Either the image or flytekit needs to still needs to add /root into the PYTHONPATH.

@arbaobao
Copy link
Contributor Author

Got it. Thank you. i will create another PR to discuss dest_dir

Signed-off-by: Nelson Chen <asd3431090@gmail.com>
Signed-off-by: Nelson Chen <asd3431090@gmail.com>
@wild-endeavor wild-endeavor merged commit 6bcedc3 into flyteorg:master Aug 20, 2024
98 of 101 checks passed
@@ -50,8 +50,5 @@ RUN SETUPTOOLS_SCM_PRETEND_VERSION_FOR_FLYTEKIT=$PSEUDO_VERSION \
&& chown flytekit: /home \
&& :


ENV PYTHONPATH="/flytekit:/flytekit/plugins/flytekit-k8s-pod:/flytekit/plugins/flytekit-deck-standard:"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this broke flyteremote integration tests due to the way tasks+workflows are registered. The tests end up present in the image under the tests module, which lives under the flytekit directory and since we had /flytekit as part of PYTHONPATH the task was able to be run once pyflyte-execute ran.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix in #2701.

@eapolinario eapolinario mentioned this pull request Aug 22, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants