Symlink resolution during configuration detection might break some CLI commands #2346

Closed
astrojuanlu opened this issue Feb 20, 2023 · 12 comments · Fixed by #3742
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed

@astrojuanlu
Member

Description

While debugging another issue, I found an interesting behavior that only happens in very specific circumstances. Since on macOS /tmp is a symlink to /private/tmp, while trying to instantiate a KedroCLI object I was getting a long error:

In [2]: from kedro.framework.cli.cli import KedroCLI

In [3]: cli = KedroCLI("/tmp/test-kedro-ipython-regular/")
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
...
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Source path '/private/tmp/test-kedro-ipython-regular/src' has to be relative to your project root '/tmp/test-kedro-ipython-regular/'.

which of course was addressed by using the resolved path instead:

In [10]: cli = KedroCLI("/private/tmp/test-kedro-ipython-regular/")

In [11]: cli
Out[11]: <KedroCLI None>

Context

I guess this never happens in real life because of how the codebase calls the initializer, but it looks like the brittleness could be averted by replacing .resolve() with .absolute() here:

project_path = Path(project_path).expanduser().resolve()

I'm not sure if there are other consequences potentially hidden. This line has been there "forever" (dc39a96).
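Editorial sketch: the difference between .resolve() and .absolute() discussed above can be reproduced with the standard library alone. The directory names and the is_inside helper below are hypothetical and are not part of Kedro; the snippet only illustrates why a resolved source path stops being relative to a project root that was reached through a symlink.

```python
import os
import tempfile
from pathlib import Path


def is_inside(source_path: Path, project_root: Path) -> bool:
    """Return True if source_path is lexically located under project_root."""
    try:
        source_path.relative_to(project_root)
        return True
    except ValueError:
        return False


# Hypothetical layout: a real project directory plus a symlink pointing at it,
# mimicking /private/tmp/... versus /tmp/... on macOS.
base = Path(tempfile.mkdtemp()).resolve()
real_root = base / "real-project"
(real_root / "src").mkdir(parents=True)
link_root = base / "link-project"
os.symlink(real_root, link_root, target_is_directory=True)

resolved_src = (link_root / "src").resolve()   # follows the symlink
absolute_src = (link_root / "src").absolute()  # keeps the symlinked spelling

resolved_ok = is_inside(resolved_src, link_root)
absolute_ok = is_inside(absolute_src, link_root)
print(resolved_ok, absolute_ok)
```

The resolved source path is not under the symlinked root, while the merely absolute one is. Creating symlinks may require extra privileges on Windows.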

Steps to Reproduce

(given above)

Expected Result

The KedroCLI can be instantiated in paths regardless of symlinks.

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

  • Kedro version used (pip show kedro or kedro -V): 0.18.4
  • Python version used (python -V): 3.10.9
  • Operating system and version: macOS Ventura 13.2.1
@merelcht
Member

Test if this also breaks datasets or only the CLI.

@astrojuanlu astrojuanlu added this to the Make it easier to use Kedro as a library milestone Aug 29, 2023
@yetudada yetudada modified the milestones: Make it easier to use Kedro as a library, Using Kedro with existing projects Sep 4, 2023
@noklam
Contributor

noklam commented Sep 5, 2023

Opposite idea: what if we make sure both project_path and source_path are resolved? On the other hand, I think this is a rather niche case and doesn't necessarily need to be included in the milestone. In any case the priority is "Low". @astrojuanlu
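Editorial sketch of the "resolve both paths" idea, assuming a hypothetical validate_source_path helper. It is not Kedro's actual implementation and says nothing about what the eventual fix does; it only shows that once both sides go through .resolve(), the spelling used by the caller no longer matters.

```python
import os
import tempfile
from pathlib import Path


def validate_source_path(source_path, project_path) -> Path:
    """Hypothetical helper: resolve both paths, then compare them."""
    source = Path(source_path).expanduser().resolve()
    root = Path(project_path).expanduser().resolve()
    try:
        return source.relative_to(root)
    except ValueError as exc:
        raise ValueError(
            f"Source path '{source}' has to be relative to your project root '{root}'."
        ) from exc


# Hypothetical layout: a real project directory and a symlink to it.
base = Path(tempfile.mkdtemp()).resolve()
real_root = base / "real-project"
(real_root / "src").mkdir(parents=True)
link_root = base / "link-project"
os.symlink(real_root, link_root, target_is_directory=True)

# Both the symlinked and the mixed spelling validate to the same relative path.
via_link = validate_source_path(link_root / "src", link_root)
mixed = validate_source_path(real_root / "src", link_root)
print(via_link, mixed)
```

A path that is genuinely outside the project root still raises ValueError, so the check itself is preserved.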

@astrojuanlu
Member Author

A similar problem was spotted by a user on Windows (which, as far as I know, doesn't have symlinks, though I'm unsure): https://linen-slack.kedro.org/t/15723534/hello-everyone-i-m-relatively-new-to-learning-kedro-and-i-ve#36c3d6d1-3934-4b02-a000-95a3f76e18fb

image

(Notice the paths in the traceback)

@noklam
Contributor

noklam commented Sep 18, 2023

Windows does have something similar to symlinks; I used to create them with PowerShell.

@astrojuanlu
Member Author

Another user affected by this on Windows: https://linen-slack.kedro.org/t/16006555/hello-everyone-anyone-ever-tried-using-prefect-on-their-pipe#2851de33-7433-4492-bd70-c551ad7b2422

Finished in state Failed("Flow run encountered an exception. ValueError: Source path 'C:\\Users\\KodeCraft-3\\AppData\\Local\\Temp\\tmpcc05q7ogprefect\\src' has to be relative to your project root 'C:\\Users\\KODECR~1\\AppData\\Local\\Temp\\tmpcc05q7ogprefect'.")
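Editorial sketch: the two paths in this message differ only in the user directory, where KODECR~1 looks like a short (8.3-style) alias of KodeCraft-3. That reading is an assumption based on the message alone. Because relative_to compares paths lexically, the mismatch can be reproduced on any operating system with PureWindowsPath; the is_relative helper is hypothetical.

```python
from pathlib import PureWindowsPath


def is_relative(source: PureWindowsPath, root: PureWindowsPath) -> bool:
    """Return True if source is lexically located under root."""
    try:
        source.relative_to(root)
        return True
    except ValueError:
        return False


# Paths copied from the error message above.
source = PureWindowsPath(
    r"C:\Users\KodeCraft-3\AppData\Local\Temp\tmpcc05q7ogprefect\src"
)
short_root = PureWindowsPath(
    r"C:\Users\KODECR~1\AppData\Local\Temp\tmpcc05q7ogprefect"
)
# Assumed long-name spelling of the same directory.
long_root = PureWindowsPath(
    r"C:\Users\KodeCraft-3\AppData\Local\Temp\tmpcc05q7ogprefect"
)

short_ok = is_relative(source, short_root)
long_ok = is_relative(source, long_root)
print(short_ok, long_ok)
```

The comparison fails against the short-name root and succeeds against the long-name one, so no symlink is needed for the error to appear: any two spellings of the same directory are enough.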

@astrojuanlu
Member Author

I think Prefect specifically might be generating these temporary directories and hitting that error in the codebase.

@astrojuanlu
Member Author

A workaround for Prefect:

He just set project_path to the absolute path of the root directory instead of Path.cwd(); it seems that Path.cwd() doesn't return the root directory of the project.

https://linen-slack.kedro.org/t/16006555/hello-everyone-anyone-ever-tried-using-prefect-on-their-pipe#b0ffc149-6686-4252-a78a-44abfc20b222

@astrojuanlu
Member Author

@astrojuanlu astrojuanlu changed the title Symlink resolution during configuration detection might break CLI instantiation Symlink resolution during configuration detection might break some CLI commands Dec 17, 2023
@astrojuanlu astrojuanlu added the Issue: Bug Report 🐞 Bug that needs to be fixed label Dec 17, 2023
@astrojuanlu
Member Author

Another user communicated in private that they're also having this problem:

Source path 'C:\Users\USERNAME\OneDrive - McKinsey & Company\Desktop\Code\kedro_test\spaceflight-pandas\src' has to be relative to your project root 'C:\Users\USERNAME\Desktop\Code\kedro_test\spaceflight-pandas'.

So this affects Windows too.

@cyrenaique

cyrenaique commented Mar 6, 2024

Any news about the UNC problem on Windows?

 (Spaceflights Pandas): Spa
\\mount\data\Temp\spa
**CMD.EXE was started with the above path as the current directory.
UNC paths are not supported.  Defaulting to Windows directory.**

Congratulations!
Your project 'Spa' has been created in the directory
\\mount\data\Temp\spa

And afterwards, kedro run fails:

🔋 100% 🕙[ 09:15:35 ] ➜ kedro run
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ \site-packages\kedro\framework\startup.py:136 │
│ in _validate_source_path                                                                         │
│                                                                                                  │
│ Lib\pathlib.py:730 in relative_to                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: '\\\\mount\\data\\Temp\\spa\\src' is not in the subpath of 'L:\\Temp\\spa' OR one path is relative and the other is absolute.

The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198                                                                       │
│ in _run_code:88                                                                                  │
│                                                                                                  │
│ in <module>:7                                                                                    │
│                                                                                                  │
│ Lib\site-packages\kedro\framework\cli\cli.py:197 │
│ in main                                                                                          │
│                                                                                                  │
│Lib\site-packages\kedro\framework\cli\cli.py:99  │
│ in __init__                                                                                      │
│                                                                                                  │
│Lib\site-packages\kedro\framework\startup.py:163 │
│ in bootstrap_project                                                                             │
│                                                                                                  │
│Lib\site-packages\kedro\framework\startup.py:147 │
│ in _add_src_to_path                                                                              │
│                                                                                                  │
│Lib\site-packages\kedro\framework\startup.py:138 │
│ in _validate_source_path                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Source path '\\mount\data\Temp\spa\src' has to be relative to your project root 'L:\Temp\spa'.
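Editorial sketch: in this traceback the source path is a UNC path while the project root uses a mapped drive letter. Whether L: is mapped to \\mount\data is an assumption that the traceback suggests but does not state. pathlib treats the two as different drives, so relative_to cannot succeed even if they point to the same share:

```python
from pathlib import PureWindowsPath

# Paths copied from the traceback above.
unc_source = PureWindowsPath(r"\\mount\data\Temp\spa\src")
mapped_root = PureWindowsPath(r"L:\Temp\spa")

# For a UNC path, pathlib reports the server and share as the "drive".
unc_drive = unc_source.drive
mapped_drive = mapped_root.drive
print(repr(unc_drive), repr(mapped_drive))

try:
    unc_source.relative_to(mapped_root)
    same_tree = True
except ValueError:
    same_tree = False
print(same_tree)
```

The drives differ, so the lexical check fails regardless of where the mapped drive actually points.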

@astrojuanlu
Member Author

Thanks for the extra info @cyrenaique. This is already high priority for us, we hope to get to it soon.

@noklam
Contributor

noklam commented Mar 26, 2024

I didn't get any error by following the description; then I noticed that I need to create the project first to trigger the error.

cd /tmp
kedro new -n test-kedro-ipython-regular
In [2]: from kedro.framework.cli.cli import KedroCLI

In [3]: cli = KedroCLI("/tmp/test-kedro-ipython-regular/")

ERROR

@noklam noklam mentioned this issue Mar 26, 2024
8 tasks
@noklam noklam linked a pull request Mar 26, 2024 that will close this issue
8 tasks
@noklam noklam moved this from In Progress to In Review in Kedro Framework Mar 28, 2024
@github-project-automation github-project-automation bot moved this from In Review to Done in Kedro Framework Apr 2, 2024