Description
Pipeline registry for hooks
All Kedro CLI commands go through kedro/framework/cli/cli.py::KedroCLI, whose __init__ runs bootstrap_project. Kedro's bootstrap_project reads the user's project metadata and imports the user's settings.py file.
The user's settings.py file (at least in modern Kedro versions) imports ProjectHooks from the user's hooks.py file by default.
If users choose to register their pipelines via the register_pipelines hook inside the ProjectHooks class (or any other class inside hooks.py) and import their pipelines at the module level of hooks.py to use them inside register_pipelines, this triggers a chain of imports starting from bootstrap_project: settings.py -> hooks.py -> imported_pipeline -> imported_pipeline/pipeline.py -> imported_pipeline/nodes.py -> third_party_library.
This is fine for pretty much all CLI commands, since having the project's requirements installed is a sensible prerequisite to running any of them (e.g. kedro run). It can, however, slow CLI commands down in large projects that have many imports, including commands that don't necessarily need the whole project imported to do their work.
The exceptions are kedro install and kedro build-reqs, which are presumably written to help users install their requirements. That is, the starting position when using those commands is that users don't have their project requirements installed.
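The import chain described above can be reproduced without Kedro at all. The following is a minimal sketch, not Kedro's actual code: it writes a throwaway package to a temporary directory (the name demo_project and the file contents are hypothetical, chosen only to mirror the settings.py -> hooks.py -> pipeline.py -> nodes.py chain) and then imports its settings module, which is roughly what bootstrap_project ends up doing.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical project layout that mirrors the chain in the issue.
# Only nodes.py refers to the missing third-party library.
FILES = {
    "demo_project/__init__.py": "",
    "demo_project/settings.py": (
        "from demo_project.hooks import ProjectHooks\n"
        "HOOKS = (ProjectHooks(),)\n"
    ),
    "demo_project/hooks.py": (
        # Module-level pipeline import: this is what pulls in nodes.py.
        "from demo_project.pipelines import data_engineering as de\n"
        "class ProjectHooks:\n"
        "    def register_pipelines(self):\n"
        "        return {'de': de.create_pipeline()}\n"
    ),
    "demo_project/pipelines/__init__.py": "",
    "demo_project/pipelines/data_engineering/__init__.py": (
        "from .pipeline import create_pipeline\n"
    ),
    "demo_project/pipelines/data_engineering/pipeline.py": (
        "from .nodes import split_data\n"
        "def create_pipeline():\n"
        "    return [split_data]\n"
    ),
    "demo_project/pipelines/data_engineering/nodes.py": (
        "import non_existent_library\n"
        "def split_data():\n"
        "    pass\n"
    ),
}


def import_settings(root: Path):
    """Write the demo package under root, import its settings module, and
    return the name of the module that could not be found (or None)."""
    for relative_path, source in FILES.items():
        path = root / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        importlib.import_module("demo_project.settings")
    except ModuleNotFoundError as exc:
        return exc.name
    finally:
        sys.path.remove(str(root))
    return None


with tempfile.TemporaryDirectory() as tmp:
    missing = import_settings(Path(tmp))

print(missing)
```

Importing settings alone is enough to surface the missing library, even though nothing in settings.py or hooks.py refers to it directly.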
Third party hooks
If users want to register a third-party hook that doesn't register itself automatically, their settings.py has to import that hook from the third-party package (third_party_hook here). kedro install imports and runs settings.py, and so fails if third_party_hook isn't already installed.
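As a rough illustration of the third-party case, here is a sketch of what such a settings.py might contain and what happens when it is evaluated without the package installed. The module name third_party_hook comes from the description above; the class name ThirdPartyHook and the exact shape of HOOKS are assumptions made for the example. The settings source is executed from a string here purely so that the example is self-contained.

```python
# Assumed contents of a settings.py that registers a third-party hook.
# ThirdPartyHook is a hypothetical class name used only for illustration.
SETTINGS_SOURCE = """
from third_party_hook import ThirdPartyHook

HOOKS = (ThirdPartyHook(),)
"""


def evaluate_settings(source: str):
    """Execute the settings source and return the name of the module that
    could not be imported, or None if evaluation succeeded."""
    namespace = {}
    try:
        exec(compile(source, "settings.py", "exec"), namespace)
    except ModuleNotFoundError as exc:
        return exc.name
    return None


missing = evaluate_settings(SETTINGS_SOURCE)
print(missing)
```

Any command that evaluates settings.py before the requirements are installed hits this error at the first third-party import.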
Steps to Reproduce
kedro new --starter=pandas-iris
Put an import tensorflow or import non_existent_library statement in data_engineering/nodes.py, then register the data engineering pipeline (or really just import it) at the top level of your hooks.py and try running kedro install.
Expected Result
kedro install should execute and install the project requirements.
Actual Result
kedro install fails with an import error.
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
Kedro version used (pip show kedro or kedro -V): Kedro 0.17.4
Python version used (python -V): Python 3.7.11
Operating system and version: MacOS Mojave
Related issue in starters repo: kedro-org/kedro-starters#38. So this is not necessarily just about hooks: kedro install needs to evaluate settings.py, and any import there from a third-party dependency will break the command.
I'm closing this issue as it is now solved in master as part of [KED-2768] Fix kedro install flow (#1203). Thank you @mzjp2 for the amazing write-up and please shout if you'd like to discuss this further!