Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify KedroContext with frozen attributes instead of frozen class #3300

Merged
merged 12 commits into from
Nov 21, 2023

Conversation

noklam
Copy link
Contributor

@noklam noklam commented Nov 13, 2023

Signed-off-by: Nok nok.lam.chan@quantumblack.com

Description

Fix #1459. Initially we plan to freeze the class to prevent users changing the internal details. After some discussion, this can be too restrictive and we decide to only freeze existing attributes. Users are free to create their own as long as it doesn't conflict with the internal one.

Development notes

For Reviewer

@define(slots=False)  # Enable setting new attributes to `KedroContext`
class KedroContext:
    """``KedroContext`` is the base class which holds the configuration and
    Kedro's main functionality.
    """

    _package_name: str = field(init=True, on_setattr=frozen)
    project_path: Path = field(
        init=True, converter=_expand_full_path, on_setattr=frozen
    )
    config_loader: AbstractConfigLoader = field(init=True, on_setattr=frozen)
    _hook_manager: PluginManager = field(init=True, on_setattr=frozen)
    env: str | None = field(init=True, on_setattr=frozen)
    _extra_params: dict[str, Any] | None = field(
        init=True, default=None, converter=deepcopy, on_setattr=frozen
    )

Looking at the current interface, should we re-order the argument as it seems arbitary, (and) maybe make it keyword only.

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam noklam linked an issue Nov 13, 2023 that may be closed by this pull request
3 tasks
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Nov 13, 2023

@astrojuanlu I think this should work, however I am stuck with a testing issue. There are two problems

  1. Mock doesn't work well with autospec, I've managed to find a workaround to automatically generate the spec.
  2. https://github.com/quantumblacklabs/private-kedro/pull/975 - dynaconf (the library that we used automatically delete the attributes)
mock_cls = create_attrs_autospec(KedroContext)

If you inspect this object you should see that all attributes are there, yet if you look at the test it complains

Input: "Mock object has no attribute 'env'". The issue that I link seems to suggest that this is a known issue.

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Nov 14, 2023

I manage to fix the test.

@Galileo-Galilei Any chance you can test with this branch?

@Galileo-Galilei
Copy link
Contributor

I'll do it tonight

@merelcht
Copy link
Member

Generally this looks good to me. In the description you asked:

Looking at the current interface, should we re-order the argument as it seems arbitary, (and) maybe make it keyword only.

I 100% agree this order looks arbitrary now, especially because there's a mix of "internal"/"private" arguments starting with _ and not. I think now would be the right moment to re-order, because it's a breaking change.

@Galileo-Galilei
Copy link
Contributor

Galileo-Galilei commented Nov 19, 2023

Hi @noklam and thank for adressing it! I haven't looked at the code in detail yet, but I have tested the behaviour and it works as expected:

  • The attributes are indeed frozen, and I can add attributes (just with context.next_attr="value", not need to use a workaround with object.__setattr__)
  • kedro-mlflow does work with this version 👍

I agree that it is likely time to reorder all arguments to make clearer which one are public.

However, I still have more generic concerns on whether all (or even any) attributes should be frozen. The after_context_hook is particularly useful to distribute the logic without relying on the template, which makes it much easier to update and maintain. For instance, we have a plugin at work which register a custom ConfigLoader (basically because we want to fetch the conf from another place, but we want to benefit from all the capabilities of kedro for resolving - related to #2481, but this is another story). If we cannot modify it through the hook, we will need to tell our users to update their templates on their which is basically a nightmare in maintenance.

I can also think of use cases in which I want to update the _extra_params dictionary, but since the underlying object is mutable, context._extra_params.update({"new_param":"value"}) works.

By order of preference:

  1. Do not freeze anything
  2. Freeze all attributes except ConfigLoader
  3. Stay as is (but I am afraid that it will break some of our internal workflows at work, I need to check but I am not sure I could make it before the release)

@noklam
Copy link
Contributor Author

noklam commented Nov 20, 2023

I have decided to not freeze any attribute at the end. Choosing some attributes to freeze and not others feels arbitrary to me, let's just fallback to Python convention with _. On the other hand, we should still respect that this is something internal to Kedro, so this is a "do it at your own risk", but of course we should't break it easily without offering a valid alternative.

Additionally, I re-arrange the order of the argument. Noted that package_name will be removed soon so it's irrelevant.

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam noklam requested a review from merelcht November 20, 2023 09:11
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Galileo-Galilei for testing and explaining your pain points. This is really insightful. I agree with the approach to not freeze anything in the end @noklam 👍

I was just wondering if you then still need all the tests for checking attributes?

RELEASE.md Outdated Show resolved Hide resolved
@@ -158,18 +158,38 @@ def _expand_full_path(project_path: str | Path) -> Path:
return Path(project_path).expanduser().resolve()


@frozen
@define(slots=False) # Enable setting new attributes to `KedroContext`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed now we're not freezing anything?

Copy link
Contributor Author

@noklam noklam Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just wondering if you then still need all the tests for checking attributes?

It is needed, and for this reason I keep one of the test. If you try to set slots=True (or remove this argument since it's default), you will see the test fail. The default for standard Python library is False

https://www.attrs.org/en/stable/api.html#attrs.define - The reason for this is slotted class is more efficient and make sense if it is used as "dataclass" (think of pydantic model that get convert to JSON etc), in this case it's irrelevant

tests/framework/session/test_session.py Outdated Show resolved Hide resolved
tests/framework/session/test_session.py Outdated Show resolved Hide resolved
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @noklam for revisiting this change, good thing we didn't release it the way it was 👍

Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @noklam 😄

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam noklam enabled auto-merge (squash) November 21, 2023 16:30
@noklam noklam merged commit f285f2c into develop Nov 21, 2023
36 checks passed
@noklam noklam deleted the noklam/revisit-make-kedrocontext-1459 branch November 21, 2023 16:50
@Galileo-Galilei
Copy link
Contributor

Thank you everyone and especially @noklam for this change, and sorry for not having been as available as I had wanted to to help for this!

stichbury pushed a commit that referenced this pull request Nov 28, 2023
…3300)

* modify release note

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Make KedroContext partial frozen

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* fix KedroContext and add test for public and internal attributes

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Fix tests

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Rearrange the `KedroContext` arguments

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* improve test_set_new_attribute

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Unfreeze the Kedrocontext attributes

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* clean up

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* remove redundant test

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Revisit: Make KedroContext a dataclass and add config_loader as a property
4 participants