Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make kedro-telemetry a core dependency #3976

Merged
merged 14 commits into from
Jul 9, 2024
Merged

Conversation

ElenaKhaustova
Copy link
Contributor

@ElenaKhaustova ElenaKhaustova commented Jul 1, 2024

Description

Partly solves kedro-org/kedro-plugins#726

Development notes

  • Removed kedro-telemetry dependency from the requirements.txt file in Kedro project template.
  • Modified the pyproject.toml file to include kedro-telemetry in the dependencies section.

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
@ElenaKhaustova
Copy link
Contributor Author

ElenaKhaustova commented Jul 1, 2024

Unit tests fail if kedro-telemetry is installed in the test environment. Uninstalling package solves the problem locally.

Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
@ElenaKhaustova ElenaKhaustova marked this pull request as ready for review July 1, 2024 15:23
@astrojuanlu
Copy link
Member

Key question from me: does this create a circular dependency? kedro depending on kedro-telemetry and viceversa

@ElenaKhaustova
Copy link
Contributor Author

Key question from me: does this create a circular dependency? kedro depending on kedro-telemetry and viceversa

Yes, that's what we obviously missed 🙈

@DimedS shall we consider moving kedro-telemetry to framework instead?

@DimedS
Copy link
Contributor

DimedS commented Jul 1, 2024

I would be happy to incorporate the kedro-telemetry code into Kedro, and I believe it should be done. However, I thought that at this stage, it would be a significant change, so I decided to start with a minor step. As a temporary solution, I don't see any major issues with this type of circular dependency; it essentially means that the code is part of one package. I don't anticipate any usage problems with this approach. Nevertheless, I would be happy to discuss whether we are ready to integrate the telemetry code into Kedro, as this is definitely the correct long-term solution.

@ElenaKhaustova
Copy link
Contributor Author

As far as I understand, pip indeed can deal with circular dependencies as long as they are both not build dependencies of each other - which is our case.
https://discuss.python.org/t/handling-installation-of-circular-dependencies/25531/8

But from the code organisation perspective, it feels wrong to me unless we have reasonable arguments for keeping telemetry in a separate package. Now (with recent changes of making telemetry opt-out), we expect the kedro-telemetry package to be installed anyway, regardless of whether telemetry is collected, so it doesn't reduce framework dependencies. On the other side kedro-telemetry always goes with kedro, which means they probably should live in the same code base, otherwise, we need to understand what advantages we get from keeping them separate.

@DimedS, I agree with your points on long-term solution and am curious what others think.

@DimedS
Copy link
Contributor

DimedS commented Jul 2, 2024

I also tested by creating circular dependencies in two libraries and uploading them to PyPI. Everything worked well, with no warnings during the upload to PyPI or while installing with pip. Based on this, I believe that for the next Kedro release, it would be better to integrate kedro-telemetry as a dependency of Kedro. For the subsequent major release, we can consider integrating the telemetry directly into Kedro, possibly in conjunction with splitting Kedro into different modules.

)
else:
click.echo("No plugins installed")
# plugin_versions now always include kedro-telemetry as it was moved to core kedro dependency
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is better to leave the code as it is now because it's more generic and users still can manually uninstall the kedro-telemetry which will produce "No plugins installed", and I believe in some time in future we will integrate kedro-telemetry inside of kedro and will need to change that back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@astrojuanlu
Copy link
Member

I believe that for the next Kedro release, it would be better to integrate kedro-telemetry as a dependency of Kedro.

Fantastic, thanks 👍🏼

For the subsequent major release, we can consider integrating the telemetry directly into Kedro, possibly in conjunction with splitting Kedro into different modules.

I 100 % agree.

Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Copy link
Contributor

@DimedS DimedS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR, @ElenaKhaustova , tested it, everything works well!

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to wait for a breaking release to move the kedro-telemetry code to Kedro? Currently, telemetry is not a dependency so adding the code into Kedro isn't a breaking change.

pyproject.toml Outdated
@@ -22,6 +22,7 @@ dependencies = [
"gitpython>=3.0",
"importlib-metadata>=3.6,<9.0",
"importlib_resources>=1.3,<7.0", # The `files()` API was introduced in `importlib_resources` 1.3 and Python 3.9.
"kedro-telemetry>=0.3.1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 0.3.1 an acceptable version or should we just up this to 0.5.0 so we get the latest with all improvements on identifying users and projects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO makes sense to update, indeed. @DimedS, wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe it makes sense to update it now, but it's more important to remember to update it with a new kedro-telemetry version after the next release of kedro-telemetry. This should be done after we complete all opt-out work, but before the Kedro release.

@merelcht merelcht requested a review from noklam July 4, 2024 12:49
@ElenaKhaustova
Copy link
Contributor Author

ElenaKhaustova commented Jul 4, 2024

Do we have to wait for a breaking release to move the kedro-telemetry code to Kedro? Currently, telemetry is not a dependency so adding the code into Kedro isn't a breaking change.

It's not a breaking change the only thing which is blocking it is some additional code restructuring that needs to be done before which @DimedS mentioned above #3976 (comment). Otherwise, we can proceed with moving.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add this to the release notes!

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I seen it in the discussion, it seems like circular dependencies wasn't a big problem, so I don't have problem with this approach.

I test it with uv to see what happens if I try to force an incompatible dependencies:

(kedro) gitpod /workspace/kedro (feature/telemetry-core) $ uv pip install -e . kedro-tele
metry==0.3.1
× No solution found when resolving dependencies:
╰─▶ Because only kedro==0.19.6 is available and kedro==0.19.6 depends on
kedro-telemetry>=0.5.0, we can conclude that all versions of kedro depend on
kedro-telemetry>=0.5.0.
And because you require kedro-telemetry==0.3.

  1. Do we have an issue to track the long term solution? Is it a breaking change or something that we can already implement in 0.19.x?

@ElenaKhaustova
Copy link
Contributor Author

As I seen it in the discussion, it seems like circular dependencies wasn't a big problem, so I don't have problem with this approach.

I test it with uv to see what happens if I try to force an incompatible dependencies:

(kedro) gitpod /workspace/kedro (feature/telemetry-core) $ uv pip install -e . kedro-tele
metry==0.3.1
× No solution found when resolving dependencies:
╰─▶ Because only kedro==0.19.6 is available and kedro==0.19.6 depends on
kedro-telemetry>=0.5.0, we can conclude that all versions of kedro depend on
kedro-telemetry>=0.5.0.
And because you require kedro-telemetry==0.3.

  1. Do we have an issue to track the long term solution? Is it a breaking change or something that we can already implement in 0.19.x?

It's not a breaking change and we agreed to make it before the rest of the modularisation.

Here is an issue created: #3999

Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
@ElenaKhaustova ElenaKhaustova enabled auto-merge (squash) July 9, 2024 15:05
@ElenaKhaustova ElenaKhaustova enabled auto-merge (squash) July 9, 2024 15:11
@ElenaKhaustova ElenaKhaustova merged commit ff4bbb5 into main Jul 9, 2024
41 checks passed
@ElenaKhaustova ElenaKhaustova deleted the feature/telemetry-core branch July 9, 2024 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants