-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(telemetry): Single project identifier #701
Conversation
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do a full code review but I think this is mostly in line with what's expected for #507.
However, I'm still observing a weird behavior for packaged projects, and since this PR introduces a change that can modify pyproject.toml
files unrelated to Kedro, I opened kedro-org/kedro#3902 in case it's something we want to look at more broadly. I'm fearing that this might lead to false positives (like: the user runs a packaged Kedro project, which does not have a project_uuid
, but since there happens to be a pyproject.toml
file there, we get a false UUID). Would love to get opinions on that one before merging this PR.
One last thought: I'm assuming that we will introduce a change in kedro new
so that projects have this project_uuid
on pyproject.toml
right after creation. Any objections/thoughts?
I understand your arguments on modifying Theoretically we don't need |
I didn't immediately understand what @astrojuanlu meant with the That being said, I'd prefer it if we can ensure that the |
@merelcht, @astrojuanlu, thank you for sharing your thoughts. Based on them, I'll add logic to differentiate Kedro-related |
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a minor comment, but other than that this now looks good to me and works as I expected 🙂
return project_id | ||
except KeyError: | ||
logging.debug( | ||
f"Failed to retrieve project id or save project id: {str(pyproject_path)} does not relate to Kedro" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make this a bit more specific by mentioning that the file doesn't contain a tool.kedro
section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with 1 comment ⭐
project_id = pyproject_data["tool"]["kedro_telemetry"]["project_id"] | ||
except KeyError: | ||
project_id = uuid.uuid4().hex | ||
toml_string = ( | ||
f'\n[tool.kedro_telemetry]\nproject_id = "{project_id}"\n' | ||
) | ||
file.write(toml_string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with @merelcht above
In fact, I'd argue that the
project_id
should be added to[tool.kedro]
and not in an entirely new section for kedro-telemetry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 100% agree, there are two reasons why I added it to the separate section for now.
The first is because toml.dump
changes formatting like so:
[project.optional-dependencies]
docs = [
"docutils<0.18.0",
"sphinx~=3.4.3",
"sphinx_rtd_theme==0.5.1",
"nbsphinx==0.8.1",
"sphinx-autodoc-typehints==1.11.1",
"sphinx_copybutton==0.3.1",
"ipykernel>=5.3, <7.0",
"Jinja2<3.1.0",
"myst-parser~=0.17.2",
]
->
[project.optional-dependencies]
docs = [ "docutils<0.18.0", "sphinx~=3.4.3", "sphinx_rtd_theme==0.5.1", "nbsphinx==0.8.1", "sphinx-autodoc-typehints==1.11.1", "sphinx_copybutton==0.3.1", "ipykernel>=5.3, <7.0", "Jinja2<3.1.0", "myst-parser~=0.17.2",]
Which we do not like but I can replace it with custom file parsing to keep it proper.
- The main one is because we validate expected keys in the
[tool.kedro]
section and raise an error in case of mismatch: https://github.com/kedro-org/kedro/blob/27f5405cefd6701ffac4c6243030486fb7d3c942/kedro/framework/startup.py#L107
So, to make it properly, we need changes on the framework side: possibly updating kedro new
, as we discussed above. The question is whether we want to merge it like this for now or wait until we apply updates on the side of the framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh that "found unexpected keys" indeed would make any extra key in [tool.kedro]
unfeasible for older Kedro versions. I don't have an opinion right now on whether it makes sense to allow more keys going forward, but it would not be a solution for our case due to lots of old Kedro projects out there in the wild, I think.
About style preservation, I'm wondering if tomlkit is a better choice here, but again this would introduce an extra dependency for very little gain.
Waiting for @merelcht's signoff but I'm okay moving forward with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ElenaKhaustova thanks for explaining! That's unfortunate, but in that case I'd also be in favour to just merge it like this. It's more of a "nice to have" than a must anyway.
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
* Moved pyproject config name to constant Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Implemented _get_or_create_project_uuid Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Refactored _get_project_properties Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed tests from hanging Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed _is_known_ci_env Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed test_before_command_run Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed TestKedroTelemetryCLIHooks Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed TestKedroTelemetryProjectHooks Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed writing to pyproject.toml Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed write mock Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Removed debug output Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Refactored _add_tool_properties Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Added debug message when pyproject_path does not exist Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Updated release notes Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Return None as project UUID in case of not generated Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed pre-commit errors Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Updated the way project UUID is stored Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Renamed project_uuid -> project_id Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Check if pyproject file relates to kedro Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Changed debug message as suggested Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Added OSError handling Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> * Fixed unit test Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> --------- Signed-off-by: Elena Khaustova <ymax70rus@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com>
Description
FIX #507
Related docs update kedro-org/kedro#3897
Development notes
Manually tested the following scenario:
project id
does not exist - it is generated and saved inpyproject.toml
project id
exists - it is loaded frompyproject.toml
, joined with thepackage name
, hashed and usedproject id
exists - it is loaded frompyproject.toml
, joined with the newpackage name
, hashed and usedpyproject.toml
- works without errorspyproject.toml
-project id
is generated and saved inpyproject.toml
pyproject.toml
-project id
is NoneChecklist
RELEASE.md
file