-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/replace identity in kedro telemetry with unique username #58
Feat/replace identity in kedro telemetry with unique username #58
Conversation
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
b693dd8
to
02187f1
Compare
The CI config has been modified and it works fine, but this is very hard to reason. I think it makes sense to refactor this into
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation looks good to me 🙂
On the point of grouping all "anonymous" events together: I think we should keep the same behaviour as before where if we can't determine the username we don't send any data at all. You've already switched to getpass.getuser()
which is more robust then os.getlogin()
afaik. But it would be good to hear what @yetudada thinks about this.
@@ -128,19 +127,6 @@ def test_before_command_run_no_consent_given(self, mocker, fake_metadata): | |||
|
|||
mocked_heap_call.assert_not_called() | |||
|
|||
def test_before_command_run_socket_timeout(self, mocker, fake_metadata): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed now you removed the timeout logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation looks good to me 🙂
On the point of grouping all "anonymous" events together: I think we should keep the same behaviour as before where if we can't determine the username we don't send any data at all. You've already switched to
getpass.getuser()
which is more robust thenos.getlogin()
afaik. But it would be good to hear what @yetudada thinks about this.
So I think following standup, it should be that if we can't determine the username
they should get the unique Heap ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, I made this change in 7181e4d.
As I understand it from the heap docs, if the request is made with no identity
field, heap automatically assigns the unique ID. In the case that no username can be found, the identity
field is removed from the request.
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍 Don't forget to update the release notes with this change.
Pasting our DM here for clarity: Regarding the telemetry ticket, your implementation looks good. Heap automatically assigns a User ID for each user and pools those users who have the same Identity into that User ID profile. So, I believe what Yetu meant in the case where we can't determine the Identity, we shouldn't set anything to Identity including "anonymous" and just let heaps automatic User ID pick it up. This seems to be the recommended method that heap suggests: https://developers.heap.io/docs/using-identify#identify-pitfalls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Description
Resolves #50
This PR assigns the hashed username of the Kedro user to the
identity
field in calls to the Heap API. This allows heap to aggregate data collected by the same user. A defaultusername
of 'anonymous' is assigned in the case that it cannot be determined. Currently, in this case, theidentity
is also set to 'anonymous'. This would cause Heap to pool together all 'anonymous' identities; it is unclear if this meets requirements, some feedback here would be good.Development notes
The user's username is collected in before_command_run, using
getpass.getuser()
(the best cross-platform choice according to thepwd
docs). It is then passed to_format_user_cli_data
and_send_heap_event
. Tests have been changed to reflect this.Checklist
RELEASE.md
file