Create high-level user/project statistics on Heap #36

Closed
yetudada opened this issue Jun 9, 2022 · 1 comment
@yetudada
Contributor

yetudada commented Jun 9, 2022

Description

Product usage analytics help us understand how Kedro is used. This information tells us whether a feature has succeeded and shows us where we need to improve our approach.

We shipped the first version of Kedro-Telemetry to understand the usage of the CLI and Kedro-Viz. However, we're still missing some high-level information like:

  • How many Kedro users do we have? The research question is, "How many users identified by username have run at least one CLI command?"
  • How many users of Kedro-Viz do we have? The research question is, "How many users identified by username have opened Kedro-Viz or run the kedro viz CLI command?"
  • How many users of Kedro-Viz experiment tracking do we have? The research question is, "How many users identified by username have opened Kedro-Viz and have opened the runsList or experiment-tracking pages on Kedro-Viz?"
  • How many projects are using Kedro? The research question is, "How many projects identified by project_name have had someone run at least one CLI command from that project?"

All of these values assume that Kedro-Telemetry is installed and activated according to our consent-based workflow.
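As a rough illustration, the four research questions above reduce to distinct counts over filtered events. The sketch below runs on a small in-memory list; the record layout (`source`, `name`, `page`) and the sample values are assumptions made for this example, not the actual Kedro-Telemetry schema or a Heap query.

```python
# Hypothetical event records. Field names and values are illustrative only.
events = [
    {"username": "u1", "project_name": "p1", "source": "cli", "name": "kedro run"},
    {"username": "u1", "project_name": "p1", "source": "cli", "name": "kedro viz"},
    {"username": "u2", "project_name": "p2", "source": "viz", "name": "page_view",
     "page": "flowchart"},
    {"username": "u2", "project_name": "p2", "source": "viz", "name": "page_view",
     "page": "experiment-tracking"},
    {"username": "u3", "project_name": "p1", "source": "cli", "name": "kedro new"},
]


def distinct(records, key, predicate):
    """Return the set of distinct values of `key` among records matching `predicate`."""
    return {r[key] for r in records if predicate(r)}


def is_cli(event):
    # Any CLI command counts as Kedro usage.
    return event["source"] == "cli"


def is_viz(event):
    # Either Kedro-Viz was opened, or the `kedro viz` CLI command was run.
    return event["source"] == "viz" or event["name"] == "kedro viz"


def is_experiment_tracking(event):
    # A Kedro-Viz event on the runsList or experiment-tracking pages.
    return event["source"] == "viz" and event.get("page") in {
        "runsList",
        "experiment-tracking",
    }


stats = {
    "kedro_users": len(distinct(events, "username", is_cli)),
    "kedro_viz_users": len(distinct(events, "username", is_viz)),
    "experiment_tracking_users": len(
        distinct(events, "username", is_experiment_tracking)
    ),
    "kedro_projects": len(distinct(events, "project_name", is_cli)),
}
print(stats)
```

Every count depends on `username` and `project_name` being reliable grouping keys.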

Context

Part of the difficulty may lie in how user identities are defined on Heap.

Heap recognises two types of properties: user and event. Properties are pieces of metadata captured during user interactions with the application. User properties describe the user. In contrast, event properties describe the actions the user takes.

Here is what I have observed:

  • On Heap, a User ID is created, and it is unknown whether this field maps 1-to-1 to the username field collected by Kedro-Telemetry. Heap uses this field to generate its built-in charts on the Number of Users.
  • On our side, we send all of our user properties, such as username and even project_name, as event properties when they might belong in user properties.
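To make the distinction concrete, here is a minimal sketch of the two payload shapes. The key names (`app_id`, `identity`, `event`, `properties`) follow my understanding of Heap's server-side track and add-user-properties request bodies and should be checked against Heap's documentation. The helper functions, the app ID, the event name, and the sample values are all hypothetical. Nothing is sent over the network.

```python
HEAP_APP_ID = "example-app-id"  # placeholder, not a real Heap app ID


def track_payload(identity: str, event: str, properties: dict) -> dict:
    """Build a body for an event call; `properties` are event properties."""
    return {
        "app_id": HEAP_APP_ID,
        "identity": identity,
        "event": event,
        "properties": properties,
    }


def user_properties_payload(identity: str, properties: dict) -> dict:
    """Build a body for a user-properties call; `properties` attach to the user."""
    return {"app_id": HEAP_APP_ID, "identity": identity, "properties": properties}


# Shape described in the observations: username and project_name travel as
# event properties. The identity value here is a hypothetical opaque ID.
current_event = track_payload(
    identity="opaque-id-123",
    event="CLI command",
    properties={
        "username": "example_user",
        "project_name": "example_project",
        "command": "kedro run",
    },
)

# One possible alternative: username becomes the identity, and project_name
# moves to user properties. This is an assumption, not a decided design.
alternative_event = track_payload(
    identity="example_user",
    event="CLI command",
    properties={"command": "kedro run"},
)
alternative_user = user_properties_payload(
    identity="example_user",
    properties={"project_name": "example_project"},
)
```

In the alternative shape, a single user property can hold only one project_name per user, which is one reason the definition of a user needs discussion.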

Possible Implementation

There are two parts to this:

  1. How do we get a consistent user identifier on Heap? Can we use the User ID field? Can we send username to replace User ID?
  2. How do we make it possible to create a summative view of projects on Heap? This may involve adding project_name or another project identifier to user properties.

Re: Point 2. This would require some discussion about what a user is. Is a user consistently defined by their username, or is a user a username AND project_name?
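The choice of definition changes the headline number. A small sketch, using made-up (username, project_name) pairs, shows how the same activity yields different user counts under each definition:

```python
# Hypothetical (username, project_name) pairs seen in telemetry events.
records = [
    ("alice", "project_a"),
    ("alice", "project_b"),
    ("bob", "project_a"),
    ("bob", "project_a"),  # repeated activity from the same user and project
]


def count_users(pairs, include_project: bool) -> int:
    """Count distinct users, keyed by username alone or by (username, project_name)."""
    if include_project:
        return len(set(pairs))
    return len({username for username, _ in pairs})


users_by_username = count_users(records, include_project=False)
users_by_username_and_project = count_users(records, include_project=True)
project_count = len({project for _, project in records})

print(users_by_username, users_by_username_and_project, project_count)
```

With these sample pairs, keying on username alone gives two users, while keying on username AND project_name gives three, because one person working on two projects is counted twice.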

@merelcht
Member

This was completed in #50
