Description
Describe the user story
As a maintainer, it's sometimes difficult to know what we should prioritize. Are large monorepos the most common situation our users encounter? Which packageExtensions are the most common? How many people have opted out and switched to the nm linker? Etc.
Because of the lack of analytics, some projects also have trouble taking us seriously. A recent thread about the Node Docker image suggested removing Yarn from it, citing Yarn as a fringe tool. I don't have time to spend collecting the various polls scattered all over the web.
Describe the solution you'd like
I propose we implement opt-out telemetry.
Homebrew is a macOS package manager with some level of analytics (they actually log more than what I have in mind for us: https://docs.brew.sh/Analytics).
- Users would be anonymous. We wouldn't implement "client IDs".
- Data would be stored on a third-party service we don't own. In our case, something like Google Analytics would be perfect. On this point, I've looked into Google Analytics a bit and I'm not sure it's an option: the dashboards are very bare, and it doesn't seem to have good support for arrays, which would be necessary to support plugin and command names, unless we split the data across dozens of calls. Perhaps Datadog would be a better fit after all.
- Events would be aggregated and sent weekly. We wouldn't be able to track anything at a finer granularity. As a result, telemetry wouldn't have any effect on CI.
- Information about telemetry would be displayed on first install, together with a link explaining it in more detail. Documentation would include a new page describing it.
- A new `yarn analytics off` command would disable it for all projects on the machine (`on` would re-enable it). Running `yarn analytics show` would print the information that would be sent.
- The payload would be sent only during installs (not during `run` or anything else), in parallel with the regular install workflow (so it shouldn't have any significant overhead). Connectivity failures would be ignored and would not cause installs to fail.
The information I propose we would track:
- The Yarn version
- Which command name is used (but not its arguments)
- The active plugin names (only for our own plugins)
- The number of installs run during the week
- The number of different projects having been installed
- How many installs for the nm linker
- The number of workspaces
- The number of dependencies
- The `packageExtensions` field (name of the extended package + name of the extra dependency)
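To make the proposal a bit more concrete, here is a rough TypeScript sketch of how the weekly aggregate and the "never fail the install" upload could fit together. It is only an illustration under assumptions, not a proposed implementation: `InstallEvent`, `WeeklyPayload`, `aggregateWeek`, `isUploadDue`, `installWithUpload`, and `projectKey` are all hypothetical names, and recognizing our own plugins by the `@yarnpkg/plugin-` prefix is an assumption too. The transport is passed in as a callback, since the provider is still undecided.

```typescript
// Illustrative sketch only: every name below is hypothetical, not an existing Yarn API.

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// One install, recorded locally. Only the command name is kept, never its arguments.
interface InstallEvent {
  projectKey: string; // local-only key used to count distinct projects; never sent
  command: string;
  plugins: string[];
  usesNmLinker: boolean;
  workspaceCount: number;
  dependencyCount: number;
  packageExtensions: {extended: string; extraDependency: string}[];
}

// The weekly aggregate. It carries no client ID, path, or command argument.
interface WeeklyPayload {
  yarnVersion: string;
  commands: Record<string, number>;
  plugins: string[];
  installCount: number;
  projectCount: number;
  nmLinkerInstallCount: number;
  workspaceCounts: number[]; // one entry per distinct project
  dependencyCounts: number[]; // one entry per distinct project
  packageExtensions: string[];
}

// Assumption: first-party plugins can be recognized by this name prefix.
const OFFICIAL_PLUGIN_PREFIX = "@yarnpkg/plugin-";

function aggregateWeek(yarnVersion: string, events: InstallEvent[]): WeeklyPayload {
  const commands: Record<string, number> = {};
  const plugins = new Set<string>();
  const extensions = new Set<string>();
  const projects = new Map<string, {workspaces: number; dependencies: number}>();
  let nmLinkerInstallCount = 0;

  for (const event of events) {
    commands[event.command] = (commands[event.command] || 0) + 1;
    if (event.usesNmLinker) nmLinkerInstallCount += 1;
    // Keep the most recent counts seen for each project.
    projects.set(event.projectKey, {workspaces: event.workspaceCount, dependencies: event.dependencyCount});
    for (const name of event.plugins) {
      // Third-party plugin names are dropped.
      if (name.startsWith(OFFICIAL_PLUGIN_PREFIX)) plugins.add(name);
    }
    for (const ext of event.packageExtensions) {
      extensions.add(`${ext.extended} -> ${ext.extraDependency}`);
    }
  }

  const perProject = Array.from(projects.values());
  return {
    yarnVersion,
    commands,
    plugins: Array.from(plugins).sort(),
    installCount: events.length,
    projectCount: projects.size,
    nmLinkerInstallCount,
    workspaceCounts: perProject.map(p => p.workspaces),
    dependencyCounts: perProject.map(p => p.dependencies),
    packageExtensions: Array.from(extensions).sort(),
  };
}

// An upload is due only if telemetry is enabled, something was recorded,
// and at least a week has passed since the previous upload.
function isUploadDue(enabled: boolean, lastUploadMs: number, nowMs: number, eventCount: number): boolean {
  return enabled && eventCount > 0 && nowMs - lastUploadMs >= WEEK_MS;
}

// Runs the upload alongside the install. A failed upload resolves to `false`
// instead of rejecting, so it can never cause the install to fail.
async function installWithUpload<T>(
  install: () => Promise<T>,
  upload: () => Promise<void>,
): Promise<{result: T; uploaded: boolean}> {
  const attempt = Promise.resolve().then(upload).then(() => true, () => false);
  const [result, uploaded] = await Promise.all([install(), attempt]);
  return {result, uploaded};
}
```

Printing the object returned by `aggregateWeek` is roughly what `yarn analytics show` would display. In this sketch a failed upload simply leaves the buffered events in place, so they would be retried on a later install; that retry behavior is my assumption and is open to discussion.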
Describe the drawbacks of your solution
Telemetry is viewed with an understandable amount of caution. It doesn't help that the project was once associated with Facebook, so it will be important to remind users that we no longer have any particular link with it. Using a third-party provider (such as Google Analytics) would also be a good way to guarantee that we don't collect unlisted data (such as IP addresses).
Describe alternatives you've considered
We could do without telemetry. Unfortunately, I think the lack of consideration we get from some entities is caused at least in part by the lack of metrics we can show them (e.g. "helping us will have an impact on X thousand developers"). Not having those tools requires us to put more work into convincing them, which is exhausting.