-
Notifications
You must be signed in to change notification settings - Fork 34
Bug 1746375 - Dynamically add metric constructors to metric map #1065
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7b5d7a2 to
5f0bced
Compare
Build size report
|
badboy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the final bundle actually get built? Is this under control of the user embedding Glean?
Will this be determined by the metric types used in metrics.yaml?
It might be worth listing out when we lose what data.
E.g. something like
Version A: initial. metric types [string, bool, datetime]. Records all 3. No ping sent.
Version A+1: removed datetime. metric types [string, bool]. Sends ping, contains string and bool, old datetime is deleted from the database.
(Am I correct with that example?)
My first reaction is: I think this is fine then. Version A+1 wouldn't record a new value. In the current model we would get a value from a previous run, but no new ones anyway.
Losing that one-time stale value doesn't seem to dramatic.
So in the benchmarks example we are using webpack. We are not in control of how the user builds it, but the most common way to use Glean would be to use a bundler such as webpack or rollup. We also don't provide a pre-built library, so I would say that might be the only way to use it.
Exactly. The glean_parser generated template imports the metric types defined in the metrics.yaml and instantiates the metric. With this change, unless a given metric type's code is imported, none of it is added to the final bundle. Which was not the case before, because we were always importing at least the Metric part used for database deserialization.
Perfect, this is right.
Yeah I tend to agree with this, but I am working on assumptions here. I was wondering if you have seen such a pattern in other places. |
Actually in this case it would still report the datetime, because that metric is used internally by Glean and thus always added to the metric map. That is what I think is confusing about all this. It will only really apply to metrics not used internally by Glean. It also doesn't apply to Events, because they are not recorded on the metrics database. I guess docs help here though. |
ehehehe, pick a metric type that's not used by Glean then.
Bear with me, I really haven't used JS tooling in a long time. Trying to understand how a JS app is shipped.
Is my understanding correct?
Not really. glean-core always has all metrics (and we can't ever remove metric types really because our database serialization depends on it). |
No bearing required, your understanding is perfect. Yes, that is the crucial part of this PR.
Because it is just so damn easy for people to mess with the database data in the browser I think I'll stick to deleting the wrongly formatted data for now. There is a bug to think about this behaviour and it's problems more deeply in the future. (Fun fact: this is the oldest bug in Glean.js' backlog!). It might be worth instrumenting it in the future though. I'll keep that in mind.
In this PR I essentially add a possible code path for users to write data the database can't later decode. But from what I understand we are alright with that. Anyways, thanks a lot for the thorough discussion. I'll clean this up and promote it to a non-draft PR. I'll make sure to add some docs here or in a later PR capturing this discussion. |
5f0bced to
4278377
Compare
yeah, I think the fact that glean core is not deleting data is a bug.
Agreed. Docs can help here, but I also think it's not even that surprising to users: you delete your metrics -> data's gone. |
This is nice because it means metrics will work exactly like plugins in the sense that they are only added to the bundle if they are used. I am not entirely sure about this change, because it means that if a given metric is in the database, but was not added to the metric map (think if a metric is removed from the code, but has not been collected in a ping once the new code lands) it will be read as invalid by Glean.
Previous behaviour was to delete the whole ping when that happened. This is a nice change either way, we should always delete as little data as possible. However this becomes even more necessary if we decide to go ahead with the changes in the previous commit.
Co-authored-by: Jan-Erik Rediger <badboy@archlinux.us>
8ab83d0 to
9ba21fc
Compare
MUST REBASE DEPENDABOT PRS ALWAYS
Co-authored-by: Jan-Erik Rediger <badboy@archlinux.us>
Sending this in as a draft because I would like to get some feedback on the approach before moving on. More details on the commit messages, but this is a part of attempting to make the Glean.js bundle size smaller.
The changes do not make as a drastic a dent in the size of the bundle as #1051, however these changes are guards against making the bundle much larger in the future. As we add more metric types the bundle wouldn't get bigger.
My main concern is outlined in aaa1ce0.
This could cause some surprising behaviour for users. My question is: are the benefits of a smaller bundle worth this kinda weird, but hopefully tiny cornercase?
I have also filed a bug so we can instrument whenever we find invalid data in the database so we can know if this is really tiny or not.
Note that 7b5d7a2 is an attempt to make the cornercase not as big.
Pull Request checklist
glean/folder, run:npm run testRuns all testsnpm run lintRuns all lintersCHANGELOG.mdor an explanation of why it does not need one