Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make cookie writes async by default to improve tracker performance #1340

Merged
merged 14 commits into from
Oct 3, 2024

Conversation

matus-tomlein
Copy link
Contributor

@matus-tomlein matus-tomlein commented Sep 3, 2024

This PR makes cookie writes async by default on the browser tracker in order to improve performance of the tracker call (to block the app for a shorter period of time when an event is tracked).

It adds a wrapper around the calls to read and write cookies that caches the last written value and uses setTimeout with 10ms to schedule the cookie write (it also debounces if there are multiple cookie writes in short sequences).

There is also a new option to keep the previous behaviour and write cookies synchronously with the track calls – the synchronousCookieWrite option. This may be useful in tests and it also has the benefit of checking that the cookies were correctly set.

I added an e2e test that compares the performance of the different storage strategies and the sync option, results below.

image Numbers 2024-09-03 13 53 01

@matus-tomlein matus-tomlein changed the base branch from master to release/4.0.0 September 3, 2024 12:12
Copy link

bundlemon bot commented Sep 3, 2024

BundleMon

Files added (6)
Status Path Size Limits
trackers/javascript-tracker/dist/sp.js
+24.24KB 30KB / +10%
libraries/browser-tracker-core/dist/index.mod
ule.js
+23.41KB 25KB / +10%
libraries/tracker-core/dist/index.module.js
+19.12KB 20KB / +10%
trackers/browser-tracker/dist/index.umd.min.j
s
+17.24KB 20KB / +10%
trackers/javascript-tracker/dist/sp.lite.js
+17.18KB 20KB / +10%
trackers/browser-tracker/dist/index.module.js
+3.49KB 5KB / +10%

Total files change +104.68KB 0%

Final result: ✅

View report in BundleMon website ➡️


Current branch size history

Comment on lines 83 to 90
setCookieTimer = setTimeout(() => {
setCookieTimer = undefined;
cookie(name, value, ttl, path, domain, samesite, secure);

if (ttl !== undefined) {
scheduleClearCache(ttl);
}
}, debounceTimeout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just thinking... Why aren't we using the requestAnimationFrame and setTimeout trick here?

requestAnimationFrame(() => {
    setTimeout(() => {
       cookie()...
    }, 0);
  });

Since the time is random it might again hit a blocking point when rendering would be critical and INP metrics might still be an issue. requestAnimationFrame is a bit safer on that aspect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the race risk is already pretty high with this change, adding indeterminate amounts of time before updating the value seems dangerous to values like eventIndex.

Copy link
Contributor

@jethron jethron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work ok with things like link/form tracking that navigate away from the page? I feel like a whole new task to update the cookie might break some of the data consistency we have in place in those cases, particularly stuff in client_session.

Comment on lines 83 to 90
setCookieTimer = setTimeout(() => {
setCookieTimer = undefined;
cookie(name, value, ttl, path, domain, samesite, secure);

if (ttl !== undefined) {
scheduleClearCache(ttl);
}
}, debounceTimeout);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the race risk is already pretty high with this change, adding indeterminate amounts of time before updating the value seems dangerous to values like eventIndex.

@matus-tomlein
Copy link
Contributor Author

Thanks both for the reviews!

@jethron I have added a flushPendingCookies() function to the tracker that forces all cookies to be written synchronously. I have added a call for this function in the link click tracking plugin and in the form tracking for form submit events. It's not a great solution (don't really like to add complexity and configuration options), but I think it can help if we identify the calls where cookies need to be written synchronously and force flush the cookies there.

It's a trade-off between performance and correctness that we have to make. In my opinion, the performance gains make it worth it in this case. Also I don't see another way to speed up the cookie writes without making them async, but happy to make changes if you have some other idea that would be safer!

@jethron
Copy link
Contributor

jethron commented Oct 1, 2024

Sorry for the delay on this @matus-tomlein!

I did some testing and did find a race, but then when I compared it to v3 the behaviour there had a similar problem anyway, so I guess it doesn't really count as a regression -- it's just a long-standing bug. So I don't think that should block this.

The only other difference I really noted was that event "bursts" no longer automatically batch; the performance improvement kind of means every event gets sent individually now where in some cases in v3 because of the contention on the cookies those events would all be bundled together in a single payload, even if your batch size was 1 (the default). This isn't a big deal IMO, but good to note that the behaviour will be different now. This is probably due to fetch changes more than anything, actually.

e.g. in below test v4 generates 20-30 requests but v3 generates more like 10-15 for similar numbers of events.


To test I had the same page open in two on-screen browser windows (important, just different tabs would throttle intervals) and ran the following snippet at roughly the same time (it tries to give some leeway with an initial timeout) in each via browser console:

let start = Date.now();
setTimeout(function(){
  let i = setInterval(function(){
    if ((Date.now() - start) > 3000) clearInterval(i);
    spActivity('trackPageView')}, 100)
}, (+new Date()) % 1000);

(spActivity because I was using trackers/javascript-tracker/test/pages/activity-callback.html - I had to modify it to enable the session context - but any page should work as well with normal snowplow).

Most events are OK, but when there's a race like this enforces, eventually you get collisions, where between the two pages events with different event IDs/payloads will generate with the same eventIndex value (which is extracted from the cookie) and this then means all the following events are technically wrong as well.

v3: eventIndex 933 generated twice
image
image

v4: eventIndex 975 generated twice
image
image

Copy link
Contributor

@jethron jethron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Races in generated data, but similar problems already exist in v3 so not a dealbreak.

Slight changes to batch behaviour but acceptable for the performance improvement IMO.

Otherwise LGTM, guess we'll find out if there's anything else we missed. :)

@jethron
Copy link
Contributor

jethron commented Oct 1, 2024

FML, that was testing release/4.0.0, not your actual branch. This branch does actually seem a bit worse, the above seems to generate duplicate events way more often, even without much of a race involved. :/

@matus-tomlein matus-tomlein force-pushed the issue/cookie_optimization branch from d4ecc6a to ae249a8 Compare October 1, 2024 11:10
@matus-tomlein
Copy link
Contributor Author

Thanks @jethron for testing that! That's a good test case since it's quite possible that users will open the same website in multiple windows.

I think the problem is not so much in the async cookie writes but in the fact that we now cache the cookie values. Once a cookie value is cached, we don't check for the actual cookie value which may be modified by another window.

I have now made a change to add a maximum lifetime of the cached cookie value to 100 ms. I have run the same test as you have and while I still get conflicts on the eventIndex, the events now can recover after a conflict which was not the case before – before once the caches got out of sync, they would never get back in sync.

We can lower the max cache lifetime further, hard to say what should be the setting... Happy to go with a different value if you prefer.

@matus-tomlein matus-tomlein merged commit baec2a9 into release/4.0.0 Oct 3, 2024
3 checks passed
@matus-tomlein matus-tomlein deleted the issue/cookie_optimization branch October 3, 2024 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants