Prebid Cache Cross Data Center Lookup #1620

Open
SyntaxNode opened this issue Dec 9, 2020 · 8 comments
Labels: Intent to implement (an issue describing a plan for a major feature; intended for community feedback)

@SyntaxNode
Contributor

This is a follow-up to #1562, focused on the situation where a host has multiple Prebid Cache data centers which do not sync with each other and the end user's PUT and GET requests are routed to different data centers.

Summary

Prebid Cache provides hosts with the ability to configure a variety of different backend storage systems. These storage systems may run in an isolated state or sync with each other. Because most data is retrieved shortly after being written and the chance of a cross data center lookup is low, many hosts, including Xandr and Magnite, do not sync their data center caches. As @bretg mentioned, it would be impossible (or at least prohibitively expensive) to try to replicate caches of this size globally within milliseconds.

This setup has been in place for many years and we have not seen evidence of widespread issues, but a number of community reports indicate otherwise. I'd like to begin our investigation by measuring the rate of occurrence to determine whether we need to build a solution.

Proposal

Add a new feature to Prebid Cache to determine whether a GET request is for a PUT request handled by a different data center. I see two options:

  1. Accept a new query parameter for the GET request which is set to the hb_cache_host targeting key via macro resolution. I believe this would be the cleanest solution, but I recognize it requires action to be taken by publishers. I'm hopeful that publishers who suspect this is an issue would be willing to assist in collecting metrics.

  2. Encode the data center into the already automatically generated cache id (a rough sketch follows below). Some Prebid Cache calls provide their own cache keys, which obviously wouldn't work, but that use case is likely small enough that we could still collect enough metrics.
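For illustration only, here is a minimal sketch of what option 2 could look like in Go, assuming a short data-center code is simply prefixed onto the generated key. The function names, the `<dc>-<hex>` key format, and the allow-list of codes are hypothetical, not the actual Prebid Cache implementation:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// knownDataCenters is a hypothetical allow-list of data-center codes;
// a real deployment would take this from host configuration.
var knownDataCenters = map[string]bool{"use1": true, "usw2": true, "euw1": true}

// generateCacheKey prefixes the local data-center code onto a random id so a
// later GET can tell which data center handled the PUT.
func generateCacheKey(dc string) string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return dc + "-" + hex.EncodeToString(b)
}

// dataCenterOf reports which data center wrote the key, if the key carries a
// recognized prefix. Caller-supplied custom keys simply return ok == false,
// which is the small use case mentioned above.
func dataCenterOf(key string) (dc string, ok bool) {
	prefix, _, found := strings.Cut(key, "-")
	if !found || !knownDataCenters[prefix] {
		return "", false
	}
	return prefix, true
}

func main() {
	key := generateCacheKey("usw2")
	dc, ok := dataCenterOf(key)
	fmt.Printf("key=%s writtenBy=%s recognized=%t\n", key, dc, ok)
}
```

On a GET, comparing the extracted code against the local data center's code would yield the cross-data-center metric without any publisher action.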

Thoughts?

@bretg
Contributor

bretg commented Dec 11, 2020

Discussed in PBS committee

PBC does have a read-miss metric, but it doesn't distinguish between different reasons like timeout, bad UUID, or wrong datacenter. However, Magnite sees only about a 1% read-miss rate, so this doesn't appear to be a major problem.

We don't particularly like any of the available measurement solutions, so at this time we're proposing to adopt a wait-and-see approach. If the community has data that shows a more concrete problem, please post it to this issue.

@spormeon

If you hit the LB'er and go to datacentre 3 instead of 2, what metric is collected there? None? Publishers have no way to test this; they can't hit the server IP behind the LB'er. The only thing they're going to see is a percentage discrepancy between the impressions they thought they had versus what was recorded as "paid" in back-end systems, and they're just going to take it on the chin as a loss. Oh well, a 10%, 20%, 5% difference, what can I do?

@bretg
Contributor

bretg commented Dec 12, 2020

If you hit the LB'er and go to datacentre 3 instead of 2, what metric is collected there?

We would be seeing cache read misses on datacenter 3. We're not.

Are you actually seeing 20% discrepancy between Prebid line items delivered (bids won) and video impressions? If that's the case, then would you be willing to update your ad server creatives to add another parameter?

@bretg
Contributor

bretg commented Jan 8, 2021

We still don't have evidence that this is a problem, but I'll move the ball forward by proposing a relatively small feature based on SyntaxNode's first proposal above:

Accept a new query parameter for the GET request which is set to the hb_cache_host targeting key via macro resolution

  1. support a new "ch"(cache host) parameter on the /cache endpoint
    http://HOST_DOMAIN/cache?uuid=%%PATTERN:hb_uuid%%&ch=%%PATTERN:hb_cache_host%%

  2. the hb_cache_host is set by PBS to the actual direct host name of the cache server

             "hb_cache_host": "pg-prebid-server-aws-usw2.rubiconproject.com:443",
    
  3. when PBC receives a request with the 'ch' parameter, it's validated and processed (see the sketch after this list)

    a) if the hostname portion is the local host, then cool, end-of-line. Look up the uuid as normal.
    b) otherwise, verify that the named host is acceptable. We are not an open redirector. e.g. configure a regex in PBC that ensures that all ch values conform to *.hostdomain.com
    c) if the host is ok, proxy the request but remove the ch parameter. One hop only. No chains allowed. Add the other pieces of the URL as needed -- the "https" protocol, the URI path, and the uuid parameter.
    - when the response comes back, log a metric: pbc.proxy.success or pbc.proxy.failure
    - return the value to the client
    d) if the host did not match the regex, just ignore the ch parameter. Look up the uuid as normal.
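A minimal sketch of how a GET handler could implement steps a) through d), assuming the allow-list regex, the local host name, the listen port, and the metric logging all come from host configuration; none of this is the actual PBC code:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"regexp"
)

var (
	// localHost is this instance's own cache host name (assumed config value).
	localHost = "pbc-usw2.hostdomain.com:443"
	// allowedHost guards against acting as an open redirector: only *.hostdomain.com is proxied.
	allowedHost = regexp.MustCompile(`^[a-z0-9.-]+\.hostdomain\.com(:\d+)?$`)
)

func handleCacheGet(w http.ResponseWriter, r *http.Request) {
	uuid := r.URL.Query().Get("uuid")
	ch := r.URL.Query().Get("ch")

	// Steps a) and d): no ch, ch naming ourselves, or ch failing the regex
	// means we ignore it and look up the uuid as normal.
	if ch == "" || ch == localHost || !allowedHost.MatchString(ch) {
		lookupLocal(w, uuid)
		return
	}

	// Step c): proxy one hop only. The rebuilt URL carries the https scheme,
	// the /cache path, and the uuid -- but not ch, so no chains are possible.
	remote := url.URL{
		Scheme:   "https",
		Host:     ch,
		Path:     "/cache",
		RawQuery: url.Values{"uuid": {uuid}}.Encode(),
	}
	resp, err := http.Get(remote.String())
	if err != nil {
		log.Println("metric: pbc.proxy.failure") // placeholder for a real metrics client
		http.Error(w, "cache proxy failure", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	log.Println("metric: pbc.proxy.success") // placeholder for a real metrics client
	w.WriteHeader(resp.StatusCode)
	io.Copy(w, resp.Body) // return the remote value to the client
}

// lookupLocal stands in for the normal backend read; placeholder only.
func lookupLocal(w http.ResponseWriter, uuid string) {
	fmt.Fprintf(w, "local lookup for uuid=%s\n", uuid)
}

func main() {
	http.HandleFunc("/cache", handleCacheGet)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The comparison against the local host could of course be smarter (e.g. matching only the hostname portion, or a set of aliases), but the shape of the flow follows the steps above.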

@patmmccann

Fwiw, a 1% read miss rate seems like a rather substantial problem to me.

@bretg
Contributor

bretg commented Jun 4, 2021

Read misses can come from late or late-and-duplicate requests as well as the wrong datacenter.

Anyhow, appreciate the kick here -- this had dropped off our radar. I've put it back in the stack of tickets to get done this summer.

@stale

stale bot commented Jan 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 8, 2022
@SyntaxNode SyntaxNode added the Intent to implement label Jan 10, 2022
@stale stale bot removed the stale label Jan 10, 2022
@bretg
Contributor

bretg commented Dec 12, 2022

This was partly released with PBC-Java 1.13, but there's an outstanding bug where most requests get 'Did not observe any item or terminal signal' errors.

Projects
Status: Ready for Dev
Development

No branches or pull requests

4 participants