Add cache support #35
    if err != nil {
        respondWithErrorPage(w, err)
        return
    }

    // If cache invalidation headers were set from cache, and the response is 304, we can return
    // the cached page
    if cacheInvalidationHeadersSetFromCache && fetchedWeb3Url.HttpCode == 304 {
Do we still want to return it from the cache if the request header has an If-None-Match field and cacheInvalidationHeadersSetFromCache == false?
If the request has an If-None-Match header (and so cacheInvalidationHeadersSetFromCache == false), then we want to return the unmodified 304: in this scenario, the web client has already loaded the page and stored it in its local browser cache, so there is no need to send the body again from the web3url-gateway cache.
So, let's say:
- Client A requests path P, which returns an ETag. web3url-gateway caches it, and the browser of client A caches it too.
- Client A requests path P again, with If-None-Match. Because If-None-Match is present, web3url-gateway knows that client A has a local copy of the page, so we forward the call to web3protocol-go, and if the result is 304, we forward it as is. --> Here, web3url-gateway acts as a transparent proxy.
- Client B requests path P. Because it doesn't have the page in its browser cache, it doesn't send an If-None-Match. web3url-gateway sees this, so it acts as the cache and injects an If-None-Match before forwarding the call to web3protocol-go. If web3url-gateway sees that the result is 304, then because we manually injected If-None-Match, we know client B doesn't have the page in its browser cache, so we replace the 304 response with the page from the web3url-gateway cache. --> Here, web3url-gateway acts as a proxy injecting a cache layer.
Final note: with the implementation of ERC-7774, calls to web3protocol-go don't trigger RPC calls (for websites that implement ERC-7774 and have not been updated), so there is no need to try to avoid calling web3protocol-go.
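A minimal sketch of the client A / client B scenarios above, written as two small decision functions. The types cachedPage and upstreamResponse and the function names are hypothetical and only stand in for the gateway's real structures; returning the cached page with code 200 is also an assumption.

```go
package main

import "fmt"

// cachedPage is a hypothetical cache entry: the stored body and its ETag.
type cachedPage struct {
	ETag string
	Body string
}

// upstreamResponse is a hypothetical stand-in for what web3protocol-go returns.
type upstreamResponse struct {
	Code int
	Body string
}

// prepareRequest picks the If-None-Match value to send upstream and reports
// whether the gateway injected it (true) or the browser supplied it (false).
func prepareRequest(clientIfNoneMatch string, entry *cachedPage) (ifNoneMatch string, injected bool) {
	if clientIfNoneMatch != "" {
		// The browser holds its own copy: act as a transparent proxy.
		return clientIfNoneMatch, false
	}
	if entry != nil {
		// The browser has no copy but the gateway does: inject the cached ETag.
		return entry.ETag, true
	}
	return "", false
}

// resolveResponse replaces a 304 with the cached page only when the gateway
// injected If-None-Match itself; otherwise the response is forwarded as is.
func resolveResponse(resp upstreamResponse, injected bool, entry *cachedPage) upstreamResponse {
	if injected && resp.Code == 304 && entry != nil {
		// Assumption: the cached page is served back with a 200 code.
		return upstreamResponse{Code: 200, Body: entry.Body}
	}
	return resp
}

func main() {
	entry := &cachedPage{ETag: `"v1"`, Body: "<html>cached</html>"}

	// Client A already holds the page and sends If-None-Match itself.
	_, injectedA := prepareRequest(`"v1"`, entry)
	a := resolveResponse(upstreamResponse{Code: 304}, injectedA, entry)
	fmt.Println("client A:", a.Code)

	// Client B has nothing cached, so the gateway injects the header.
	_, injectedB := prepareRequest("", entry)
	b := resolveResponse(upstreamResponse{Code: 304}, injectedB, entry)
	fmt.Println("client B:", b.Code, b.Body)
}
```

The injected flag plays the role of cacheInvalidationHeadersSetFromCache in the reviewed code: it is the only thing that distinguishes a 304 meant for the browser from a 304 meant for the gateway.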
So the case of "the request header has If-None-Match" is handled automatically by the HTTP server, so we do nothing about it, right?
Yes; in this scenario, the browser client has its own browser cache, so we only act as a transparent proxy and let web3protocol-go and the browser client talk to each other.
We can see web3url-gateway as a proxy which tries to be helpful: if it sees that the browser client does not have a cached version but web3url-gateway does, it tries to make use of its own cached version. It acts as a "man-in-the-middle" by injecting a header into the request, and sends its cached version back to the browser client if web3protocol-go answers 304 (no changes).
        log.WithFields(logFields).Infof("Added page cache entry for %s", web3Url)
    // If we got a HTTP 200 code, we don't cache the page, there was previously a cache entry,
    // and the cache entry was of type PageCacheEntryTypeHttpCaching, we remove it from the cache
    } else if fetchedWeb3Url.HttpCode == 200 && cacheEntryPresent && cacheEntry.Type == PageCacheEntryTypeHttpCaching {
What does it mean when fetchedWeb3Url.HttpCode == 200 without an ETag? Do you have an example of it? If cacheEntryPresent, should we replace the entry instead of removing it?
If the website does not return an ETag, it means it does not want the page to be cached (with the ETag mechanism).
So the scenario would be:
- Browser A requests /path, web3protocol-go returns a body with an ETag, and web3url-gateway caches it.
- Browser A requests /path again, with If-None-Match. web3url-gateway forwards the call to web3protocol-go, which returns 304. The 304 is sent back to the browser.
- The website is modified by its author.
- Browser A requests /path again, with If-None-Match. web3url-gateway forwards the call to web3protocol-go, which returns 200 with a new ETag. In this case, we enter the branch if willCacheResponseAsType != "" { at line 316, and web3url-gateway updates its cache. It then forwards the response to the browser.
- The website is modified again by its author.
- Browser A requests /path again, with If-None-Match. web3url-gateway forwards the call to web3protocol-go, and this time the website decides to return a 200 without an ETag (because the website has decided it does not want this updated version to be cached, or because it cannot safely generate a unique ETag). In this case, we enter the branch at line 347, } else if fetchedWeb3Url.HttpCode == 200 && cacheEntryPresent && cacheEntry.Type == PageCacheEntryTypeHttpCaching {, and we want to delete the cache entry, because the website just told us it no longer wants the page to be cached.
Now, thinking more about it, there is one thing I should change: I should clear the cache entry not only when the HTTP code is 200, but also for any 2xx, 4xx or 5xx HTTP code (maybe 3xx too, I need to research this a bit).
Example: in our last scenario, the author could have modified the website to unpublish a page, so /path now returns 404 (and 404 responses without an ETag are likely to be common). So here we can see that we need to clear the web3url-gateway cache entry.
I will change the HTTP code check a bit later.
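The proposed change could be sketched as a single decision function. This is only an illustration of the rule discussed above, not the PR's code: cacheAction, decideCacheAction and their parameters are hypothetical names, entryPresent is assumed to mean "an entry of type PageCacheEntryTypeHttpCaching exists", and only 200 responses with an ETag are assumed to be stored.

```go
package main

import "fmt"

// cacheAction is a hypothetical enum describing what to do with the cache entry.
type cacheAction int

const (
	noChange cacheAction = iota
	storeEntry
	removeEntry
)

func (a cacheAction) String() string {
	return [...]string{"noChange", "storeEntry", "removeEntry"}[a]
}

// decideCacheAction maps an upstream response to a cache action.
func decideCacheAction(httpCode int, etag string, entryPresent bool) cacheAction {
	switch {
	case httpCode == 200 && etag != "":
		// Fresh content with an ETag: store or replace the entry.
		return storeEntry
	case httpCode >= 300 && httpCode < 400:
		// 304 means the entry is still valid. Other 3xx codes are left
		// alone here because the discussion leaves them undecided.
		return noChange
	case entryPresent && httpCode >= 200:
		// Any other 2xx, 4xx or 5xx response: the website no longer offers
		// a cacheable version of the page, so the stale entry is dropped.
		return removeEntry
	default:
		return noChange
	}
}

func main() {
	fmt.Println(decideCacheAction(304, "", true))     // entry still valid
	fmt.Println(decideCacheAction(200, `"v2"`, true)) // new version with an ETag
	fmt.Println(decideCacheAction(200, "", true))     // website stopped sending an ETag
	fmt.Println(decideCacheAction(404, "", true))     // page was unpublished
}
```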
Hi!
This PR adds support for two types of caching.
The aim of caching is to reduce RPC calls to the RPC providers.
The first type of caching is simple: it is a config entry, pageCache.immutableUrlRegexps, in which we declare a list of regexps matching URLs we know are immutable. The first time such a page is loaded from RPC, the result is saved in the cache, and it is served from the cache for later calls.

The second type of caching is a partial implementation of standard proxy HTTP caching. If we picture web3url-gateway as a proxy, and the web3protocol-go library as the remote server being proxied, then we implement standard HTTP caching based on ETag.
The mechanism is basically:
- If the request does not contain an If-None-Match cache invalidation header, and we have the page in cache, then we inject an If-None-Match: <ETag stored in cache> header, and we mark that it was manually injected.
- If the response is 304 and we manually injected If-None-Match: we return the cached response and we stop here.

So this is basically standard partial HTTP caching. The cache is an LRU cache that can be configured (max number of entries, max size of entries, TTL).
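To illustrate what an LRU cache with a maximum number of entries and a TTL does, here is a minimal sketch built on the standard container/list package. It is not the cache used in this PR: lruCache, fakeClock and their methods are hypothetical names, values are plain strings, and the "max size of entries" limit is left out for brevity.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

// lruItem is one cached value together with its expiry time.
type lruItem struct {
	key       string
	value     string
	expiresAt time.Time
}

// lruCache evicts the least recently used entry once maxEntries is exceeded
// and treats entries older than ttl as missing.
type lruCache struct {
	maxEntries int
	ttl        time.Duration
	now        func() time.Time // injectable clock, e.g. time.Now
	order      *list.List       // front = most recently used
	items      map[string]*list.Element
}

func newLRUCache(maxEntries int, ttl time.Duration, now func() time.Time) *lruCache {
	return &lruCache{maxEntries: maxEntries, ttl: ttl, now: now,
		order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value for key and marks it as most recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	item := el.Value.(*lruItem)
	if !c.now().Before(item.expiresAt) {
		c.removeElement(el) // expired: drop it and report a miss
		return "", false
	}
	c.order.MoveToFront(el)
	return item.value, true
}

// Set stores or refreshes key, evicting the oldest entry if the cache is full.
func (c *lruCache) Set(key, value string) {
	expiresAt := c.now().Add(c.ttl)
	if el, ok := c.items[key]; ok {
		item := el.Value.(*lruItem)
		item.value, item.expiresAt = value, expiresAt
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&lruItem{key: key, value: value, expiresAt: expiresAt})
	if c.order.Len() > c.maxEntries {
		c.removeElement(c.order.Back())
	}
}

func (c *lruCache) removeElement(el *list.Element) {
	c.order.Remove(el)
	delete(c.items, el.Value.(*lruItem).key)
}

// fakeClock is a manually advanced clock that keeps the example deterministic.
type fakeClock struct{ current time.Time }

func (f *fakeClock) Now() time.Time          { return f.current }
func (f *fakeClock) Advance(d time.Duration) { f.current = f.current.Add(d) }

func main() {
	clock := &fakeClock{}
	cache := newLRUCache(2, time.Minute, clock.Now)
	cache.Set("/a", "A")
	cache.Set("/b", "B")
	cache.Get("/a")      // "/a" becomes the most recently used entry
	cache.Set("/c", "C") // the cache is full, so "/b" is evicted
	_, okB := cache.Get("/b")
	clock.Advance(2 * time.Minute) // every remaining entry is now past its TTL
	_, okA := cache.Get("/a")
	fmt.Println("b present:", okB, "a present:", okA)
}
```

Passing the clock as a function keeps the TTL logic testable without sleeping; real code would simply pass time.Now.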
The more interesting part is in an update of web3protocol-go, which implements ERC-7774 (ethereum/ERCs#652 -- a bit of work still needed). ERC-7774 allows resource request mode websites to emit cache invalidation events. That way, web3protocol-go listens for these events and can answer with HTTP code 200 or 304 (Not Modified). The most important part: as long as the content is not modified, a 304 (Not Modified) response does not make any RPC call.
Final conclusion: for a homepage I was working on, loading it made 16 eth_call RPC calls.
After I implemented ERC-7774 in web3protocol-go and the HTTP cache in web3url-gateway, this dropped to 2 eth_call RPC calls. After I added the immutable URL caching system in web3url-gateway (in which I cache some auto mode URLs), it is now down to 0 eth_call RPC calls!
So now web3url-gateway can handle heavy traffic on a web3:// website implementing ERC-7774.
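For reference, the immutable URL check behind pageCache.immutableUrlRegexps could look roughly like the sketch below. The helper names compileImmutablePatterns and isImmutable, as well as the example pattern and URLs, are hypothetical; only the standard regexp package is used.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileImmutablePatterns compiles the configured patterns once at startup,
// so that a bad regexp is reported early instead of on the first request.
func compileImmutablePatterns(patterns []string) ([]*regexp.Regexp, error) {
	compiled := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid immutable URL pattern %q: %w", p, err)
		}
		compiled = append(compiled, re)
	}
	return compiled, nil
}

// isImmutable reports whether url matches at least one configured pattern.
func isImmutable(url string, patterns []*regexp.Regexp) bool {
	for _, re := range patterns {
		if re.MatchString(url) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical config value: everything under /static/ never changes.
	patterns, err := compileImmutablePatterns([]string{`^web3://example\.eth/static/`})
	if err != nil {
		panic(err)
	}
	for _, url := range []string{
		"web3://example.eth/static/logo.png",
		"web3://example.eth/index.html",
	} {
		fmt.Println(url, "immutable:", isImmutable(url, patterns))
	}
}
```

A URL that matches can be stored on first load and served from the cache on every later request, without any call to web3protocol-go.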