Image endpoint: Avoid checking updates per request #1151

@jvwong

A reasonable estimate is that 99.9% of submitted pathways / documents are never updated by the author after the fact. The rare exceptions (a handful) are documents that we update during post-curation or that the author edits post-submission.

This suggests that we can optimize for the case where the network image stays the same after submission. Importantly, checking the time of last update for each document on every image fetch is unnecessary and adds a performance burden to each webpage load (home, search).


Originally posted by @jvwong in #1143 (comment)

I'm finding the getDoc call to be a major offender following an initial load:

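const main = async () => {
  try {
    // Loads the full document on every image request, even on a cache hit
    const doc = await getDoc(id);
    const lastEditedDate = '' + doc.lastEditedDate();
    const cache = imageCache.get(id);
    const canUseCache = imageCache.has(id) && cache.lastEditedDate === lastEditedDate;

    if( canUseCache ){
      res.send(cache.img);
    } else {
      // Missing or stale entry: regenerate the image and refresh the cache
      const newCache = await fillCache(doc, lastEditedDate);
      res.send(newCache.img);
    }
  } catch( err ){
    // error handling is not shown in this excerpt
  }
};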

An alternative way to handle imageCache staleness:

  • Fetch the JSON results from /api/document/ (TTL is 24 hours)
  • Find the matching document and its lastEditedDate
  • Hit the cache or regenerate as usual

The downside is that authors may not see updates to their submitted docs for up to 24 hours. The p-limit code is still valid on initial load; the rest of the time, the entire search / browse page loads almost instantly.
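
The alternative above could look roughly like the following. This is only a sketch under stated assumptions, not the project's actual code: `createImageLookup`, `fetchDocList`, `generateImage`, `ttlMs` and `now` are hypothetical names, and the listing is assumed to be an array of `{ id, lastEditedDate }` objects. The point it illustrates is that `lastEditedDate` comes from a listing refreshed at most once per TTL window, so no per-request `getDoc` call is needed.

```javascript
// Sketch only: all names here are hypothetical. The ideas taken from the issue
// are the 24 hour TTL on the document listing and the lastEditedDate comparison.
const DAY_MS = 24 * 60 * 60 * 1000;

const createImageLookup = ({ fetchDocList, generateImage, ttlMs = DAY_MS, now = Date.now }) => {
  const imageCache = new Map(); // id -> { img, lastEditedDate }
  let listing = null;           // { fetchedAt, datesById }

  // Refresh the id -> lastEditedDate index at most once per TTL window.
  const getDatesById = async () => {
    if( listing == null || now() - listing.fetchedAt >= ttlMs ){
      const docs = await fetchDocList(); // assumed shape: [{ id, lastEditedDate }, ...]
      const datesById = new Map(docs.map(d => [d.id, '' + d.lastEditedDate]));
      listing = { fetchedAt: now(), datesById };
    }
    return listing.datesById;
  };

  // Resolve the image for a document id without loading the document itself.
  return async id => {
    const datesById = await getDatesById();

    if( !datesById.has(id) ){
      return null; // not in the listing; the caller decides how to fall back
    }

    const lastEditedDate = datesById.get(id);
    const cached = imageCache.get(id);

    if( cached != null && cached.lastEditedDate === lastEditedDate ){
      return cached.img; // cache hit
    }

    // Missing or stale entry: regenerate and store alongside the date used.
    const img = await generateImage(id);
    imageCache.set(id, { img, lastEditedDate });
    return img;
  };
};
```

In a route handler this would presumably be used along the lines of `res.send(await lookup(id))`, with `fetchDocList` wrapping a request to /api/document/. A document submitted after the last listing refresh would be absent from the index until the next refresh, so a real implementation would need some fallback for the `null` case.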
