Add controls for max. number of images to generate in parallel #1143

maxkfranz · 2023-03-14T15:32:23Z

- Use `DOCUMENT_IMAGE_PLL_LIMIT` to set the number of images that can be generated at once.
- It should be as high as the hardware can support to keep things fast.
- It's set at 1 by default. This always works, but it's not as fast as it could/should be.
- Also add `DOCUMENT_IMAGE_CACHE_SIZE` to the readme. This was undocumented.

@jvwong, would you test this out? We'll also need to determine what number to use for DOCUMENT_IMAGE_PLL_LIMIT in the docker config for the production app.
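For readers who want to see what such a limit does, here is a minimal sketch of capping concurrent image generation with an environment-configured value. Only the variable name `DOCUMENT_IMAGE_PLL_LIMIT` and its default of 1 come from this PR. `createLimiter`, `generateImage`, and `getImage` are hypothetical names for illustration, and the hand-rolled limiter stands in for whatever the PR actually uses.

```javascript
// Hypothetical sketch: cap concurrent image generation with an env-configured limit.
// DOCUMENT_IMAGE_PLL_LIMIT is the variable named in this PR; all other names are assumed.
const parsed = parseInt(process.env.DOCUMENT_IMAGE_PLL_LIMIT, 10);
const IMAGE_PLL_LIMIT = Number.isInteger(parsed) && parsed > 0 ? parsed : 1; // default: 1

// Minimal limiter: at most `limit` tasks run at once; the rest wait in a FIFO queue.
function createLimiter(limit) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= limit || queue.length === 0) { return; }
    active++;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next queued task, if any
      });
  };

  return task => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject });
    next();
  });
}

const limitImage = createLimiter(IMAGE_PLL_LIMIT);

// Stand-in for the real (expensive) image generation step.
const generateImage = async id => `image-for-${id}`;

// Route every generation request through the limiter.
const getImage = id => limitImage(() => generateImage(id));
```

With the default of 1, requests are served strictly one at a time, which matches the "always works, but not as fast as it could be" behaviour described above. Raising the value trades memory and CPU headroom for throughput.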


jvwong · 2023-03-16T20:47:50Z

@maxkfranz
I've been playing around with this locally. I think we'll need to chat about how I should test the performance. Right now I'm not seeing a difference in either of these:

- Total load time for the Search page (i.e. browsable page after clicking Search)
- Load time for the Explorer page for an individual document while the Search page is loading

maxkfranz · 2023-03-17T14:50:50Z

Talk next Wed.?

jvwong · 2023-03-22T17:51:05Z

A couple of instances to test:

| Provider | URL | OS | CPU (Cores) | RAM (GB) | Live |
| --- | --- | --- | --- | --- | --- |
| Donnelly | *https://test.biofactoid.org/ | Ubuntu 16.04.7 LTS | Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz (4) | 30 | YES |
| Digital Ocean | https://beta.biofactoid.org/ | Ubuntu 20.04.2 LTS | DO-Regular (4) | 7 | NO |

Branch test will be automatically run here

jvwong · 2023-03-22T19:45:21Z

I'm finding the getDoc call to be a major offender following an initial load:

factoid/src/server/routes/api/document/index.js

Lines 1453 to 1466 in eca68a1

    
    const main = async () => {
      try {
        const doc = await getDoc(id);
        const lastEditedDate = '' + doc.lastEditedDate();
        const cache = imageCache.get(id);
        const canUseCache = imageCache.has(id) && cache.lastEditedDate === lastEditedDate;
        if( canUseCache ){
          res.send(cache.img);
        } else {
          const cache = await fillCache(doc, lastEditedDate);
          res.send(cache.img);
        }

An alternative way to handle imgCache staleness:

1. Fetch the JSON results from /api/document/ (ttl is 24 hours)
2. Find the matching document and its lastEditedDate
3. Hit the cache or regenerate as usual

So the downside is that authors won't see updates to their submitted docs for up to 24 hours. The p-limit code is still valid on initial load; the rest of the time, the entire search browse page loads almost instantly.
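The three steps of that alternative could look roughly like the sketch below. It is an illustration under assumptions, not the code in this repository: `createImageLookup`, `fetchListing`, and `generateImage` are hypothetical names, and the listing is assumed to be an array of objects with `id` and `lastEditedDate` fields, which may not match the real /api/document/ response.

```javascript
// Hypothetical sketch of the alternative staleness check described above.
const LISTING_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour ttl, as stated in the comment

// fetchListing and generateImage are injected stand-ins (assumed, not real APIs).
function createImageLookup({ fetchListing, generateImage, now = Date.now }) {
  const imageCache = new Map(); // id -> { img, lastEditedDate }
  let listing = null;           // { docs, fetchedAt }

  // Step 1: reuse the document listing until its ttl expires.
  const getListing = async () => {
    if (listing == null || now() - listing.fetchedAt >= LISTING_TTL_MS) {
      listing = { docs: await fetchListing(), fetchedAt: now() };
    }
    return listing.docs;
  };

  return async function getImage(id) {
    // Step 2: find the matching document and its lastEditedDate.
    const docs = await getListing();
    const entry = docs.find(d => d.id === id);
    if (entry == null) { throw new Error(`No document with id ${id}`); }
    const lastEditedDate = '' + entry.lastEditedDate;

    // Step 3: hit the cache, or regenerate as usual.
    const cached = imageCache.get(id);
    if (cached != null && cached.lastEditedDate === lastEditedDate) {
      return cached.img;
    }
    const img = await generateImage(id);
    imageCache.set(id, { img, lastEditedDate });
    return img;
  };
}
```

In this sketch no per-request database read is needed to learn `lastEditedDate`. The cost is the one noted above: an edit only becomes visible once the cached listing expires.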

jvwong · 2023-03-23T15:27:20Z

There are two major cases I've observed.

(1) Empty caches: Initial server load or reboot
Here, this PR (default config vars) does limit the image system and gives the server a chance to respond to other requests (e.g. loading a page, creating a new doc) -- eventually. This is a resource hungry period, but thankfully only happens once.

(2) Cache filled: Each subsequent homepage/search request
The major bottleneck in image fetching is checking the document's last update date, which incurs database hits that gum up the server: one for each image in the search results, for each user loading the page in their browser.

For (1) I say let's merge this in as is. For (2) I've started #1151


maxkfranz requested a review from jvwong March 14, 2023 15:32

jvwong mentioned this pull request Mar 23, 2023

Image endpoint: Avoid checking updates per request #1151

Closed

jvwong approved these changes Mar 23, 2023


jvwong mentioned this pull request Mar 23, 2023

Image cache: Don't load doc #1152

Merged

Merge branch 'unstable' into imgspd

f2fb506

# Conflicts: # src/server/routes/api/document/index.js

jvwong merged commit 57cd4b1 into unstable Apr 3, 2023

jvwong deleted the imgspd branch April 3, 2023 16:41
