[Argus] Fix distributor node syncing query #4921
Conversation
kdembler
left a comment
Tested this and it works properly! I've checked that the number of objects is the same as in the previous version.
You can also see less pressure on resources.

Thanks for your work @zeeshanakram3!
```typescript
    GetDataObjectsWithBagsByIdsQuery,
    GetDataObjectsWithBagsByIdsQueryVariables
  >(GetDataObjectsWithBagsByIds, { bagIds: bagIdsBatch, limit: bagIdsBatch.length }, 'storageBags'))
)
```
What do you think of adding a small delay between batches? Something like 0.5s? I'd rather have this operation take slightly longer than put such a heavy load on the server.
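For illustration, a minimal sketch of such a delay (the `sleep` helper, `fetchBatch` callback, and loop shape are assumptions for this sketch, not the PR's actual code):

```typescript
// Sketch only: fetch batches sequentially with a short pause between them
// to ease the load on the graphql-server.
const BATCH_DELAY_MS = 500 // the suggested ~0.5s pause

const sleep = (ms: number): Promise<void> => new Promise((resolve) => setTimeout(resolve, ms))

// `fetchBatch` stands in for the batched GetDataObjectsWithBagsByIds call above.
async function fetchAllBatches<T>(
  batches: string[][],
  fetchBatch: (ids: string[]) => Promise<T[]>
): Promise<T[]> {
  const results: T[] = []
  for (const [i, bagIdsBatch] of batches.entries()) {
    results.push(...(await fetchBatch(bagIdsBatch)))
    if (i < batches.length - 1) {
      await sleep(BATCH_DELAY_MS) // no need to delay after the last batch
    }
  }
  return results
}
```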
mnaamani
left a comment
Great work.
I wouldn't be surprised if the storage-node needs a similar fix.
@kdembler suggested adding a small delay between fetching batches. That is probably a good idea. What might work better is to use generators: yield one batch at a time, allowing the consumer/caller to process that batch of objects before coming back for more.
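A rough sketch of the generator idea (identifiers here are illustrative, not from the codebase):

```typescript
// Sketch only: an async generator yields one batch at a time, so the caller
// fully processes each batch before the next query is sent to the server.
async function* batchedDataObjects<T>(
  bagIds: string[],
  batchSize: number,
  fetchBatch: (ids: string[]) => Promise<T[]> // stands in for the batched query
): AsyncGenerator<T[]> {
  for (let i = 0; i < bagIds.length; i += batchSize) {
    yield await fetchBatch(bagIds.slice(i, i + batchSize))
  }
}

// The caller's processing time naturally paces the queries:
// for await (const objects of batchedDataObjects(allBagIds, 1000, fetchBatch)) {
//   await processObjects(objects)
// }
```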
I will merge, then bump the version of Argus and prepare a Docker release (using the #4886 branch).
The storage node already uses pagination at least, and from my testing, it wasn't as bad. It would help the overall system load to bump the …
Problem
The distributor node executes the `getDistributionBucketsWithObjectsByWorkerId` query to fetch all the data objects that a given node is supposed to distribute. The problem is that, with the steady growth of the storage directory, the response size of this query keeps getting larger. For some time this query was timing out; upon investigation, it turned out that while executing it the graphql-server was consistently crashing with `FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory`.
Fix
This PR splits the query into multiple smaller queries so that the graphql-server is able to process them successfully, until we find the cause of the memory leak in the graphql-server and create a proper fix.
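Conceptually, the fix looks something like the sketch below (identifiers and the batch size are illustrative assumptions; the real implementation is in this PR's diff):

```typescript
// Sketch only: split one huge query into bounded batches so the graphql-server
// never has to materialize the full result set at once.
const MAX_BAG_IDS_PER_QUERY = 1000 // assumed batch size, not the PR's exact value

async function getDataObjectsByBagIds<T>(
  bagIds: string[],
  runQuery: (bagIdsBatch: string[], limit: number) => Promise<T[]>
): Promise<T[]> {
  const dataObjects: T[] = []
  for (let i = 0; i < bagIds.length; i += MAX_BAG_IDS_PER_QUERY) {
    const bagIdsBatch = bagIds.slice(i, i + MAX_BAG_IDS_PER_QUERY)
    // Each batched call mirrors the GetDataObjectsWithBagsByIds query in the diff above.
    dataObjects.push(...(await runQuery(bagIdsBatch, bagIdsBatch.length)))
  }
  return dataObjects
}
```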