This repository has been archived by the owner on Feb 14, 2023. It is now read-only.

Fix redis client handle leak #79

Open · wants to merge 1 commit into base: master

Conversation

@npny npny commented Dec 17, 2020

Continued from initial reproduction at #77 (comment)

Okay so to recap:

  • If you let it run long enough, you observe the saveApiResponses.mjs script crashing with ReplyError: Ready check failed: ERR max number of clients reached
  • This is coming from the redis server, which is telling us that we have too many connections open at once
  • We can verify this by logging into the redis server with redis-cli then asking for CLIENT LIST, which lists all the open ones. redis-cli CLIENT LIST | wc -l is the one-liner to count them
  • As yarn fetch keeps running and time passes, we can watch this list grow in real time; the number never decreases
  • Once we reach redis-cli CONFIG GET MAXCLIENTS (which is at 10000 here), we can't open any new connection and the script goes kaboom
  • Even if we're not running yarn develop (the side that reads off the data stored in redis for display), the number is growing, so we know the culprit is yarn fetch (the side that puts the data into redis)
  • Furthermore, killing yarn fetch instantly resets the open connections count to zero, so it seems to be yarn fetch abnormally holding onto its end of the open connections, not redis itself.
  • After commenting out increasingly narrow sections of saveApiResponses.mjs, we find we can make the problem disappear by removing just lines 47-53:

```js
const nodeAccountsWithLocation = await Promise.all(nodeAccounts.map(async (nodeAccount) => {
  const cacheKey = `nodeIP-${nodeAccount['ip_address']}`;
  const lookup = await withCache(cacheKey, async () => {
    return await lookupGeoIP(nodeAccount['ip_address']);
  });
  return { ...nodeAccount, location: lookup };
}));
```
  • On further examination the problem is with withCache:

```js
export async function withCache(cacheKey, func, { expire } = {}) {
  const client = redis.createClient();
  const getAsync = promisify(client.get).bind(client);
  const setAsync = promisify(client.set).bind(client);

  const cached = await getAsync(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  const val = await func();

  await setAsync(cacheKey, JSON.stringify(val));
  if (expire) {
    client.expire(cacheKey, 86400);
  }
  client.unref();

  return val;
}
```
  • withCache is calling redis.createClient every time it checks for a key, and never closes it with client.quit (client.unref() only lets the process exit without waiting on this connection; it does not close it)
  • We have about 60 nodeAccounts we try to look up, which explains why every [chaosnet]: starting data fetch... adds another 60 open clients on redis
  • The solution was to perform the caching check manually in the body of the loop, where our getAsync/setAsync are now able to reuse the global redis client instance:

```js
let lookup = await getAsync(cacheKey);
if (lookup) {
  lookup = JSON.parse(lookup); // parse the cached JSON so both branches yield an object
} else {
  lookup = await lookupGeoIP(nodeAccount['ip_address']);
  setAsync(cacheKey, JSON.stringify(lookup));
}
```
  • The open connection count indeed hovers stably at 1 after performing this change
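The mechanics above can be reproduced without a live redis server. The sketch below is purely illustrative (the fake client, the counter, and the 60-account fixture are stand-ins, not project code): the counter plays the role of redis-cli CLIENT LIST | wc -l, and the two lookup functions contrast the leaky per-call client with the shared-client fix:

```js
// Self-contained simulation of the leak (no real redis, no async plumbing).
// `openClients` stands in for what `redis-cli CLIENT LIST | wc -l` showed us.
const sharedStore = new Map();
let openClients = 0;

function createFakeClient() {
  openClients += 1; // a new connection shows up in CLIENT LIST
  return {
    get: (k) => sharedStore.get(k),
    set: (k, v) => sharedStore.set(k, v),
    quit: () => { openClients -= 1; }, // only quit() removes it again
  };
}

// Leaky pattern, like the old withCache: a fresh client per lookup, never closed.
function leakyLookup(key, fn) {
  const client = createFakeClient();
  const cached = client.get(key);
  if (cached) return JSON.parse(cached);
  const val = fn();
  client.set(key, JSON.stringify(val));
  return val; // client.quit() is never called, so the connection leaks
}

// Fixed pattern: every lookup reuses one shared client.
const globalClient = createFakeClient();
function fixedLookup(key, fn) {
  const cached = globalClient.get(key);
  if (cached) return JSON.parse(cached);
  const val = fn();
  globalClient.set(key, JSON.stringify(val));
  return val;
}

const accounts = Array.from({ length: 60 }, (_, i) => ({ ip_address: `10.0.0.${i}` }));

const before = openClients; // just the shared client so far
accounts.forEach((a) => leakyLookup(`leaky-${a.ip_address}`, () => ({ ip: a.ip_address })));
console.log(openClients - before); // 60: one leaked client per account
accounts.forEach((a) => fixedLookup(`fixed-${a.ip_address}`, () => ({ ip: a.ip_address })));
console.log(openClients - before); // still 60: the fixed path opened nothing new
```

Note that the leaky path leaks even on a cache hit, since the client is created before the key is checked, which matches the steady growth we observed.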

Thus to confirm the change:

  • Launch yarn fetch
  • Run redis-cli CLIENT LIST | wc -l repeatedly to see connections pile up
  • Check out this branch
  • Relaunch yarn fetch
  • Run redis-cli CLIENT LIST | wc -l again and see the connection count stay at 1

Note that this was very likely the problem #77 (comment) was trying to mitigate, so that workaround should no longer be needed.
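For completeness, an alternative fix would have been to keep the per-call client inside withCache but release it on every exit path. This is a hypothetical sketch, not the change in this PR; the createClient parameter is an assumption added here so the function can be exercised without a live redis server (with node_redis you would pass redis.createClient and promisify client.get/client.set as the original did):

```js
// Hypothetical alternative: per-call client, but always closed in `finally`.
async function withCache(cacheKey, func, { expire, createClient } = {}) {
  const client = createClient(); // injected factory, assumed for this sketch
  try {
    const cached = await client.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }

    const val = await func();
    await client.set(cacheKey, JSON.stringify(val));
    if (expire) {
      await client.expire(cacheKey, expire); // honor the requested TTL
    }
    return val;
  } finally {
    client.quit(); // runs on every path, so CLIENT LIST stays flat
  }
}
```

Reusing one global client, as the PR does, is still the simpler option; the try/finally version only matters if separate connections per lookup are genuinely needed.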

@npny npny requested a review from lukesaunders December 17, 2020 21:06