Description
Initially, we made a global user cache because they exist only once in Discord, however, that has brought many caching issues that eventually required fixes that damage the overall performance, see #6782.
With the release of intents, we can really limit the amount of GuildMember
instances we have, and with it, also the amount of User
instances.
There's also a huge memory leak that has been there since caching has been added to the library: if a bot leaves a guild, all the users that shared only said guild with the bot would sit in memory, forever, and eventually become stale or invalid were we to read their properties when they update something while we don't share a guild with them.
To mitigate this, #6013 added a cache system that allowed us to limit how many entries we store in our CachedManagers, and later #6825 improved the sweeping performance. However, by limiting or sweeping User
s in client.users.cache
, you're only removing one of the many references, in fact, they stay alive at the GuildMember
instance:
discord.js/packages/discord.js/src/structures/GuildMember.js
Lines 60 to 67 in 7959a68
User
instance, completely independent of the first one.
In fact, because GuildMember
has a transient relationship with User
, some funny things happen. For example when discord.js fetches guild members and the developer decides to specify cache: false
, the GuildMember
s are not cached but their respective User
s are instantiated and cached regardless (see code above, thanks @Jiralite).
We could just make GuildMember#user
a getter so we don't duplicate User
s, but, not only that would worsen the developer experience (since it becomes nullable), but it also doesn't solve the memory leak at all.
Alternatively, we could also sweep all entries from client.users.cache
when the bot leaves a guild, but for large internal sharded bots, that's going to be a huge performance hit, and definitely not something we should do.
And last but not least, what's the use case for client.users.cache
to even exist? Most users use client.users.fetch(UserResolvable)
but very often:
- Leave
cache
default, which means it'll always cache (as it defaults totrue
). This has a pitfall, if the user doesn't share a guild, it'll eventually be stale and incorrect, plus it'll be contributing to the memory leak. - Have to use
force
due to the aforementioned issue. If you cache a user that doesn't share a guild with the bot (e.g. for awhois
command), and they change their avatar, the library will, by default, retrieve the entry from the cache, which has a stale avatar, and as such, the command will fail to display correctly. Usingforce
disregards the cache, and ends up making the cache useless again.
A solution
Starting with #6013, Discord.js has a manager called DataManager
, it is similar to CachedManager
, but doesn't have a cache. We can use this so we're still able to do client.users.resolve[Id](UserResolvable)
. We may also move client.user
to client.users.me
or client.users.self
, so we don't have two different properties.
With this approach, we'll stop caching User
s globally. Now we need to store them somewhere: inside GuildMember
. This approach makes sense, because if the guild member leaves a guild, both the GuildMember
and the User
instance will be disposed from memory. Memory leak solved!
And because data duplication may be controversial (and some people may object to this proposal because of it), let me sum up some points:
- Sweeping users can and will result on data duplication due to hard references inside
GuildMember
and new ones not finding the previously createdUser
instances. - The issue from feat(GuildMemberUpdate): emit on user updates #6782 exists, and the approach it takes duplicates
User
instances anyways. - As @Jiralite pointed out, if you specify
cache: false
inGuildMemberManager#fetch
, members won't be cached, but users will.
There may be possible problems with the removal of the global cache, specially for guild data, since we won't be able to get the user data if the member leaves the guild. One place that would be very notable would be Message#author
, however, it turns out that property holds a hard reference just like GuildMember#user
does.