Description:
Hi,
From time to time we observe really serious Rocket.Chat stability issues. Due to a network problem, about half of our 3000 connected users were disconnected and then reconnected to Rocket.Chat. This caused nearly 100% CPU load on the Rocket.Chat Docker containers and made the service unavailable (messages greyed out, client disconnects). Container RAM and garbage collector usage were also high (screens below). It was probably caused by a presence broadcast storm, because when we switched on "Disable Presence Broadcast", everything went back to normal after a couple of minutes. Switching it off again brought back the problems and the unavailability.
Keeping "Disable Presence Broadcast" switched on permanently is not an option, because user presence is then not shown correctly.
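To give a rough sense of why a presence broadcast storm seems plausible, here is a back-of-envelope sketch. It assumes that every presence change is pushed to every connected client and that each reconnect produces two changes (offline, then online). Both are assumptions for illustration, not confirmed Rocket.Chat behavior, and the function name is hypothetical.

```python
def presence_fanout(connected_users: int,
                    reconnecting_users: int,
                    changes_per_reconnect: int = 2) -> int:
    """Estimate presence messages delivered during a mass reconnect.

    Assumption: each status change is broadcast to every connected client.
    """
    status_changes = reconnecting_users * changes_per_reconnect
    return status_changes * connected_users


# Numbers from this report: 3000 users online, about half reconnect.
estimate = presence_fanout(connected_users=3000, reconnecting_users=1500)
print(f"{estimate:,} presence messages")  # prints "9,000,000 presence messages"
```

Under these assumptions the fan-out grows with the product of reconnecting users and connected users, which would fit the load spike we see.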
Steps to reproduce:
- Server with about 3000 users online
- For example, restart the reverse proxy (HAProxy, Nginx...)
- Users are disconnected and then reconnect
- Rocket.Chat is unavailable due to high load
Expected behavior:
Disconnecting and reconnecting many users at the same time should not make Rocket.Chat unavailable
Actual behavior:
Rocket.Chat is unavailable
Server Setup Information:
- Version of Rocket.Chat Server: 3.9.7
- Operating System: CentOS 7
- Deployment Method: docker-compose
- Number of Running Instances: 33
- DB Replicaset Oplog: YES
- NodeJS Version: v12.18.4
- MongoDB Version: 4.0.14
Client Setup Information
- Desktop App or Browser Version: All
- Operating System: All
Additional context
Heavy rocketchat.users collection updates



