Malicious Stream Causing GPU Crashes and Node Disruptions #2970
Comments
@FranckUltima I'd assume for now that it's a problem with the stream rather than something malicious, but let's try to get to the bottom of it as soon as we can. Please update the ticket once you have more details.
Here are logs for a stream that crashed both transcoder processes behind my orchestrator:
Transcoder
Orchestrator
Scrolling back through my logs, I found this on my NYC combined OT (running in Docker), which caused the container to restart:
Here are mine from some time ago:
USA again:
I lost 8 streams in the EU region when the transcoder hit an error. It looks like other orchestrators were affected within a similar timeframe, and some of the data in the logs looks suspicious. The transcoder was at 31% utilization and transcode times were nominal, so it should not have run out of memory.
Time of impact: 15:33 GMT, June 13, 2024
Attached are logs from both orchestrator and transcoder. Transcoder logs show what appears to be malicious input going to
The problematic manifest appears to be: Public broadcaster logs for manifest
It is also notable that these issues occurred during the ramp-up of a gas price spike.
In recent weeks, several orchestrators have noticed one or more streams causing GPU crashes (CUDA errors). A crash results in the loss of all running streams, and sometimes takes down the transcoder or the whole node. The stream then switches to another orchestrator, where it causes the same failure.
I am concerned about the impact of this kind of event on the quality of the network, and about the possibility that such streams could be deliberately injected into the network.
Following yesterday evening's discussion on the watercooler chat, Doug considered the possibility that the Broadcaster node could identify the stream and drop it automatically to prevent it from spreading. If the stream was injected deliberately, however, it might be better to be able to detect and block it at the orchestrator level.
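To make the orchestrator-level idea a bit more concrete, here is a minimal sketch. It is not something that exists in go-livepeer today. It assumes the orchestrator can list the manifest IDs that were active when a transcoder crashed; `ManifestQuarantine`, `record_crash`, `is_blocked`, and the default threshold of 2 are hypothetical names and values chosen purely for illustration, and Python is used only for brevity.

```python
from collections import Counter


class ManifestQuarantine:
    """Hypothetical guard: refuse manifests implicated in repeated GPU crashes."""

    def __init__(self, threshold=2):
        # Number of crashes a manifest must be present for before it is blocked.
        self.threshold = threshold
        self.crash_counts = Counter()

    def record_crash(self, active_manifest_ids):
        # Every manifest active at crash time is a suspect; count each once per crash.
        for manifest_id in set(active_manifest_ids):
            self.crash_counts[manifest_id] += 1

    def is_blocked(self, manifest_id):
        # Counter returns 0 for unseen manifests, so unknown streams are allowed.
        return self.crash_counts[manifest_id] >= self.threshold
```

Blocking only after a manifest has been present for several crashes, rather than on the first one, is meant to avoid penalizing streams that merely shared a GPU with the bad one.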
I have not been able to identify the responsible stream, and this event remains rather rare, but when it happens, several orchestrators usually report it on Discord.
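One possible way to narrow down the responsible stream when several orchestrators report a crash in the same window is to intersect the sets of manifest IDs each of them was transcoding at the time. A rough sketch, assuming each operator can extract that list from their own logs (the function name `common_suspects` and the input shape are hypothetical):

```python
def common_suspects(crash_reports):
    """Return the manifest IDs that appear in every non-empty crash report.

    crash_reports: iterable of iterables of manifest IDs, one per crash.
    """
    # Ignore reports with no manifests so they do not empty the intersection.
    sets = [set(report) for report in crash_reports if report]
    if not sets:
        return set()
    return set.intersection(*sets)
```

For example, if one operator was transcoding manifests `m1`, `m2`, `m3` at crash time and another was transcoding `m2`, `m9`, only `m2` would remain as a candidate.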