Describe the bug
When running text-generation-web-ui, my system shuts off instantly, as if the power went out. The log files are empty around the time of the crash. It is reproducible: roughly one out of every ten messages sent through the text-generation-web-ui API causes the crash.
I have two RTX 4090s running the model with the ExLlamav2_HF loader, max_seq_len 16000, in autosplit mode.
Monitoring with nvitop before a crash, I see GPU power draw is usually around 200 W per card during inference, but it sometimes spikes to around 424 W each.
GPU memory is nearly maxed out: 23.0/23.99 GiB on GPU 1 and 21.8/23.99 GiB on GPU 2.
My power supply is 1600 W, so it should be able to handle the load.
Since my logs are not capturing any information, does anyone have any ideas? I was thinking I could run some live monitoring to a log file and maybe catch something at the time of the crash that does not show up in the system logs. For example, I could run a command like this to log NVIDIA info:
sudo nvidia-smi --query-gpu=timestamp,name,pci.bus_id,driver_version,pstate,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv -l 1 > nvidia-smi.log
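One thing that query is missing is power draw, which seems like the most relevant number given the spikes I'm seeing. A variation like this might capture it (untested sketch; timestamp, index, power.draw, temperature.gpu, clocks.sm, utilization.gpu, memory.used, and memory.total are all standard --query-gpu fields):
sudo stdbuf -oL nvidia-smi --query-gpu=timestamp,index,power.draw,temperature.gpu,clocks.sm,utilization.gpu,memory.used,memory.total --format=csv -l 1 >> nvidia-smi-power.log
The stdbuf -oL line-buffers the output so the last few samples are more likely to reach the log file before the machine powers off, though anything still sitting in the page cache at that instant could be lost anyway.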
Could anyone recommend monitoring software to help me narrow down the problem? I'm stumped right now and am thinking of trying a battery backup (UPS) just in case it is a power issue.
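As a test of the power theory, I might also try capping each card's power limit and seeing whether the shutdowns stop (assuming my setup allows it; -i selects the GPU and -pl/--power-limit sets the limit in watts, which has to stay within the card's supported range):
sudo nvidia-smi -i 0 -pl 300
sudo nvidia-smi -i 1 -pl 300
If the crashes disappear with the cap in place, that would point at transient power spikes rather than a software problem.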
Is there an existing issue for this?
I have searched the existing issues
Reproduction
Load a model in text-generation-web-ui with the ExLlamav2_HF loader, max_seq_len 16000, and autosplit enabled. Roughly one in ten messages will cause a crash.
Screenshot
No response
Logs
System Info