[REQUEST] Add GPU usage #26

Closed
rsnk96 opened this issue Apr 25, 2020 · 11 comments

rsnk96 commented Apr 25, 2020

Is your feature request related to a problem? Please describe.
GPU memory and core usage are currently not displayed

Describe the solution you'd like
Something that either shows the recent history of GPU usage, or current GPU usage

Describe alternatives you've considered
Currently using https://github.com/Syllo/nvtop; integrating it could be looked at

Additional context
Awesome project!

@rsnk96 rsnk96 added the enhancement New feature or request label Apr 25, 2020

ypwhs commented Apr 25, 2020

I am using this command now: https://github.com/wookayin/gpustat

@aristocratos
Owner

I looked into it when I started writing bashtop but ran into a lot of problems with different tools for different GPUs, permissions, and overall CPU usage.
Gonna take a look at it again when I'm done with the bashtop-psutil branch, but not gonna promise anything.

Baschdl commented Apr 26, 2020

Did you look into gpustat? It should give you the necessary information for all relevant nvidia gpus.

from gpustat.core import GPUStatCollection
gpu_stats = GPUStatCollection.new_query().jsonify()

Returned information for each gpu (https://github.com/wookayin/gpustat/blob/5c8898e43326e5c7b1450503e6dd9dc3ea4967d0/gpustat/core.py#L481):

gpu_info = {
    'index': index,
    'uuid': uuid,
    'name': name,
    'temperature.gpu': temperature,
    'fan.speed': fan_speed,
    'utilization.gpu': utilization.gpu if utilization else None,
    'utilization.enc':
        utilization_enc[0] if utilization_enc else None,
    'utilization.dec':
        utilization_dec[0] if utilization_dec else None,
    'power.draw': power // 1000 if power is not None else None,
    'enforced.power.limit': power_limit // 1000
    if power_limit is not None else None,
    # Convert bytes into MBytes
    'memory.used': memory.used // MB if memory else None,
    'memory.total': memory.total // MB if memory else None,
    'processes': processes,
}
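
As a rough illustration of how a monitor could consume a dict with the keys listed above, here is a minimal sketch. The helper names (`_fmt`, `summarize_gpu`) and the sample values are hypothetical and not part of gpustat; the only thing taken from the quoted source is the key names and the fact that readings may be `None`.

```python
from typing import Any, Dict, Optional


def _fmt(value: Optional[Any], unit: str) -> str:
    """Render a reading that may be missing (the quoted dict uses None for that)."""
    return "n/a" if value is None else f"{value}{unit}"


def summarize_gpu(gpu_info: Dict[str, Any]) -> str:
    """Build a one-line summary from a dict shaped like gpu_info above."""
    used = gpu_info.get("memory.used")
    total = gpu_info.get("memory.total")
    if used is not None and total:
        mem = f"{used}/{total} MB ({100 * used // total}%)"
    else:
        mem = "n/a"
    return (
        f"[{gpu_info.get('index')}] {gpu_info.get('name')} | "
        f"util {_fmt(gpu_info.get('utilization.gpu'), '%')} | "
        f"temp {_fmt(gpu_info.get('temperature.gpu'), 'C')} | "
        f"mem {mem}"
    )


# Hypothetical sample values, not output from a real GPU.
sample = {
    "index": 0,
    "name": "Example GPU",
    "temperature.gpu": 43,
    "utilization.gpu": 12,
    "memory.used": 1024,
    "memory.total": 8192,
}
print(summarize_gpu(sample))
```

Treating every field as optional keeps the display code from crashing on GPUs that do not report a given sensor.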

Baschdl commented Apr 26, 2020

amdgpu-utils is a similar library for amd gpus but needs quite a lot of setup:

In order to use any of these utilities, you must have the amdgpu open source driver package installed. You also must first set your Linux machine to boot with amdgpu.ppfeaturemask=0xffff7fff or 0xfffd7fff. This can be accomplished by adding amdgpu.ppfeaturemask=0xffff7fff to the GRUB_CMDLINE_LINUX_DEFAULT value in /etc/default/grub and executing sudo update-grub

gregk-git commented Apr 29, 2020

Why not support Nvidia GPUs first and worry about AMD GPUs later? It should be easier, considering the Nvidia tool already works with minimal configuration from the user. @aristocratos

@aristocratos
Owner

Looks like polling with Python modules is the easiest way to go, regardless of Nvidia/AMD. It would still be up to the user to correctly set up any dependent tools.
Could possibly set it up so the CPU box is split and a third of it becomes a GPU box if the required tools are present.

It will however be a while before I get to it. A lot of stuff on the TODO list and mostly just weekends to work on it.
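
The "only show a GPU box if the required tools are present" idea could be sketched roughly like this. This is not bashtop code: `detect_gpu_backends` and `split_cpu_box` are hypothetical names, and the one-third split is only the ratio mentioned in the comment above.

```python
import importlib.util
import shutil
from typing import Callable, List, Optional, Tuple


def _module_available(name: str) -> bool:
    """True if a top-level Python module can be imported."""
    return importlib.util.find_spec(name) is not None


def detect_gpu_backends(
    which: Callable[[str], Optional[str]] = shutil.which,
    has_module: Callable[[str], bool] = _module_available,
) -> List[str]:
    """Return the GPU data sources that appear to be installed."""
    backends = []
    if has_module("gpustat"):
        backends.append("gpustat")
    if which("nvidia-smi"):
        backends.append("nvidia-smi")
    return backends


def split_cpu_box(total_width: int, backends: List[str]) -> Tuple[int, int]:
    """Hypothetical layout rule: hand a third of the CPU box to a GPU box."""
    gpu_width = total_width // 3 if backends else 0
    return total_width - gpu_width, gpu_width


print(split_cpu_box(90, detect_gpu_backends()))
```

The `which` and `has_module` parameters exist only so the detection can be exercised without any GPU tooling installed.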

BullShark commented May 20, 2020

nvidia-smi was included with the nvidia driver on my system, so that's one less dependency if the nvidia driver is already installed. The temperature can be obtained from a one-liner in bash.

$ nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits
43

$ pacman -Qo $(which nvidia-smi)
/usr/bin/nvidia-smi is owned by nvidia-440xx-utils 440.82-1

From what I read, the amd gpu temp can be obtained from lm_sensors using the sensors command.
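
For comparison with the gpustat route, the nvidia-smi one-liner above could be wrapped from Python along these lines. This is a sketch, not a tested integration: `parse_csv` and `query_nvidia_smi` are hypothetical helper names, and the extra query fields beyond `temperature.gpu` are an assumption about what a monitor would want.

```python
import shutil
import subprocess
from typing import Dict, List, Optional

# Field names passed to nvidia-smi's --query-gpu option.
FIELDS = ["temperature.gpu", "utilization.gpu", "memory.used", "memory.total"]


def parse_csv(text: str) -> List[Dict[str, Optional[int]]]:
    """Parse 'csv,noheader,nounits' output: one line per GPU, comma-separated."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        # Cells that are not plain integers (unsupported readings) become None.
        gpus.append({f: int(v) if v.isdigit() else None for f, v in zip(FIELDS, values)})
    return gpus


def query_nvidia_smi() -> List[Dict[str, Optional[int]]]:
    """Run nvidia-smi once; return an empty list if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_csv(result.stdout) if result.returncode == 0 else []
```

Note that this forks a process on every update, which is the CPU cost raised earlier in the thread.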

adojck commented May 27, 2020

Currently to monitor my GPU Temps, Utilization and other stats I use nvtop

I think nvtop uses nvidia-smi like @BullShark wrote. Maybe it could be used in bashtop to implement GPU usage bars?

@aristocratos
Owner

I'm leaning towards @Baschdl's recommendation of the gpustat Python module. I'm currently working on a new version that runs Python on a secondary thread, which would pull ALL system information into the bash script through pipes and never have to start any forks. Will hopefully be a big reduction in CPU usage overall.
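
To make the "collector on a secondary thread feeding a pipe" idea concrete, here is a minimal sketch. It is not the actual bashtop implementation: the `key=value` line format, `stream_stats`, and `fake_collect` are all hypothetical, and an in-memory buffer stands in for the pipe the bash side would read.

```python
import io
import threading
import time
from typing import Callable, Dict, TextIO


def stream_stats(
    collect: Callable[[], Dict[str, int]],
    out: TextIO,
    samples: int,
    interval: float = 1.0,
) -> None:
    """Write one 'key=value ...' line per sample to out, flushing each time."""
    for _ in range(samples):
        stats = collect()
        out.write(" ".join(f"{k}={v}" for k, v in stats.items()) + "\n")
        out.flush()
        time.sleep(interval)


def fake_collect() -> Dict[str, int]:
    """Hypothetical collector standing in for real psutil/gpustat calls."""
    return {"cpu": 7, "gpu": 12}


buffer = io.StringIO()  # stands in for the pipe read by the bash script
worker = threading.Thread(target=stream_stats, args=(fake_collect, buffer, 3, 0.0))
worker.start()
worker.join()
print(buffer.getvalue(), end="")
```

Because the collector stays alive between samples, the reading side only has to read a line per update instead of forking a new process each time.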

@bodograumann

For intel gpus there is intel_gpu_top from intel-gpu-tools. I would love to have this information integrated into bashtop.

Owner

aristocratos commented Jan 15, 2021

@bodograumann
Bashtop isn't actively being worked on anymore, see https://github.com/aristocratos/bpytop for the currently active project.

But will answer this for the current effort of adding gpu stats for bpytop (and the remote possibility of backporting it to bashtop):

I don't know how useful intel_gpu_top would be, can see a couple of problems.

intel_gpu_top is a tool to display usage information of an Intel GPU. It requires root privilege to map the graphics device.
Note that idle units are not displayed, so an entirely idle GPU will only display the ring status and header.

  1. Would require bpytop to be run as root to work at all.
  2. Needs separate threading since it needs time to sample the system to get any usage stats. Can be costly in terms of CPU usage.
  3. No output displayed while idle, can possibly be worked around by caching the values but not sure if this just applies to systems that switch between discrete graphics and built-in?
  4. Can't actually find any information about what statistics it actually outputs, needs to be consistent between different gpus and need to have sample outputs from a couple to know how to parse it.
  5. Would also need some testing from people of how resource-heavy intel_gpu_top is in itself, since it needs to be called at every update.

@rsnk96 rsnk96 closed this as completed Jan 16, 2021
@aristocratos aristocratos unpinned this issue Nov 14, 2021