sudo: scalene: command not found with run as Admin #700

Open
sdbhosale opened this issue Sep 19, 2023 · 11 comments
Comments

@sdbhosale

I recently started getting this error. When I run my script with scalene main.py, I get this message: NOTE: The GPU is currently running in a mode that can reduce Scalene's accuracy when reporting GPU utilization. Run once as Administrator or root (i.e., prefixed with sudo) to enable per-process GPU accounting.

When I then run sudo scalene main.py, it fails with sudo: scalene: command not found.

  • OS: Ubuntu 22.04
  • Browser: Chrome 116.0.5845.179

I also tried installing from the repository (python3 -m pip install git+https://github.com/plasma-umass/scalene), but that did not help.

@emeryberger
Member

Can you try sudo python3 -m scalene main.py?

@emeryberger
Member

@sdbhosale
Author

Thanks, I will look into it. Meanwhile, I got past my previous error by running sudo /home/suraj/anaconda3/envs/proj1/bin/scalene main.py, but it ended with Scalene error: received signal SIGSEGV.
I read through #110 but had no luck.
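
A likely reason the bare command is not found under sudo, while the absolute path works, is that sudo commonly replaces the caller's PATH with a fixed `secure_path`, which does not include a conda environment's bin directory. The sketch below compares the two lookups. The `ASSUMED_SECURE_PATH` value and the `explain_lookup` helper are illustrative assumptions, not taken from this system's sudoers file; the real value can be checked with `sudo -V` or in /etc/sudoers.

```python
import os
import shutil

# Assumption: a typical secure_path; the actual value on a given machine may differ.
ASSUMED_SECURE_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"


def explain_lookup(command, user_path, sudo_path=ASSUMED_SECURE_PATH):
    """Resolve `command` on the user's PATH and on the PATH sudo is assumed to use."""
    return {
        "user": shutil.which(command, path=user_path),
        "sudo": shutil.which(command, path=sudo_path),
    }


if __name__ == "__main__":
    found = explain_lookup("scalene", os.environ.get("PATH", ""))
    print("resolved for the current user:", found["user"])
    print("resolved on the assumed sudo PATH:", found["sudo"])
    if found["user"] and not found["sudo"]:
        # Passing the absolute path sidesteps the PATH lookup entirely.
        print("try: sudo", found["user"], "main.py")
```

If the first lookup succeeds and the second returns None, passing the absolute path to sudo (as done above) or invoking the environment's interpreter with `-m scalene` avoids the PATH lookup.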

@emeryberger
Member

Can you try invoking Scalene with the --cpu and --gpu flags to see what happens? Thanks!

@sdbhosale
Author

sdbhosale commented Sep 20, 2023

I am able to run scalene main.py when I am not connected to the hardware (cameras); I am not sure whether the hardware is what causes the issue. However, looking further at the output, I could not get a profile of the code inside the target function of the child process. I am using the Python multiprocessing module, with a Pipe to communicate between processes. Could you please provide any insight into my issue?
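
For anyone trying to reproduce this without the cameras, a stripped-down script of the same shape may help. This is only a sketch of the setup described above, not the original program: `worker` and `run_child_via_pipe` are hypothetical names, and the "fork" start method is an assumption that restricts it to POSIX systems such as the Ubuntu machine mentioned earlier.

```python
import multiprocessing as mp


def worker(conn, n):
    """Child-process target: do some CPU work and send the result to the parent."""
    total = sum(i * i for i in range(n))
    conn.send(total)
    conn.close()


def run_child_via_pipe(n):
    """Start one child process and read its result back over a Pipe."""
    # Assumption: the "fork" start method (POSIX only).
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    child = ctx.Process(target=worker, args=(child_conn, n))
    child.start()
    result = parent_conn.recv()  # blocks until the child sends its result
    child.join()
    return result


if __name__ == "__main__":
    print(run_child_via_pipe(1_000_000))
```

Whether the lines inside `worker` show up in the profile is exactly the open question here; the script only gives a small, hardware-free case to test it with.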

@chadbrewbaker

Similar issue. When run under sudo, it isn't picking up the Miniconda environment variables. I have no problem recompiling a CPython for root that is the same version as conda's, perhaps with frame pointers and whatnot intact, but telling python3 where to find the conda packages seems to be the main issue. https://github.com/python/cpython/tags

@emeryberger
Member

Have you tried using a virtual environment (venv)?

@chadbrewbaker

chadbrewbaker commented Dec 21, 2023

Unfortunately the researchers who wrote the machine learning model did all their package management in conda.

I could do something like this, but NVIDIA's documentation is horrid on GPU stack traces for flamegraphs.

some_nvidia_profiler scalene  program.py

Strace is opaque, I can get the timestamps of GPU IO, but the GPU itself is a black box.

https://poormansprofiler.org trick with cuda-dbg might work?

--EDIT--
Numba has a GPU simulator, https://numba.pydata.org/numba-doc/dev/cuda/simulator.html

I need to know which tensor kernels are being selected by PyTorch and observe GPU memory that can be pinned to avoid IO between GPU calls. You can use zstd as a Kolmogorov estimator for the difference between GPU memory dumps; it would even pick up large all-zero regions which don't need to go over the bus.
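
The compression idea can be sketched as a normalized compression distance between two buffers. To keep the example dependency-free it uses the standard-library `zlib` as a stand-in for zstd, so the sizes are only indicative; the buffers are synthetic, not real GPU memory dumps, and the helper names are made up for illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Compressed length in bytes, used as a rough stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar buffers, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pseudo_random_bytes(n: int, seed: int) -> bytes:
    """Deterministic, incompressible-looking test data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    zeros = bytes(1 << 20)  # 1 MiB all-zero region
    # Buffers are kept small because zlib only looks back 32 KiB;
    # zstd's larger window is one reason to prefer it on real dumps.
    dump_a = pseudo_random_bytes(4096, seed=1)
    dump_b = pseudo_random_bytes(4096, seed=2)
    print("all-zero region compresses to", compressed_size(zeros), "bytes")
    print("distance, identical dumps:", round(ncd(dump_a, dump_a), 3))
    print("distance, unrelated dumps:", round(ncd(dump_a, dump_b), 3))
```

A large all-zero region collapses to a tiny compressed size, which is the signal that it would not need to cross the bus.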

@hukz18

hukz18 commented Aug 1, 2024

Is there any update on this issue? It still exists, and I'd like to know the consequences of profiling without sudo. Always getting the warning without being able to resolve it is upsetting.

@emeryberger
Member

emeryberger commented Aug 2, 2024

Please try this - install from the repo and then run the scalene.set_nvidia_gpu_modes script:

python3 -m pip install git+https://github.com/plasma-umass/scalene
python3 -m scalene.set_nvidia_gpu_modes

@hukz18

hukz18 commented Aug 3, 2024

Hi, thank you for the quick response! The script solves the problem and greatly improves GPU inference time when using Scalene (about 3x) 👍.

For reference, since I'm using a conda environment my actual procedure is

conda activate <env_name>
pip uninstall scalene # remove the old installation
python3 -m pip install git+https://github.com/plasma-umass/scalene
sudo <path_to_the_conda_python_executable> -m scalene.set_nvidia_gpu_modes

And the script works well for me :-)
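
To fill in the interpreter path in the last step, one option is to ask the activated environment's Python for its own location. The sketch below only builds and prints the command; it does not run sudo. `gpu_modes_command` is a hypothetical helper, and the module name is the one given in the maintainer's reply above.

```python
import shlex
import sys


def gpu_modes_command(python_executable=sys.executable):
    """Build the sudo invocation that runs the GPU-mode script with a specific interpreter."""
    return ["sudo", python_executable, "-m", "scalene.set_nvidia_gpu_modes"]


if __name__ == "__main__":
    # Run this with the conda environment activated, then paste the printed line into a shell.
    print(shlex.join(gpu_modes_command()))
```

Using the interpreter's absolute path means sudo does not need the conda environment on its PATH.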

4 participants