This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

ImageNet Issues on Pascal GPUs #4567

Closed
amithr1 opened this issue Jan 6, 2017 · 4 comments

@amithr1

amithr1 commented Jan 6, 2017

Hi All,

I was able to compile MXNet on Pascal GPUs after adding the -gencode arch=compute_60,code=compute_60 flags. The system uses CUDA 8.0.
When I compile OpenCV with CUDA support turned off and run ImageNet, I get only 20-25 images/sec with two GPUs. I thought OpenCV was limiting performance, so I switched to OpenCV built with CUDA support turned on.
But when I do that, I get segfaults. A backtrace shows that even simple functions such as cudaSetDevice() fail. The backtrace is attached below. I'm not sure whether this is a bug in OpenCV or in MXNet.

#0 0x00003fffb7c9af54 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1 0x00003fff6388d588 in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#2 0x00003fff638600f8 in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#3 0x00003fff63886cd0 in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#4 0x00003fff63972360 in cuVDPAUCtxCreate () from /usr/lib/nvidia/libcuda.so.1
#5 0x00003fff638907c4 in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#6 0x00003fff638924dc in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#7 0x00003fff63849368 in cudbgApiDetach () from /usr/lib/nvidia/libcuda.so.1
#8 0x00003fff63744644 in ?? () from /usr/lib/nvidia/libcuda.so.1
#9 0x00003fff638bbd30 in cuInit () from /usr/lib/nvidia/libcuda.so.1
#10 0x00003fff9fdf4b9c in __cudaInitManagedRuntime () from /usr/local/cuda/lib64/libcudart.so.8.0
#11 0x00003fff9fdf7618 in __cudaInitManagedRuntime () from /usr/local/cuda/lib64/libcudart.so.8.0
#12 0x00003fffb7c9fa2c in pthread_once () from /lib64/libpthread.so.0
#13 0x00003fff9fe378c8 in cudaGraphicsVDPAURegisterOutputSurface () from /usr/local/cuda/lib64/libcudart.so.8.0
#14 0x00003fff9fdee9f8 in __cudaInitManagedRuntime () from /usr/local/cuda/lib64/libcudart.so.8.0
#15 0x00003fff9fdf8fa4 in __cudaInitManagedRuntime () from /usr/local/cuda/lib64/libcudart.so.8.0
#16 0x00003fff9fe13760 in cudaSetDevice () from /usr/local/cuda/lib64/libcudart.so.8.0
#17 0x00003fffa22b3924 in mxnet::StorageImpl::ActivateDevice (ctx=...) at src/storage/storage.cc:47
#18 0x00003fffa22b1754 in mxnet::StorageImpl::Alloc (this=0x3fff0c0073d0, size=1204224, ctx=...) at src/storage/storage.cc:95
#19 0x00003fffa14e73c0 in mxnet::NDArray::Chunk::CheckAndAlloc (this=0x111dcea8) at include/mxnet/./ndarray.h:346
#20 0x00003fffa14e731c in mxnet::NDArray::Chunk::Chunk (this=0x111dcea8, size=301056, ctx=..., delay_alloc=false, dtype=0) at include/mxnet/./ndarray.h:341
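
A note on the build flag mentioned in this post, for anyone reproducing the setup: with nvcc, `code=compute_60` embeds only PTX, which the driver JIT-compiles when the program loads, while `code=sm_60` embeds native GPU code for that architecture. The sketch below only builds the flag strings for both variants. The helper name `gencode_flags` and its parameters are hypothetical, not part of MXNet's build system, and nothing in the thread indicates the flag choice is related to the crash.

```python
def gencode_flags(compute_capability, embed_sass=True, embed_ptx=True):
    """Build nvcc -gencode flags for one compute capability, e.g. "60".

    Hypothetical helper for illustration only; it just formats strings.
    """
    virtual = f"compute_{compute_capability}"
    flags = []
    if embed_sass:
        # Native code for the real architecture (no JIT needed at load time).
        flags.append(f"-gencode arch={virtual},code=sm_{compute_capability}")
    if embed_ptx:
        # PTX only; the driver JIT-compiles it when the program loads.
        flags.append(f"-gencode arch={virtual},code={virtual}")
    return flags


if __name__ == "__main__":
    # Prints both flags on one line, ready to paste into a build config.
    print(" ".join(gencode_flags("60")))
```

Calling `gencode_flags("60", embed_sass=False)` reproduces the single flag quoted in the post.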

@piiswrong
Contributor

Don't use GPU-enabled OpenCV. It doesn't offer any speedup, since we don't use OpenCV's GPU features.

@mli
Contributor

mli commented Jan 7, 2017

Try the --test-io option; it will tell you how fast the data can be read:

https://github.com/dmlc/mxnet/tree/master/example/image-classification#speed
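
The idea behind such a check can be sketched in a few lines: iterate over the data pipeline with no training attached and report images per second. This is a minimal illustration, not the code behind --test-io. The function `measure_io_speed` and the stand-in generator `fake_batches` are hypothetical names; in practice `batches` would be whatever iterator feeds the training loop.

```python
import time


def measure_io_speed(batches, batch_size, max_batches=100):
    """Consume up to `max_batches` batches and return images per second.

    No model is run, so the result reflects data loading and decoding only.
    """
    start = time.perf_counter()
    count = 0
    for _ in batches:
        count += 1
        if count >= max_batches:
            break
    elapsed = time.perf_counter() - start
    if count == 0 or elapsed <= 0:
        return 0.0
    return count * batch_size / elapsed


def fake_batches(n, delay=0.001):
    """Stand-in data source (an assumption): yields n batches with a fixed delay."""
    for _ in range(n):
        time.sleep(delay)
        yield None


if __name__ == "__main__":
    speed = measure_io_speed(fake_batches(50), batch_size=32)
    print(f"I/O only: {speed:.1f} images/sec")
```

If the I/O-only rate is close to the rate seen during training, data loading is the likely bottleneck rather than GPU compute.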

@amithr1
Author

amithr1 commented Jan 10, 2017

Thanks. I tested it again today with this option. It looks like I/O is the bottleneck.

@yajiedesign
Contributor

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
