
During handling of the above exception, another exception occurred #87

Open
sdlmw opened this issue Aug 29, 2017 · 16 comments

Comments


sdlmw commented Aug 29, 2017

[Screenshots of the error output: OOM when allocating tensor with shape [12360, 17191]]
Does anyone know what is causing this?
My system environment: Ubuntu 16.04 x86_64; NVIDIA driver version: 384.66; nvcc version: 8.0; cuDNN: 6.0.
I don't know why this happens. Thanks in advance.


ljch2018 commented Aug 31, 2017

OOM when allocating tensor with shape...

OOM: out of memory, which means your system does not have enough memory to run TensorFlow.


sdlmw commented Aug 31, 2017

@luosmart OK, thank you. But I am running with 8 GB of RAM, and at most 2 GB is used before it ends. The file I processed is about 15 MB. Could TensorFlow itself be causing the exception to be thrown?


ljch2018 commented Sep 1, 2017

@sdlmw Please check how much GPU memory is left on your system; other programs may be using too much GPU memory. Try this command:

nvidia-smi 


sdlmw commented Sep 1, 2017

@luosmart Yes, I have checked. Apart from what the system uses, everything else is free for TensorFlow, about 1940 MB.


ljch2018 commented Sep 1, 2017

@sdlmw shape [12360, 17191]: aren't the width and height of that tensor too large? Have you checked the format of the input data?


sdlmw commented Sep 1, 2017

@luosmart Yeah, that is the file size. The two files are 12360 KB and 17191 KB.


ljch2018 commented Sep 1, 2017

@sdlmw 12360 * 17191 = 212480760 elements, which is too much for the GPU. Why not use a batch mechanism?
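For illustration, here is a minimal sketch (TensorFlow 1.x style, not this project's actual code) of what a batch mechanism looks like: instead of pushing one 12360 x 17191 tensor onto the GPU, the rows are fed a few hundred at a time. The file name, placeholder, model op, and batch size are all illustrative assumptions.

```python
# Minimal batching sketch (assumptions: the data fits in host RAM as a
# NumPy array, and the model reads its input from a placeholder `x`).
import numpy as np
import tensorflow as tf

data = np.load("big_matrix.npy")                 # hypothetical file, shape ~ (12360, 17191)

x = tf.placeholder(tf.float32, shape=[None, data.shape[1]])
y = tf.reduce_sum(x, axis=1)                     # stand-in for the real model

batch_size = 256                                 # small enough to fit in ~2 GB of GPU memory
with tf.Session() as sess:
    for start in range(0, data.shape[0], batch_size):
        batch = data[start:start + batch_size]   # only this slice goes to the GPU
        sess.run(y, feed_dict={x: batch})
```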


sdlmw commented Sep 1, 2017

@luosmart Are you Chinese? I am sorry, I have not worked with a batch mechanism before.


sdlmw commented Sep 1, 2017

@luosmart I am really sorry. I have only just started working with Linux and was then given this kind of work, so there is a lot I do not understand. In that case, things are clear to me now: my mistake was not reading the error output carefully.


ljch2018 commented Sep 1, 2017

@sdlmw Check the data you are reading in: the input you are feeding is an extremely large matrix, so it is bound to OOM. Also, if you have a large input to process, you can feed it in batches, building small tensors, so that what you hand to the model does not OOM.


sdlmw commented Sep 1, 2017

@luosmart Yes, since yesterday I have found where the problem is, and using a fairly powerful GPU from 美亚云 it now works. With your explanation of the matrix, I understand what my current work involves. Thanks again.


sdlmw commented Sep 1, 2017

@luosmart
But at the same time I have another question: can the memory the GPU uses be switched over to system RAM?


ljch2018 commented Sep 1, 2017

@sdlmw

Can the memory the GPU uses be switched over to system RAM?

Do you mean using system memory in place of GPU memory? That is not possible: GPU memory and system memory are two separate pieces of hardware and cannot be used interchangeably.
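(As a side note not raised in the thread, and only as a hedged sketch: while the two memories cannot be merged, TensorFlow does let you pin an op to the CPU with tf.device, so its tensors are allocated in system RAM instead of GPU memory, at a large speed cost. The shapes below are illustrative.)

```python
# Hedged sketch: pinning a large tensor to the CPU so it is allocated in
# system RAM rather than GPU memory (much slower, but avoids the GPU OOM).
import tensorflow as tf

with tf.device("/cpu:0"):                        # host placement -> system RAM
    big = tf.random_normal([12360, 17191])       # roughly the shape from the error
    total = tf.reduce_sum(big)

with tf.Session() as sess:
    print(sess.run(total))
```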


sdlmw commented Sep 1, 2017

@luosmart OK, thank you. Then does TensorFlow offer any other way to process data on the order of tens of millions of records in one pass? If not, the results we get may not meet our requirements, and batch processing may even waste a lot of time.


ljch2018 commented Sep 1, 2017

@sdlmw Batch processing is the standard approach used by all deep learning models, and it can certainly handle data on the order of tens of millions of records; there is no doubt about that.

Batch processing may even waste a lot of time

Done correctly, batch processing does not waste a lot of time at all; it is roughly on par with processing everything in one pass.
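As a rough sketch of why batching scales to that size (assuming TensorFlow 1.4+ where tf.data is in core; the file name, row count, and feature width are made up): the full dataset stays on disk or in host RAM, and only one small batch at a time ever becomes a tensor on the GPU.

```python
# Sketch: streaming tens of millions of rows in small batches with tf.data.
# The on-disk file, its shape, and the batch size are illustrative assumptions.
import numpy as np
import tensorflow as tf

features = np.memmap("features.dat", dtype=np.float32,
                     mode="r", shape=(10000000, 128))     # hypothetical on-disk matrix

dataset = tf.data.Dataset.from_generator(
    lambda: (row for row in features),                    # yields one row at a time
    output_types=tf.float32,
    output_shapes=[128])
dataset = dataset.batch(512).prefetch(2)                  # small tensors, overlapped with compute

batch = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    first = sess.run(batch)                               # shape (512, 128), never (10000000, 128)
```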


sdlmw commented Sep 1, 2017

@luosmart OK, thank you. I will make time to study deep learning; I am still a complete beginner at it. Thank you very much.
