Single release for PaddlePaddle CPU Image #1607
@@ -0,0 +1,41 @@
## Single release for PaddlePaddle CPU Image

### Background

Currently, PaddlePaddle supports AVX and SSE3 intrinsics (extensions to the x86 instruction set architecture). When PaddlePaddle is compiled with CMake, the build detects which SIMD instruction sets the host supports and automatically selects a supported one. Developers and users can also set the CMake option `WITH_AVX=ON/OFF` manually before compiling PaddlePaddle. That works well for local usage.

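To illustrate why a binary built this way is tied to a single instruction set, here is a minimal sketch (not PaddlePaddle code). It assumes a GCC or Clang toolchain, which predefines the macros `__AVX__` and `__SSE3__` when the matching `-mavx` / `-msse3` options are enabled; the function name `compiled_simd_level` is hypothetical.

```cpp
// Hypothetical helper, for illustration only: reports which SIMD level this
// translation unit was compiled for. The preprocessor picks the branch at
// compile time, so the result cannot adapt to the machine the binary runs on.
const char* compiled_simd_level() {
#if defined(__AVX__)
  return "avx";      // built with -mavx (or a -march that implies it)
#elif defined(__SSE3__)
  return "sse3";     // built with -msse3
#else
  return "generic";  // no SIMD level selected at compile time
#endif
}
```

A binary in which this returns `"avx"` may contain AVX instructions everywhere the compiler chose to emit them, which is why it can fail on a node without AVX.
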
### Problem Involved

Nonetheless, from a deployment perspective, there are some drawbacks:

1. The online runtime environment is very complex. If an older node does not support AVX (or another instruction set the binary was built for), PaddlePaddle will crash and report `illegal instruction is used`. This problem appears frequently in cluster environments such as Kubernetes. **It must be addressed before PaddlePaddle on Cloud.**

2. Once a new version is ready to deliver, we have to release several products to users, for example `no-avx-cpu`, `avx-cpu`, `no-avx-gpu`, and `avx-gpu`. Users should not have to care about these details. It sucks!

### How to Address it?

To address this issue, there are three primary components:

1. [Done] Runtime Check:

We can use CPUID information to check SIMD support at runtime. This functionality has already been merged into the current develop branch. For full details, please check out [CpuId.cpp](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/utils/CpuId.cpp) and [CpuId.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/utils/CpuId.h).

2. [Pending] Adjust `cuda` Directory.

Since the current `cuda` directory mixes heterogeneous source code (CPU and GPU), we want to refactor it. For simplicity, code for different SIMD intrinsics will live in different directories. The CMake files need to be modified to support this layout.

3. [Pending] Modify CMake files.

With the different SIMD intrinsics in separate directories, we need to modify the CMake files so that each directory is compiled with its own options (`-mavx` or `-msse`) to generate the corresponding binaries. Then, at runtime, the SIMD flags `HAS_AVX` and `HAS_SSE` are used to detect and select the supported branch (intrinsics) to execute.
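
The runtime check and branch selection described in the steps above could look roughly like the sketch below. It is not the code in `CpuId.h`: `has_avx`, `has_sse`, `select_add`, and the `add_*` kernels are hypothetical names, the check uses the GCC/Clang builtin `__builtin_cpu_supports` (x86 only), and the three kernels are scalar stand-ins for code that would really be compiled in separate directories with `-mavx` or `-msse`.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the HAS_AVX / HAS_SSE flags. On non-x86 or
// non-GCC/Clang builds they conservatively report "not supported".
inline bool has_avx() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx") != 0;
#else
  return false;
#endif
}

inline bool has_sse() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse") != 0;
#else
  return false;
#endif
}

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Scalar reference loop shared by the stand-in kernels below.
static void add_scalar(const float* a, const float* b, float* out,
                       std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Assumption: in the proposed layout each of these would live in its own
// directory and use real intrinsics; here they only mark the three branches.
void add_generic(const float* a, const float* b, float* out, std::size_t n) {
  add_scalar(a, b, out, n);
}
void add_sse(const float* a, const float* b, float* out, std::size_t n) {
  add_scalar(a, b, out, n);
}
void add_avx(const float* a, const float* b, float* out, std::size_t n) {
  add_scalar(a, b, out, n);
}

// Pick the widest branch the current CPU supports, checked at runtime.
AddFn select_add() {
  if (has_avx()) return add_avx;
  if (has_sse()) return add_sse;
  return add_generic;
}
```

Selecting the function pointer once (for example at startup) keeps the per-call cost to an indirect call, and only the selected kernel ever executes, so AVX code is never reached on a node that lacks AVX.
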

Review comment: Whether this logic really needs to be done this way should be looked at again; for a simpler approach, see #1634 (comment).

### Conclusion

This approach could fix both the release and the deployment problems.

Review comment: So the purpose of this design is to resolve the runtime confusion caused by the environment differing between release and deployment. The code in the `cuda` directory does need to be adjusted, but `different simd intrinsics will be inside the different directories` would add a number of sse/avx directories, which does not feel like a good idea: each directory may end up with only a few files. In addition, I think keeping code with the same functionality together matters more than keeping code for the same instruction set together.

Review comment: Some open-source libraries wrap the SIMD intrinsics in a thin layer [fftw], and all higher-level functionality is built on top of that wrapper. After all, most functionality implemented with intrinsics differs only in the instructions used, and the commonly used instructions are just `load`, `store`, `add`, `mul`, and so on.
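
The wrapper idea in the comment above can be sketched as follows. This is an illustration under assumptions, not FFTW's or PaddlePaddle's actual layer: the `simd` namespace, `kWidth`, and `mul_add` are hypothetical names, and the SSE branch is chosen through the `__SSE__` macro that GCC and Clang predefine when SSE is enabled, with a scalar fallback otherwise.

```cpp
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Thin wrapper: only load / store / add / mul are instruction-set specific.
namespace simd {
#if defined(__SSE__)
using vec = __m128;                  // 4 packed floats
constexpr std::size_t kWidth = 4;
inline vec load(const float* p) { return _mm_loadu_ps(p); }
inline void store(float* p, vec v) { _mm_storeu_ps(p, v); }
inline vec add(vec a, vec b) { return _mm_add_ps(a, b); }
inline vec mul(vec a, vec b) { return _mm_mul_ps(a, b); }
#else
struct vec { float v; };             // scalar fallback, one float per "vector"
constexpr std::size_t kWidth = 1;
inline vec load(const float* p) { return vec{*p}; }
inline void store(float* p, vec x) { *p = x.v; }
inline vec add(vec a, vec b) { return vec{a.v + b.v}; }
inline vec mul(vec a, vec b) { return vec{a.v * b.v}; }
#endif
}  // namespace simd

// Higher-level kernel written once against the wrapper:
// out[i] = a[i] * b[i] + c[i].
void mul_add(const float* a, const float* b, const float* c, float* out,
             std::size_t n) {
  std::size_t i = 0;
  // Full vectors first.
  for (; i + simd::kWidth <= n; i += simd::kWidth) {
    simd::vec r = simd::add(
        simd::mul(simd::load(a + i), simd::load(b + i)), simd::load(c + i));
    simd::store(out + i, r);
  }
  // Scalar tail for the elements that do not fill a whole vector.
  for (; i < n; ++i) out[i] = a[i] * b[i] + c[i];
}
```

With this layout, code that implements the same functionality stays in one file, and supporting another instruction set means adding one more branch to the wrapper rather than another directory of kernels.
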