Could you explain how a thread pool speeds up inference? Does this approach work for all models? If an existing transformer model takes 400 ms for single-frame inference, would a thread pool help accelerate it? Thanks!
The NPU supports concurrent/parallel inference. Using a thread pool to run inferences concurrently raises NPU resource utilization (in most cases the NPU sits at a fairly low load), which produces the "speedup" effect. In theory this applies to all models; you can estimate the remaining headroom from the NPU's current utilization. The gains are bounded by CPU performance, so once you hit diminishing returns, adding more concurrency yields little further improvement.
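The pattern above can be sketched as follows. This is a minimal illustration using Python's `ThreadPoolExecutor`; the `create_context` and `infer` functions are placeholders standing in for whatever NPU runtime API you actually use (each worker borrows its own context so concurrent calls never share runtime state):

```python
# Hedged sketch: thread-pool concurrent inference with one runtime
# context per worker. `create_context`/`infer` are hypothetical stand-ins
# for the real NPU runtime calls.
import queue
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 3  # tune against observed NPU utilization; beyond some
                 # point the CPU becomes the bottleneck (diminishing returns)

def create_context(worker_id):
    # Placeholder: in practice, load the model into one NPU context here.
    return {"id": worker_id}

def infer(ctx, frame):
    # Placeholder for a single-frame inference call (e.g. ~400 ms each).
    return frame * 2

# Pool of contexts, one per worker thread.
contexts = queue.Queue()
for i in range(NUM_WORKERS):
    contexts.put(create_context(i))

def run_one(frame):
    ctx = contexts.get()       # borrow an idle context
    try:
        return infer(ctx, frame)
    finally:
        contexts.put(ctx)      # return it for the next frame

with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
    results = list(pool.map(run_one, range(6)))
print(results)  # map preserves input order even though frames run concurrently
```

Note that this improves throughput (frames per second across the pipeline), not the latency of any single frame: a 400 ms-per-frame model still takes about 400 ms per frame, but several frames can be in flight at once.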