fea/init tensorrt engine #10003
Conversation
The dependency has been installed in the docker image.
Please describe the design rationale of the engine in more detail in the issue. Without a design document, it is hard to judge whether all of the interfaces are really necessary.
paddle/fluid/inference/engine.h
Outdated
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
Please adjust the copyright format. Also, the year in the copyright header of newly added files should be 2018.
paddle/fluid/inference/engine.h
Outdated
#include "paddle/fluid/framework/framework.pb.h"

namespace paddle {
According to Paddle's rules for namespace usage, there should be one more namespace level here: namespace inference.
Also, a question to confirm: will all of the TensorRT implementation code, including tensorrt_op, be placed under the inference directory?
The directory layout will probably be:
- inference/engine.h, the high-level engine interface (sketched below)
- inference/tensorrt, holding the TensorRT-related code
  - engine_op[.h/.cc], containing TensorrtEngineOp
  - convert[.h/.cc], which helps convert fluid ops -> tensorrt layers
- inference/alajin, holding the alajin code, organized like tensorrt's
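For concreteness, here is a minimal sketch of what inference/engine.h could look like, assembled from the Build/Execute fragments quoted later in this thread. The PbType alias target and the virtual destructor are assumptions, and the extra namespace inference level follows the suggestion above:

// Sketch only, not the final header.
#include "paddle/fluid/framework/framework.pb.h"

namespace paddle {
namespace inference {

class EngineBase {
 public:
  // Assumed alias; the review below asks for a clearer name than PbType.
  using PbType = framework::proto::ProgramDesc;

  // Build the engine from a paddle protobuf model description.
  virtual void Build(const PbType& paddle_model) = 0;

  // Execute the engine, that will run the inference network.
  virtual void Execute(int batch_size) = 0;

  virtual ~EngineBase() {}
};

}  // namespace inference
}  // namespace paddle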
paddle/fluid/inference/engine.h
Outdated
virtual void Build(const PbType& paddle_model) = 0;

// Execute the engine, that will run the inference network.
virtual void Execute(int batch_size) = 0;
Would it be better for the Execute function to take Paddle's LoDTensor type as a parameter?
This engine is just a utility class; it will be called by TensorrtEngineOp. My understanding is that TensorrtEngineOp will wrap the engine as a fluid op.

The engine's inputs and outputs, and their counts, are not fixed; DeclInput and DeclOutput create the tensorrt Input and output nodes respectively.

This class exposes quite a few small interfaces; they help build the tensorrt network and the runtime engine, and these intermediate interfaces are basically indispensable. Its most important use is inside TensorrtEngineOp; additionally it will be used in various UTs, for example to verify there is no diff between a fluid_op and the converted tensorrt layer: this class helps run the tensorrt layer and fetch the result. (A sketch of the intended flow follows below.)
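To make that flow concrete, here is a hedged sketch of how a UT (or TensorrtEngineOp) might drive the engine, stitched together from the test snippets quoted later in this thread. The constructor arguments, the intermediate fc_layer, and the exact name FreezeNetwork are assumptions:

// Sketch only; constructor arguments and fc_layer are hypothetical.
TensorrtEngine engine(/*max_batch=*/1, /*max_workspace=*/1 << 10);

// Declare the TensorRT input node for the fluid variable "x".
auto* x = engine.DeclInput("x", TensorrtEngine::data_type::kFLOAT,
                           nvinfer1::DimsCHW{1, 1, 1});

// ... build layers here (e.g. via the add-layer macro discussed below),
// producing some nvinfer1::ILayer* fc_layer ...

// Declare the output node, then freeze the network into a runtime engine.
engine.DeclOutput(fc_layer, 0, "y");
engine.FreezeNetwork();  // assumed name; an error message below says
                         // "call freezenetwork first"

// Feed input, run, and fetch the result; a no-diff UT would compare this
// against the corresponding fluid op's output.
float x_v = 1234;
engine.SetInputFromCPU("x", (void*)&x_v, 1 * sizeof(float));
engine.Execute(1);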
> DeclInput and DeclOutput create the tensorrt Input and output nodes respectively

Apart from the conversion step, these two are functionally very similar to ConvertInput and ConvertOutput in the Convert class. Is there a better design? Also, the four letters "Decl" are not clear in meaning; could they simply be called Add?
Will change it to DeclareInput, meaning it adds a data node to the TensorRT network.
This engine is a utility class; a high-level design will be added as soon as possible, but this class itself should not need interface design work, since it will only be used inside TensorrtEngineOp and the UTs.
paddle/fluid/inference/engine.h
Outdated
/*
 * EngineBase is the base class of all inference engines. An inference engine
 * takes a paddle program as input, and output the result in paddle Tensor
- output->outputs
- paddle Tensor format->fluid tensor format
paddle/fluid/inference/engine.h
Outdated
* EngineBase is the base class of all inference engines. An inference engine
* takes a paddle program as input, and output the result in paddle Tensor
* format. It can be used to optimize performance of computation subgraphs, for
* example, break down the original model into subgraphs and execute each
Would using the block concept be more consistent? Same below.
- original model -> original block?
- subgraphs -> sub-block?
paddle/fluid/inference/engine.h
Outdated
* When inference, the resnet50 model can put most of the model into subgraph
* and run it on a TensorRT engine.
*
* There are several engines such as TensorRT and other internal frameworks, so
In "other internal frameworks", "internal" can be dropped.
paddle/fluid/inference/engine.h
Outdated
*/
class EngineBase {
 public:
  // TODO fix it latter
What is the TODO on line 39 supposed to fix? Could you describe it in more detail? And what does PbType refer to?
Here PbType denotes the format of the desc; will change it to a clearer name.
// them, and an macro like this is more extensible when underlying TensorRT
// library add new layer supports.
#define TRT_ENGINE_ADD_LAYER(engine__, layer__, ARGS...) \
  engine__->network()->add##layer__(ARGS);
Can this macro definition be removed?
- Using the original functions directly is also very clear.
- Since the convert class also needs to add different layers, the convert class would have to include the engine class's header file; isn't that a bit unreasonable?
This macro:
- provides a unified interface for adding layers, without needing one dedicated function per layer type, such as addFullyConnected and so on (see the usage sketch below).
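For illustration, a hypothetical usage of the quoted macro; engine, x, n_out, weight, and bias are stand-ins, and addFullyConnected/addActivation are the underlying nvinfer1::INetworkDefinition methods:

// Expands to: engine->network()->addFullyConnected(*x, n_out, weight.get(), bias.get());
// (note the quoted macro definition already appends the trailing ';')
auto* fc = TRT_ENGINE_ADD_LAYER(engine, FullyConnected, *x, n_out,
                                weight.get(), bias.get())

// Other layers reuse the same single entry point instead of needing a
// second helper, e.g.
//   TRT_ENGINE_ADD_LAYER(engine, Activation, *fc->getOutput(0),
//                        nvinfer1::ActivationType::kRELU)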
PADDLE_ENFORCE(output != nullptr);
output->setName(name.c_str());
infer_network_->markOutput(*output);
buffer_sizes_[name] = 0;
Why is line 156 set to 0?
Will add a comment explaining it.
}

void*& TensorrtEngine::buffer(const std::string& name) {
  PADDLE_ENFORCE(infer_engine_ != nullptr, "call freezenetwork first.");
Add a space in the middle of "freezenetwork" so it reads "freeze network".
FreezeNetwork
}

void TensorrtEngine::DeclOutput(nvinfer1::ILayer* layer, int offset,
                                const std::string& name) {
The first two parameters should also be const.
Will make the layer parameter const.
}

void TensorrtEngine::SetInputFromCPU(const std::string& name, void* data,
                                     size_t size) {
The size_t size parameter should also be const.
TensorrtEngine::Weight bias(TensorrtEngine::data_type::kFLOAT, raw_bias,
                            size);
auto* x = engine_->DeclInput("x", TensorrtEngine::data_type::kFLOAT,
                             nvinfer1::DimsCHW{1, 1, 1});
- There is no need to wrap data_type; using nvinfer's original types directly is clearer. Note that nvinfer1::DimsCHW here already uses nvinfer's format.
- If it is wrapped, does it have to live under the TensorrtEngine class? Then whenever the convert class uses these types, does it have to go through TensorrtEngine?
- The same applies to the Weight class.
The engine's .cc will need to call convert anyway, so the two calling each other should not be a problem.
float x_v = 1234;
engine_->SetInputFromCPU("x", (void*)&x_v, 1 * sizeof(float));
LOG(INFO) << "to execute";
engine_->Execute(1);
> Use the model to create an engine and an execution context

Is this part not going to be encapsulated as well?
Paddle/paddle/fluid/inference/tensorrt/test_tensorrt.cc
Lines 138 to 149 in 1866597
Logger logger;
nvinfer1::IRuntime* runtime = createInferRuntime(logger);
nvinfer1::ICudaEngine* engine =
    runtime->deserializeCudaEngine(model->data(), model->size(), nullptr);
model->destroy();
nvinfer1::IExecutionContext* context = engine->createExecutionContext();
// Execute the network.
float input = 1234;
float output;
Execute(*context, &input, &output);
EXPECT_EQ(output, input * 2 + 3);
Serialization is no longer needed here; see the sketch below.
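For comparison, a hedged sketch of the equivalent flow through the engine wrapper, based on the calls quoted earlier in this thread; FreezeNetwork is an assumed name and is presumed to build the runtime engine and execution context internally:

// No explicit serialize/deserialize step: freeze, feed, run.
engine_->FreezeNetwork();

float input = 1234;
engine_->SetInputFromCPU("x", (void*)&input, 1 * sizeof(float));
engine_->Execute(1);

// The result lives in the engine-managed buffer for the declared output
// (buffer() returns void*& per the snippet above); note it may be a device
// pointer, so a device-to-host copy may be needed before EXPECT_EQ on CPU.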
* There are two alternative ways to use it, one is to build from a paddle
* protobuf model, another way is to manully construct the network.
*/
class TensorrtEngine : public EngineBase {
TensorrtEngine -> TensorRTEngine: capitalize "RT" to be consistent with TensorRT.
ok
LGTM
@@ -1 +1,4 @@
nv_test(test_tensorrt SRCS test_tensorrt.cc DEPS dynload_cuda device_context dynamic_loader)
if(WITH_TESTING)
The if(WITH_TESTING) guard is unnecessary here, because nv_test already performs that check internally. It can be fixed in a later PR.
OK
  4  // kINT32
};

// The following two API are implemented in TensorRT's header file, cannot load
API -> APIs
OK
fixes: #10004