Add TensorRT Reformat-Free I/O Support. #942


Conversation

Contributor

@nvpohanh nvpohanh commented Dec 9, 2019

Changes:

  • Fix build errors when gRPC and HTTP are disabled.
  • Add TensorRT Reformat-Free I/O Support.
    • Users need to specify padded dimensions in the model description file, since TRTIS assumes that byte_size equals the product of the dimensions in the model description.
    • Change the validation logic so that it does not throw an error, since the TensorRT engine reports unpadded dimensions.

@@ -166,6 +197,15 @@ CompareDimsSupported(
continue;
}

// Pad channel dimension if necessary.
Contributor

@tanmayv25 tanmayv25 Dec 9, 2019

Shouldn't we query the strides from the engine to get the total buffer size to allocate for the tensor? Does this formulation always return the exact byte size of the reformatted tensor? Also, what about cases with dynamic shapes?

Contributor Author

I think according to the TRT engine APIs, the engine always returns unpadded dimensions. The user needs to pad the C dimension when computing the buffer size. Also, I don't think there is a TRT API for getting strides.

Contributor

Can you point me to the documentation you are referring to? The getStrides() function is available per execution context, as the strides might change with the dimensions that are set. As per my initial investigation and talks with some TensorRT engineers, we are supposed to use the getBindingBytesPerComponent(), getBindingComponentsPerElement() and getStrides() APIs to obtain the correct byte size. If the engine returned padded dimensions, it would be much easier to obtain the byte size for the tensor.

Contributor Author

@tanmayv25 My bad. You are right. Let me fix this and use the more modern APIs.

Contributor Author

What would be your opinion on the dimensions in the TRTIS model description file? Should it be the unpadded or padded dimensions?

Contributor

@tanmayv25 tanmayv25 Dec 10, 2019

When using dynamic shapes, we would need unpadded dimensions to call setBindingDimensions() with, so it makes sense to keep unpadded dimensions in the model config file. If we can internally determine the maximum buffer size to allocate for the inputs within plan_backend, that would be great. Otherwise, a new field for padded dimensions can be added to the model config.
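
The second option could look roughly like this in a model config. This is a purely hypothetical sketch: `padded_dims` is not a real TRTIS model config field; it only illustrates keeping unpadded dims (for setBindingDimensions()) alongside an explicit padded shape for buffer allocation.

```protobuf
# Hypothetical sketch only: "padded_dims" is NOT an existing config field.
input [
  {
    name: "input_0"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]          # unpadded, passed to setBindingDimensions()
    # padded_dims: [ 32, -1, -1 ]  # padded channel dim for buffer sizing
  }
]
```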

Contributor Author

But when I tried using unpadded dimensions in the model config, I ran into some assertion errors in provider.cc that I didn't dare to change.

@GuanLuo I would propose that we close this PR for now, but keep in mind that this is a must-have feature if we end up using TRTIS. What do you think?

Contributor

Yeah, I agree that we should close the PR for now, as it requires some thinking on how to extend the model config so that it fits the different cases. Depending on the progress, I think we can at least support this for fixed input shapes, since the required information seems to be determined once the model is loaded.

Contributor

GuanLuo commented Dec 9, 2019

Can you split it into two separate PRs (build fix, and reformat-free I/O) so that they can be reviewed and merged independently?

@nvpohanh nvpohanh force-pushed the dev-pohanh-trt-format-fix branch from e6964fc to aa1f1e5 on December 9, 2019 at 23:20
@nvpohanh nvpohanh changed the title from "Add TensorRT Reformat-Free I/O Support." to "Fix build errors when gRPC and HTTP are disabled. Add TensorRT Reformat-Free I/O Support." Dec 9, 2019
Contributor Author

nvpohanh commented Dec 9, 2019

Split out the build error part to #943 . Keep this one only for TRT Reformat-free I/O support. Updated title.

@nvpohanh
Contributor Author

Closing this PR. Will file an internal tracker instead.

@nvpohanh nvpohanh closed this Dec 10, 2019