-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add TensorRT Reformat-Free I/O Support. #942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TensorRT Reformat-Free I/O Support. #942
Conversation
@@ -166,6 +197,15 @@ CompareDimsSupported( | |||
continue; | |||
} | |||
|
|||
// Pad channel dimension if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we query the strides from the engine to get the total buffer size to allocate for the tensor? Does this formulation always return the exact byte size of the reformatted tensor? Also, what about the cases with dynamic-shapes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think according to TRT engine APIs, the engine always return on padded dimensions. User needs to pad the C dimension when computing buffer size. Also, I don't think there is a TRT API for getting strides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to the documentation you are referring to? The GetStrides() function is available per execution context as it might change with the dimensions set. As per my initial investigation and talks with some TensorRT engineers, we are supposed to use getBindingBytesPerComponents(), getBindingComponentsPerElement() and getStrides() APIs to obtain the correct byte size. If the engine returns padded dimension then it's much easier to attain the byte size for the tensor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanmayv25 My bad. You are right. Let me fix this and use the more modern APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be your opinion on the dimensions in the TRTIS model description file? Should it be the unpadded or padded dimensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using dynamic shapes we would need unpadded dimensions to call setBindingDimensions() with. So, it would make sense to hold on to unpadded dimensions in model config description file. If we can internally determine the maximum buffer size within plan_backend to allocate for the inputs then it would be great. Otherwise a new field for padded dimensions can be added to the model config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But when I tried using unpadded dimensions in model config, I ran into some assertion errors in provider.cc that I didn't dare to change.
@GuanLuo I would propose that we close this PR for now, but keep in mind that this is a must-have feature if we end up using TRTIS. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that we should close the PR for now as it requires some thinking on how to expand the model config so that it fits into different cases. And depending on the progress, I think we can at least have it for fixed input shapes as the required information seems to be determined once the model is loaded.
Can you split it into two separate PRs (build fix, and reformat-free I/O) so that they can be reviewed and merged independently? |
e6964fc
to
aa1f1e5
Compare
Split out the build error part to #943 . Keep this one only for TRT Reformat-free I/O support. Updated title. |
Closing this PR. Will file an internal tracker instead. |
Changes:
Fix build errors when gRPC and HTTP are disabled.