-
Notifications
You must be signed in to change notification settings - Fork 5.9k
Add StridedCopy method #4205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StridedCopy method #4205
Conversation
A method to copy a tensor with stride and dimension. It is useful for Crop, Concat, etc.
paddle/operators/tensor_copy.h
Outdated
| namespace operators { | ||
|
|
||
| // Copy a tensor from src to dst. | ||
| // The src and dst should be both on dev_ctx.GetPlace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/tensor_copy.h
Outdated
| // Copy a tensor from src to dst. | ||
| // The src and dst should be both on dev_ctx.GetPlace() | ||
| // | ||
| // the stride of an array (also referred to as increment, pitch or step size) is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the => The
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/tensor_copy_test.cc
Outdated
| ASSERT_EQ(4, dst[3]); | ||
| } | ||
|
|
||
| #ifndef PADDLE_ONLY_CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we need to split this file into two -- a .cc file and a .cu file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nop, all lines in this .cc file is invoking pure C++ methods. However, some of them are not defined when ONLY_CPU.
paddle/operators/tensor_copy.h
Outdated
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { | ||
| using namespace detail; | ||
| TensorCopyDimVisitor<T> func(dev_ctx, src, src_stride, dst_stride, dst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why define a functor and a visitor here? Could we define a function template instead? If we do have to use the visitor pattern, it would be great to put the reason in detail/tesnor_copy.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must use visitor pattern here to make DDim to Dim.
The implementation of Visitor is in detail/tensor_copy.h.
Also, @Canpio and I think this method should be given a better name, like StridedMemcpy.
| auto& cpu_place = boost::get<platform::CPUPlace>(place); | ||
| memory::Copy(cpu_place, dst, cpu_place, src, sizeof(T) * dst_dim.head); | ||
| } else { | ||
| #ifndef PADDLE_ONLY_CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这样写会不会带来类似的问题:
#4202
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不会的,那个问题和这个写法没关系的。
Add unittests for Crop and Concat
| inline void StridedMemcpy(const platform::DeviceContext& dev_ctx, const T* src, | ||
| const framework::DDim& src_stride, | ||
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to follow the parameter order convention in memcpy? BecauseStrideMemcpy is a name so similar to c system library function, cuda library function cudaMemcpy.
Or we may add a note to emphasize the difference.
Beside the question, I want to tocao,memcpy violates the google style. : <
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we should follow google C++ style. It follows the Google C++ style for parameter ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dzhwinter memcpy appeared much earlier than Google C++ style. So the fact is that Google C++ style violates memcpy.
| inline void StridedMemcpy(const platform::DeviceContext& dev_ctx, const T* src, | ||
| const framework::DDim& src_stride, | ||
| const framework::DDim& dst_dim, | ||
| const framework::DDim& dst_stride, T* dst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can get dst_stride from dst_dim like this:
DDim stride(const DDim& ddim) {
std::vector<int64_t> strides(ddim.size());
strides[ddim.size()-1] = 1;
for (int i = ddims.size()-2; i>=0; --i) {
strides[i] = strides[i+1] * ddim[i+1]
}
return make_ddim(strides);
}
So dst_stride is not necessary in parameter list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we may provide another method to generate stride. However, src_stride and dst_stride are very useful in concat or other situations.
JiayiFeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The interface of calculating stride automatically can be added later.
dzhwinter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can merge it ASAP since some op porting work dependents on it.
A method to copy a tensor with stride and dimension. It is useful
for Crop, Concat, etc.