Conversation

@liym27
Contributor

@liym27 liym27 commented Dec 16, 2020

PR types

New features

PR changes

Others

Describe

Support assignment to a Variable in static mode. Note: backward is not handled in this PR.


Feature: the static graph now supports slice assignment as an in-place operation.

  • Supported keys: integer, Python slice
  • Supported values: int, float, numpy.ndarray, Paddle Tensor
  • Supported dtypes: int32, int64, float32, float64, bool
  • Other: the value is expanded (broadcast) to the target shape by default
import numpy as np
import paddle

paddle.enable_static()
tensor_x = paddle.to_tensor(np.ones((2, 3)).astype(np.float32))  # [[1, 1, 1], [1, 1, 1]]
tensor_x[0] = 0    # tensor_x : [[0, 0, 0], [1, 1, 1]]
tensor_x[0:1] = 0  # tensor_x : [[0, 0, 0], [1, 1, 1]]

tensor_x[0] = np.array([3, 3, 3])  # tensor_x : [[3, 3, 3], [1, 1, 1]]
tensor_x[1] = paddle.ones([3])     # tensor_x : [[3, 3, 3], [1, 1, 1]]

Corresponding dynamic-graph (dygraph) operation: #27471


TODO:

  • Add documentation

PADDLE_ENFORCE_LT(
    in_dims.size(), 7,
    platform::errors::InvalidArgument(
        "The rank of input should be less than 7, but received %d.",
Member

Why is there this limit?

Contributor Author

Because EigenTensor.slice only supports tensors with rank less than 7. Like the slice op, we limit the rank to less than 7 here as well.

namespace ops = paddle::operators;

REGISTER_OPERATOR(
setitem, ops::SetitemOp, ops::SetitemOpMaker,
Member

Should we name the operator "set_item" to match C++ style instead of Python style?

Contributor Author

Yes, thanks. Instead of "set_item", the operator is now named "set_value".

void Compute(const framework::ExecutionContext& ctx) const {
const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();

switch (rank) {
Member

Can we simply write SetItemCompute<rank>(ctx); instead of the switch-case?

Contributor Author

No. The template argument must be a compile-time constant; otherwise an error is reported at compile time:

error: the value of ‘rank’ is not usable in a constant expression
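
To make the constraint concrete, here is a minimal standalone C++ illustration (hypothetical names, not the Paddle kernel itself) of why the runtime rank cannot be passed directly as a template argument, and how a switch maps it onto compile-time instantiations:

#include <cstdio>

// Instantiated separately for each rank D, which must be known at compile time.
template <int D>
void Compute() {
  std::printf("compiled for rank %d\n", D);
}

int main() {
  int rank = 3;        // only known at runtime
  // Compute<rank>();  // error: the value of 'rank' is not usable in a constant expression
  switch (rank) {      // runtime dispatch to compile-time instantiations
    case 1: Compute<1>(); break;
    case 2: Compute<2>(); break;
    case 3: Compute<3>(); break;
    default: break;
  }
  return 0;
}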

auto& eigen_place =
*ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
Member

We are not allowed to use ShareDataWith on an output Tensor; reference: https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/addon_development/new_op/op_notes.html#sharedatawith

What's the purpose here?

Member

I double-checked the code; you are sharing data to accelerate here. But I'm afraid this can trigger a memory-optimization error (for example, a pass decides the input Tensor is no longer needed and deletes it, so the output Tensor loses its data). Can we use the same Tensor as both in and out to avoid ShareDataWith? Or can we have only one Tensor serve as both Input and Output for the setitem op? (In that case you only have ctx.Output<framework::LoDTensor>("Out"), but your input Tensor is also this one.)

Contributor Author

Here, the input and output are the same Paddle Tensor, so out->ShareDataWith(*in) is called.

If the Input variable were removed, the rank check could not be done at compile time.

self.data[0:, 1:2, :] = self.value


# 2. Test different type of value: int, float, numpy.ndarray, Tensor
Member

It seems create_test_value_numpy only tests an int numpy array; why does the comment say different types are tested?

Contributor Author

Thanks, unit tests for different dtypes have been added. (The comment here refers to the type of the value rather than the dtype, but you are right that unit tests for different dtypes were missing.)

create_test_value_numpy(TestSetitemItemSlice4)


def create_test_value_tensor(parent):
Member

It seems create_test_value_tensor only tests an int Tensor; why does the comment above say different types are tested?

Contributor Author

Thanks, added unittests for different dtypes.

# See the License for the specific language governing permissions and
# limitations under the License.

# Test setitem op in static mode
Member

Please also change "setitem" here and rename this file.

Contributor Author

Thanks, done. The file has been renamed to test_set_value_op.py.

self.dygraph_func = test_slice_in_for_loop


class TestSetitem(TestSliceWithoutControlFlow):
Member

Please also change this name

Contributor Author

Thanks, done

public:
void Compute(const framework::ExecutionContext& ctx) const {
const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();
switch (rank) {
Member

Amazing, I just learned this from this PR. This code is ugly, but I can LGTM it now since I confirmed that C++ really does require the template integer argument to be a constant, as you said. We had better find an alternative way to write this in the future.

Contributor Author
@liym27 liym27 Dec 22, 2020

Thanks. I added a TODO comment here.

auto& eigen_place =
*ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
Member

I saw your reply that in and out are the same Paddle Tensor. Do you mean they are the same at the program level (i.e. tensor = set_value(tensor, xxx))? In my understanding, our graph is then converted into an SSA graph where the variable names are not the same (i.e. tensor1 = set_value(tensor0, xxx)), so the graph does not contain a cycle (as you know, our graph does not support cycles; without SSA there would be a tensor <--> op cycle).

So at the PE and graph level, I guess out and in are still two different variables. Therefore, in a complex graph it may still happen that memory optimization deletes in and out loses its data.

So I would still suggest one of the following:

  1. Copy the data. We make it correct now and try to speed it up in a future version.
  2. Delete the Input variable. The graph then becomes complex, e.g. two ops will point to the output in the graph:
    op1 -> output <- set_value
    In this case, you have to find a way to make set_value run in the order you want.

For a simple solution, you can do 1.

Contributor Author

Thanks. As discussed offline, I now do option 1.
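
For reference, a minimal sketch of option 1 inside the kernel (the input/output names and the use of framework::TensorCopy are assumptions here, not necessarily the exact code merged in this PR): the input buffer is deep-copied into the output instead of shared, so memory-optimization passes cannot invalidate Out.

// Sketch: copy the input into the output instead of calling ShareDataWith.
auto* in = ctx.Input<framework::LoDTensor>("Input");   // assumed input name
auto* out = ctx.Output<framework::LoDTensor>("Out");
out->mutable_data<T>(ctx.GetPlace());
// Deep copy of the data; out no longer aliases in's buffer.
framework::TensorCopy(*in, ctx.GetPlace(), out);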

Member
@zhhsplendid zhhsplendid left a comment

LGTM

@liym27 liym27 merged commit 97e75ad into PaddlePaddle:develop Dec 23, 2020
break;
case 6:
SetValueCompute<6>(ctx);
break;
Contributor

What if the rank is 7?

Contributor Author

A check has been added.
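
A sketch of such a check, added as the default branch of the switch (the exact message in the merged code may differ):

default:
  PADDLE_THROW(platform::errors::InvalidArgument(
      "The rank of input should be less than 7, but received %d.", rank));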

liym27 added a commit to liym27/Paddle that referenced this pull request Jan 6, 2021
…dlePaddle#29708)

1. Type of index: int, slice(step must be 1).

2. Type of value:
 (1) int32, int64, float32, bool;
 (2) numpy.array(int32, int64, float32, bool);<Note: float64 is not supported>
 (3) paddle.Tensor(int32, int64, float32, float64, bool);
lanxianghit pushed a commit that referenced this pull request Jan 8, 2021
) (#30104)

1. Type of index: int, slice(step must be 1).

2. Type of value:
 (1) int32, int64, float32, bool;
 (2) numpy.array(int32, int64, float32, bool);<Note: float64 is not supported>
 (3) paddle.Tensor(int32, int64, float32, float64, bool);