Support setitem in static mode #29708
Conversation
PADDLE_ENFORCE_LT(
    in_dims.size(), 7,
    platform::errors::InvalidArgument(
        "The rank of input should be less than 7, but received %d.",
Why this limit?
Because EigenTensor.slice only supports tensors with rank less than 7. Like the slice op, we also limit the rank to less than 7 here.
        
          
paddle/fluid/operators/setitem_op.cc (Outdated)
namespace ops = paddle::operators;

REGISTER_OPERATOR(
    setitem, ops::SetitemOp, ops::SetitemOpMaker,
Should we name the operator "set_item" to match C++ naming style instead of Python style?
Yes, thanks. The operator is now renamed to "set_value".
void Compute(const framework::ExecutionContext& ctx) const {
  const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();

  switch (rank) {
Can we simply write SetItemCompute<rank>(ctx); instead of the switch-case?
No. The template argument must be a compile-time constant; otherwise an error is reported at compile time:
error: the value of ‘rank’ is not usable in a constant expression
        
          
paddle/fluid/operators/setitem_op.h (Outdated)
auto& eigen_place =
    *ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
We are not allowed to use ShareDataWith on an output Tensor, reference: https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/addon_development/new_op/op_notes.html#sharedatawith
What's the purpose here?
I double checked the code; you are sharing to accelerate here. But I'm afraid this can trigger a memory-optimization error (for example, the pass decides the input Tensor is no longer used and deletes it, and then the output Tensor loses its data). Can we use the same Tensor as in and out to avoid ShareDataWith? Or can we have only one Tensor serve as both input and output for the setitem op? (In that case you would only have ctx.Output<framework::LoDTensor>("Out"), and the input Tensor would be this same one.)
Here, the input and output are the same paddle.Tensor, so out->ShareDataWith(*in) is called.
If the input variable were removed, the rank could not be checked at graph compile time.
self.data[0:, 1:2, :] = self.value


# 2. Test different type of value: int, float, numpy.ndarray, Tensor
It seems create_test_value_numpy only tests int numpy arrays; why does the comment say it tests different types?
Thanks, unittests for different dtypes have been added. (The comment here means "type", not "dtype", but unittests for different dtypes were indeed missing.)
create_test_value_numpy(TestSetitemItemSlice4)


def create_test_value_tensor(parent):
It seems create_test_value_tensor only tests int Tensors; why does the comment above say it tests different types?
Thanks, added unittests for different dtypes.
        
          
# See the License for the specific language governing permissions and
# limitations under the License.

# Test setitem op in static mode
Please also change “setitem” here and in this file's name.
Thanks, done. The file is renamed to test_set_value_op.py.
self.dygraph_func = test_slice_in_for_loop


class TestSetitem(TestSliceWithoutControlFlow):
Please also change this name
Thanks, done
 public:
  void Compute(const framework::ExecutionContext& ctx) const {
    const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();
    switch (rank) {
Amazing, I just learned this from this PR. The code is ugly, but I can LGTM here now, since I found that C++ indeed requires a template integer argument to be a constant, as you said. Still, we had better find an alternative way to write this in the future.
Thanks. I added a TODO comment here.
auto& eigen_place =
    *ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
I saw your reply that in and out are the same paddle.Tensor. Do you mean they are the same at the program level (i.e. tensor = set_value(tensor, xxx))? In my understanding, our graph will then be converted to an SSA graph where the variable names differ (i.e. tensor1 = set_value(tensor0, xxx)), so the graph contains no cycle (as you know, our graph does not support cycles; without SSA there would be a tensor <--> op cycle).
So at the PE and graph level, I guess out and in are still two different variables, and memory optimization deleting in so that out loses its data may still happen in a complex graph.
I would still suggest one of the following:
- Copy the data: we make it correct now and try to speed it up in a future version.
- Delete the input variable: then the graph becomes complex, e.g. two ops point to the output in the graph:
 op1 -> output <- set_value
 In this case you have to find a way to make set_value run in the order you want.
For a simple solution, you can do 1.
Thanks, as discussed offline, I do 1 now.
LGTM
        break;
      case 6:
        SetValueCompute<6>(ctx);
        break;
What if the rank is 7?
A check has been added.
(…dlePaddle#29708) (#30104)
1. Type of index: int, slice (step must be 1).
2. Type of value:
 (1) int32, int64, float32, bool;
 (2) numpy.array (int32, int64, float32, bool) — note: float64 is not supported;
 (3) paddle.Tensor (int32, int64, float32, float64, bool).
PR types
New features
PR changes
Others
Describe
Support assignment to a Variable in static mode. Note: backward is not handled.
Feature: the static graph now supports slice assignment, as an inplace operation.
Corresponding dynamic graph operation: #27471
TODO: