Support setitem in static mode #29708
Conversation
PADDLE_ENFORCE_LT(
    in_dims.size(), 7,
    platform::errors::InvalidArgument(
        "The rank of input should be less than 7, but received %d.",
Why this limit?
Because EigenTensor.slice only supports tensors with rank less than 7. Like the slice op, we also limit the rank to less than 7 here.
        
          
paddle/fluid/operators/setitem_op.cc (Outdated)
namespace ops = paddle::operators;

REGISTER_OPERATOR(
    setitem, ops::SetitemOp, ops::SetitemOpMaker,
Should we name the operator "set_item" to match C++ naming style instead of Python style?
Yes, thanks. The operator is now renamed to "set_value".
void Compute(const framework::ExecutionContext& ctx) const {
  const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();

  switch (rank) {
Can we simply write SetItemCompute<rank>(ctx); instead of the switch-case?
No. The template argument must be a compile-time constant; otherwise an error is reported at compile time:
error: the value of ‘rank’ is not usable in a constant expression
        
          
paddle/fluid/operators/setitem_op.h (Outdated)
auto& eigen_place =
    *ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
We are not allowed to use ShareDataWith on an output Tensor, reference: https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/addon_development/new_op/op_notes.html#sharedatawith
What's the purpose here?
I double checked the code; you are sharing to accelerate here. But I'm afraid this can trigger a memory-optimization error (for example, the pass decides the input Tensor is no longer used and deletes it, and then the output Tensor loses its data). Can we use the same Tensor as in and out to avoid ShareDataWith? Or can we have only one Tensor serve as both input and output for the setitem op? (In that case you would only have ctx.Output<framework::LoDTensor>("Out"), and the input Tensor would be this same one.)
Here, the input and output are the same paddle.Tensor, so out->ShareDataWith(*in) is called.
If the input variable were removed, the rank could not be checked at graph compile time.
self.data[0:, 1:2, :] = self.value


# 2. Test different type of value: int, float, numpy.ndarray, Tensor
It seems create_test_value_numpy only tests int numpy arrays; why does the comment say it tests different types?
Thanks, unittests for different dtypes have been added. (The comment here means "type", not "dtype", but unittests for different dtypes were indeed missing.)
create_test_value_numpy(TestSetitemItemSlice4)


def create_test_value_tensor(parent):
It seems create_test_value_tensor only tests int Tensors; why does the comment above say it tests different types?
Thanks, added unittests for different dtypes.
        
          
# See the License for the specific language governing permissions and
# limitations under the License.

# Test setitem op in static mode
Please also change “setitem” here and in this file's name.
Thanks, done. The file is renamed to test_set_value_op.py.
self.dygraph_func = test_slice_in_for_loop


class TestSetitem(TestSliceWithoutControlFlow):
Please also change this name
Thanks, done
 public:
  void Compute(const framework::ExecutionContext& ctx) const {
    const int rank = ctx.Output<framework::LoDTensor>("Out")->dims().size();
    switch (rank) {
Amazing, I just learned this from this PR. The code is ugly, but I can LGTM here now, since I found that C++ indeed requires a template integer argument to be a constant, as you said. Still, we had better find an alternative way to write this in the future.
Thanks. I added a TODO comment here.
auto& eigen_place =
    *ctx.template device_context<DeviceContext>().eigen_device();

out->ShareDataWith(*in);
I saw your reply that in and out are the same paddle.Tensor. Do you mean they are the same at the program level (i.e. tensor = set_value(tensor, xxx))? In my understanding, our graph will then be converted to an SSA graph where the variable names differ (i.e. tensor1 = set_value(tensor0, xxx)), so the graph contains no cycle (as you know, our graph does not support cycles; without SSA there would be a tensor <--> op cycle).
So at the PE and graph level, I guess out and in are still two different variables, and memory optimization deleting in so that out loses its data may still happen in a complex graph.
I would still suggest one of the following:
- Copy the data: we make it correct now and try to speed it up in a future version.
- Delete the input variable: then the graph becomes complex, e.g. two ops point to the output in the graph:
 op1 -> output <- set_value
 In this case you have to find a way to make set_value run in the order you want.
For a simple solution, you can do 1.
Thanks, as discussed offline, I do 1 now.
LGTM
        break;
      case 6:
        SetValueCompute<6>(ctx);
        break;
What if the rank is 7?
A check has been added.
(…dlePaddle#29708) (#30104)
1. Type of index: int, slice (step must be 1).
2. Type of value:
 (1) int32, int64, float32, bool;
 (2) numpy.array (int32, int64, float32, bool) — note: float64 is not supported;
 (3) paddle.Tensor (int32, int64, float32, float64, bool).
PR types
New features
PR changes
Others
Describe
Support assignment to a Variable in static mode. Note: backward is not handled.
Feature: the static graph now supports slice assignment, as an inplace operation.
Corresponding dynamic graph operation: #27471
TODO: