refine tensor doc and add module to_global doc #7823
Conversation
hjchen2 commented Mar 17, 2022 • edited
- Refine the Tensor.to_global interface docs, adding a description of grad_sbp
- Refine the Tensor.to_local interface docs
- Add Tensor Attributes docs
- Add docs for the Module.to_consistent and Module.to_global interfaces
Cast a local tensor to global tensor or cast a
global tensor to another global tensor with
different sbp or placement
Convert a local tensor to global tensor or convert a global tensor to another global tensor with
Split this into two paragraphs.
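A minimal sketch of the two cases this sentence covers, assuming a single-rank CPU run; the tensors and descriptor choices are illustrative:

>>> import oneflow as flow
>>> # case 1: cast a local tensor to a global tensor
>>> x = flow.ones(2)
>>> g = x.to_global(placement=flow.placement("cpu", ranks=[0]), sbp=flow.sbp.broadcast)
>>> # case 2: cast a global tensor to another global tensor with a different sbp
>>> g2 = g.to_global(sbp=flow.sbp.split(0))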
sbp (flow.sbp.sbp or tuple of flow.sbp.sbp, optional): the desired sbp descriptor of returned global tensor. Default: if None, the input tensor must be consistent one and use its own sbp.
placement (flow.placement, optional): the desired placement of returned global tensor. Default: if None, the input tensor must be global and use its own placement.
sbp (flow.sbp.sbp or tuple of flow.sbp.sbp, optional): the desired sbp descriptor of returned global tensor. Default: if None, the input tensor must be global and use its own sbp.
grad_sbp (flow.sbp.sbp or tuple of flow.sbp.sbp, optional): manually specify the gradient sbp of the operation in the backward pass. Default: if None, the gradient sbp will be inferred automatically.
The Example section needs at least one example added: local_tensor -> global tensor with S0, showing the tensor shape.
It can be borrowed from the release notes.
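A minimal sketch of such an example, assuming a two-rank launch; the outputs are illustrative and shown for rank 0:

>>> import oneflow as flow
>>> x = flow.tensor([0., 1.], dtype=flow.float32)  # a local tensor on each rank
>>> placement = flow.placement("cpu", ranks=[0, 1])
>>> y = x.to_global(placement=placement, sbp=flow.sbp.split(0))  # S0
>>> y.shape  # the global shape concatenates the per-rank shapes along dim 0
oneflow.Size([4])
>>> y.is_global
True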
Document the constraints of S0;
the constraints of B; Note: with B, every rank's data gets overwritten by the data on rank 0
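A minimal sketch of the B behavior being described, assuming a two-rank launch; the values are illustrative:

>>> import oneflow as flow
>>> x = flow.tensor([flow.env.get_rank()], dtype=flow.float32)  # differs per rank
>>> y = x.to_global(placement=flow.placement("cpu", ranks=[0, 1]), sbp=flow.sbp.broadcast)
>>> y.to_local()  # on every rank the local data is now rank 0's data
tensor([0.], dtype=oneflow.float32)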
Use the NOTE directive of the API docs to describe the key constraints above.
Updated, please review again.
Co-authored-by: Yao Chi <later@usopp.net>
docs/source/tensor.rst
Outdated
@@ -156,7 +156,8 @@ OneFlow Tensor Class
    tan,
    tanh,
    tile,
    to,
    to_consistent,
Is this sorted lexicographically, or arbitrarily? to_consistent is a deprecated interface, so wouldn't it be better placed under to_global? If it is lexicographic, no change is needed.
Moved it below to_local.
Note:
    This method modifies the module in-place.
Doesn't this need to be split into cases? If the module's tensors are all local tensors, then this isn't in-place, is it? local tensor -> global tensor
Yes, local to global cannot be done in-place.
This says the module is modified in place, not the tensor.
The in-place here refers to the module; for the module it is an operation that mutates itself.
>>> m = flow.nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3)
>>> m.to_global(placement=flow.placement("cpu", ranks=[0]), sbp=[flow.sbp.split(0)])
>>> m.weight.is_global
True
Add one more:
>>> m.bias.is_global
True
Overall I already think this is very well written ~ worth promoting as a model.
add_docstr(
    oneflow.nn.Module.to_consistent,
    """
    This interface is no longer available, please use :func:`oneflow.nn.Module.to_global` instead
instead.
There is a missing period.
Added it.
Args:
    placement (flow.placement, optional): the desired placement of returned global tensor. Default: None
    sbp (flow.sbp.sbp or tuple of flow.sbp.sbp, optional): the desired sbp of returned global tensor. Default: None
    grad_sbp (flow.sbp.sbp or tuple of flow.sbp.sbp, optional): manually specify the gradient sbp of this operation in the backward pass. If None, the gradient sbp will be inferred automatically. It is only used if this tensor is a global tensor. Default: None
specify the sbp of this tensor's grad tensor in the backward pass ?
done
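A minimal sketch of how grad_sbp might be exercised, assuming a two-rank launch; the descriptor choices and printed output are illustrative:

>>> import oneflow as flow
>>> placement = flow.placement("cpu", ranks=[0, 1])
>>> x = flow.ones(4, 4, placement=placement, sbp=flow.sbp.broadcast, requires_grad=True)
>>> # forward: cast B -> S(0); backward: force the gradient w.r.t. x back to broadcast
>>> y = x.to_global(placement=placement, sbp=flow.sbp.split(0), grad_sbp=flow.sbp.broadcast)
>>> y.sbp
(oneflow.sbp.split(axis=0),)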
>>> # results on rank 0
oneflow.Size([2])
tensor([0., 1.], placement=oneflow.placement(type="cpu", ranks=[0, 1]), sbp=(oneflow.sbp.split(axis=0),), dtype=oneflow.float32)
Here, for b to s, nothing appears to change except the sbp.
It feels like we should show that the inner local tensor changed, i.e. that the split happened automatically?
That is already shown in the to_local interface docs; wouldn't mentioning to_local here be overstepping a bit?
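A minimal sketch of how the automatic split could be shown, assuming a two-rank launch; the output is for rank 0 and the values are illustrative:

>>> import oneflow as flow
>>> placement = flow.placement("cpu", ranks=[0, 1])
>>> x = flow.tensor([0., 1.], placement=placement, sbp=flow.sbp.broadcast)
>>> y = x.to_global(placement=placement, sbp=flow.sbp.split(0))
>>> y.to_local()  # the broadcast data was split automatically; rank 0 holds the first slice
tensor([0.], dtype=oneflow.float32)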
``oneflow.sbp`` includes three types:

- oneflow.sbp.split(axis)
Found a problem with the split interface here: under torch this is basically always called dim; only under tf is it called axis.
https://pytorch.org/docs/stable/search.html?q=dim&check_keywords=yes&area=default#
If we are following torch's conventions here, we had better rename this to dim.
A ``oneflow.sbp`` is an object representing how the data of the global tensor is distributed across the ranks of the ``Tensor`` placement.

``oneflow.sbp`` includes three types:
The wording here is a bit odd: oneflow.sbp is a Python module, while oneflow.sbp.sbp is a class.
A oneflow.sbp.sbp is a distribution descriptor object representing how the data of the global tensor is distributed across the ranks of the Tensor placement.
There are three types of distribution descriptor instances in module oneflow.sbp:
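A minimal sketch of the three descriptor instances that could accompany this passage; the printed reprs are illustrative:

>>> import oneflow as flow
>>> flow.sbp.split(0)      # split along dim 0 across the ranks
oneflow.sbp.split(axis=0)
>>> flow.sbp.broadcast     # replicated on every rank
oneflow.sbp.broadcast
>>> flow.sbp.partial_sum   # each rank holds a partial sum of the data
oneflow.sbp.partial_sum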
…-Inc/oneflow into dev_refine_to_global_doc
CI failed when running job: cuda-module-distributed-rank-1. PR label automerge has been removed
CI failed when running job: cuda-module-distributed-rank-0. PR label automerge has been removed
Speed stats: