
Move pybind/python api to cpython [part2] #8315

Merged (48 commits) on Jun 1, 2022

Conversation

@marigoold (Contributor) commented May 26, 2022

This PR completes the following:

The approach for migrating this part of the API is as follows:

  • Implemented a utility function concat_self that concatenates the Tensor self with args, so that the generated function interfaces can be called
  • For functions with complex argument parsing, a macro DIRECT_PASS_FUNC is used to create the function, which uniformly calls the interface in functional_api.yaml.pybind.h
  • For functions with simple argument parsing, the parsing is completed at the Python C API layer, after which the functional interface is called
  • For dtype-conversion functions such as Tensor.int/float, the macro DATATYPE_FUNC is used to create them (there are only four); these do not call the interface in functional_api.yaml.pybind.h
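
The concat_self helper operates on C-level PyObject tuples; its effect can be sketched at the Python level as follows (this is a sketch of the behavior described above, not the actual C++ implementation):

```python
def concat_self(self, args):
    """Prepend `self` to the positional args, mirroring the PR's
    concat_self utility (Python-level sketch, not the C++ code)."""
    return (self,) + tuple(args)

# The generated functional interface can then be called with one flat tuple:
tensor = "tensor"  # stands in for a Tensor object
print(concat_self(tensor, (1, 2)))  # ('tensor', 1, 2)
```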

Other changes:

  • For some functions whose parameter names in functional_api.yaml did not match those of the corresponding PyTorch functions, the parameter names were changed to match PyTorch
  • The cast function previously had no pin_memory parameter; one has now been added

The APIs migrated in this part are as follows:

  • Tensor.ndim = property(_ndim)
  • Tensor.numpy = _numpy
  • Tensor.size = _size
  • Tensor.backward = _backward
  • Tensor.setitem = _setitem
  • Tensor.str = _str
  • Tensor.repr = _repr
  • Tensor.bool = is_nonzero
  • Tensor.iadd = _iadd
  • Tensor.addmm = _addmm
  • Tensor.format = _format
  • Tensor.index = _index
  • Tensor.float = _scalar_float
  • Tensor.int = _scalar_int
  • Tensor.array = _numpy
  • Tensor.uniform_ = _uniform
  • Tensor.trunc_normal_ = trunc_normal
  • Tensor.kaiming_uniform_ = _kaiming_uniform
  • Tensor.kaiming_normal_ = _kaiming_normal
  • Tensor.xavier_normal_ = _xavier_normal
  • Tensor.xavier_uniform_ = _xavier_uniform
  • Tensor.orthogonal_ = _orthogonal
  • Tensor.normal_ = _normal
  • Tensor.fill_ = _fill
  • Tensor.copy_ = _copy
  • Tensor._meta_repr = _meta_repr
  • Tensor.floor_divide = _floor_divide
  • Tensor.argmax = _argmax
  • Tensor.argmin = _argmin
  • Tensor.argsort = _argsort
  • Tensor.argwhere = _argwhere
  • Tensor.amin = _amin
  • Tensor.atan2 = _atan2
  • Tensor.gt = _gt
  • Tensor.ge = _ge
  • Tensor.cast = _cast
  • Tensor.diag = _diag
  • Tensor.diagonal = _diagonal
  • Tensor.add = _add
  • Tensor.add_ = _add_inplace
  • Tensor.addcmul = _addcmul
  • Tensor.addcmul_ = addcmul
  • Tensor.div = _truediv
  • Tensor.div_ = _truediv_inplace
  • Tensor.mul = _mul
  • Tensor.mul_ = mul
  • Tensor.sub = _sub
  • Tensor.sub_ = _sub_inplace
  • Tensor.clamp = _clamp
  • Tensor.clamp_ = clamp
  • Tensor.clip = _clip
  • Tensor.clip_ = clip
  • Tensor.cpu = _cpu
  • Tensor.cuda = _cuda
  • Tensor.expand = _expand
  • Tensor.expand_as = _expand_as
  • Tensor.fmod = _fmod
  • Tensor.flatten = _flatten
  • Tensor.flip = _flip
  • Tensor.in_top_k = _in_top_k
  • Tensor.index_select = _index_select
  • Tensor.minimum = _minimum
  • Tensor.maximum = _maximum
  • Tensor.new_empty = _new_empty
  • Tensor.new_ones = _new_ones
  • Tensor.new_zeros = _new_zeros
  • Tensor.pow = _pow
  • Tensor.var = _var
  • Tensor.std = _std
  • Tensor.matmul = _matmul
  • Tensor.softplus = _softplus
  • Tensor.tril = _tril
  • Tensor.triu = _triu
  • Tensor.where = _where
  • Tensor.norm = _norm
  • Tensor.transpose = _transpose
  • Tensor.permute = _permute
  • Tensor.local_to_global = _local_to_global
  • Tensor.global_to_global = _global_to_global
  • Tensor.to_global = _to_global
  • Tensor.relu = _relu
  • Tensor.relu_ = _relu_inplace
  • Tensor.softmax = _softmax
  • Tensor.log_softmax = _log_softmax
  • Tensor.logical_and = _and
  • Tensor.logical_or = _or
  • Tensor.logical_not = _not
  • Tensor.logical_xor = _xor
  • Tensor.roll = _roll
  • Tensor.bmm = _bmm
  • Tensor.chunk = _chunk
  • Tensor.repeat = _repeat
  • Tensor.tile = _tile
  • Tensor.split = _split
  • Tensor.unbind = _unbind
  • Tensor.squeeze = _squeeze
  • Tensor.swapaxes = _swapaxes
  • Tensor.amax = _amax
  • Tensor.swapdims = _swapdims
  • Tensor.unfold = _unfold
  • Tensor.narrow = _narrow
  • Tensor.unsqueeze = _unsqueeze
  • Tensor.to = _to
  • Tensor.half = _half
  • Tensor.gather = _gather
  • Tensor.all = _all
  • Tensor.any = _any
  • Tensor.T = property(_T)
  • Tensor.masked_fill = _masked_fill
  • Tensor.masked_select = _masked_select
  • Tensor.eq = _eq
  • Tensor.ne = _ne
  • Tensor.item = _item
  • Tensor.lt = _lt
  • Tensor.le = _le
  • Tensor.to_local = _to_local
  • Tensor.reshape = _reshape
  • Tensor.reshape_as = _reshape_as
  • Tensor.view = _view
  • Tensor.view_as = _view_as
  • Tensor.sort = _sort
  • Tensor.type_as = _type_as
  • Tensor.tolist = _tolist
  • Tensor.int = _int
  • Tensor.long = _long
  • Tensor.float = _float
  • Tensor.double = _double
  • Tensor.is_floating_point = _is_floating_point
  • Tensor.topk = _topk
  • Tensor.nms = _nms
  • Tensor.nonzero = _nonzero
  • Tensor.max = _max
  • Tensor.min = _min
  • Tensor.median = _median
  • Tensor.sum = _sum
  • Tensor.mean = _mean
  • Tensor.prod = _prod
  • Tensor.is_consistent = _is_consistent
  • Tensor.to_consistent = _to_consistent
  • Tensor.new_tensor = _new_tensor
  • Tensor.cumsum = _cumsum
  • Tensor.cumprod = _cumprod
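
The four dtype-conversion entries near the end of this list (Tensor.int/long/float/double) are generated by the DATATYPE_FUNC macro. Its effect can be sketched in Python as a small method factory (the mock Tensor class and names here are illustrative assumptions, not the actual implementation):

```python
class MockTensor:
    """Stand-in for oneflow.Tensor, for illustration only."""
    def __init__(self, dtype):
        self.dtype = dtype

    def to(self, dtype):
        return MockTensor(dtype)

def make_dtype_method(dtype_name):
    # Each generated method simply converts the tensor to one fixed dtype,
    # which is why a single macro covers all four cases.
    def method(self):
        return self.to(dtype_name)
    return method

for name in ("int", "long", "float", "double"):
    setattr(MockTensor, name, make_dtype_method(name))

t = MockTensor("float")
print(t.long().dtype)  # long
```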

@marigoold marigoold requested a review from jackalcooper as a code owner May 31, 2022 06:08
@marigoold marigoold changed the title [WIP] Move pybind/python api to cpython [part2] Move pybind/python api to cpython [part2] May 31, 2022
@marigoold marigoold requested a review from oneflow-ci-bot May 31, 2022 08:34
@github-actions (Contributor):

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@marigoold marigoold requested review from oneflow-ci-bot and removed request for oneflow-ci-bot May 31, 2022 08:36
@github-actions (Contributor):

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8315/

@github-actions (Contributor):

Speed stats:
GPU Name: NVIDIA GeForce GTX 1080 

❌ OneFlow resnet50 time: 130.0ms (= 13004.2ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.7ms (= 14371.9ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.11 (= 143.7ms / 130.0ms)

OneFlow resnet50 time: 80.4ms (= 8042.2ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.9ms (= 8489.5ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 84.9ms / 80.4ms)

OneFlow resnet50 time: 53.7ms (= 10749.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.8ms (= 11553.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.07 (= 57.8ms / 53.7ms)

OneFlow resnet50 time: 42.9ms (= 8575.6ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.9ms (= 9186.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.07 (= 45.9ms / 42.9ms)

OneFlow resnet50 time: 36.3ms (= 7255.3ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 40.7ms (= 8131.1ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.12 (= 40.7ms / 36.3ms)

OneFlow swin dataloader time: 0.249s (= 49.819s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.288s / 200, num_workers=1)
Relative speed: 0.608 (= 0.151s / 0.249s)

OneFlow swin dataloader time: 0.065s (= 13.022s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.367s / 200, num_workers=4)
Relative speed: 0.643 (= 0.042s / 0.065s)

OneFlow swin dataloader time: 0.036s (= 7.198s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.575s / 200, num_workers=8)
Relative speed: 0.636 (= 0.023s / 0.036s)

❌ OneFlow resnet50 time: 146.7ms (= 14675.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 172.3ms (= 17226.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 172.3ms / 146.7ms)

OneFlow resnet50 time: 96.3ms (= 9634.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 112.7ms (= 11266.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 112.7ms / 96.3ms)

OneFlow resnet50 time: 72.7ms (= 14543.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 89.8ms (= 17950.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 89.8ms / 72.7ms)

OneFlow resnet50 time: 57.2ms (= 11446.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.1ms (= 14821.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 74.1ms / 57.2ms)

OneFlow resnet50 time: 54.9ms (= 10988.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.5ms (= 13905.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 69.5ms / 54.9ms)

@hjchen2 hjchen2 enabled auto-merge (squash) June 1, 2022 01:45
auto shape = PyTensor_Unpack(self)->shape();
if (idx_obj == NULL || idx_obj == Py_None) return TensorSize_NewFromShape(*shape);
int64_t idx = PyLong_AsLongLong(idx_obj);
CHECK_OR_THROW(idx >= -shape->NumAxes() && idx < shape->NumAxes())
Contributor:

shape->NumAxes() is called several times here; consider extracting it into a local variable to avoid the repeated calls.

Contributor (Author):

Fixed.
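
The suggested fix (hoisting shape->NumAxes() into a local variable) can be sketched at the Python level like this, with a tuple standing in for the C++ Shape object (names are illustrative, not the actual binding code):

```python
def tensor_size(shape, idx=None):
    # Python-level sketch of the _size binding logic quoted above;
    # `shape` is a tuple standing in for the C++ Shape object.
    if idx is None:
        return shape
    ndim = len(shape)  # hoisted once, per the review suggestion
    if not (-ndim <= idx < ndim):
        raise IndexError(f"dimension {idx} out of range for {ndim}-d shape")
    return shape[idx]

print(tensor_size((2, 3, 4)))      # (2, 3, 4)
print(tensor_size((2, 3, 4), -1))  # 4
```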

@hjchen2 hjchen2 merged commit 6e8f93a into master Jun 1, 2022
@hjchen2 hjchen2 deleted the move_tensor_api_to_cpython_part2 branch June 1, 2022 07:01
Tensor.int = _int
Tensor.long = _long
Tensor.float = _float
Tensor.double = _double
Contributor:

The interfaces that are no longer used here should also be deleted, e.g. _double and _float.

4 participants