add: test convert dependency #6023

hhhfccz · 2021-08-24T08:06:37Z

tvm and oneflow_convert_tool need to obtain the shape and dtype of every node from the graph.
Current dependencies on repr(graph):

types = ["INPUT", "PARAMETER", "BUFFER", "OUTPUT"]

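Here input and output should be the I/O of the computation graph, with names like _OneFlowGraph0-input_0.
A buffer is something like running_mean/var in the batchnorm operator.

There is also a dependency on the return value of flow.load:

It must be possible to extract the path corresponding to each layer's parameters; the TVM conversion relies on this path information to pair nodes.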

CLAassistant · 2021-08-24T08:06:43Z

All committers have signed the CLA.

BBuf · 2021-08-24T09:32:42Z

python/oneflow/test/graph/test_convert_dependency.py

+@flow.unittest.skip_unless_1n1d()
+class TestConvertDependency(flow.unittest.TestCase):
+    def test_get_params(test_case):
+        model_dir_path = "alexnet_oneflow_model"


Is this path available by default in CI?

This requires downloading the pretrained model parameters.

strint · 2021-08-24T10:25:47Z

python/oneflow/test/graph/test_convert_dependency.py

+
+        p_size = re.compile(r"size=\(.*?\)", re.S)
+        p_type = re.compile(r"dtype=.*?,", re.S)
+        types = ["INPUT", "PARAMETER", "BUFFER", "OUTPUT"]


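The inputs and outputs of nn.Graph can be of these types: Tensor, None, TensorTuple, List[Tensor].
Is only Tensor considered here?

That said, repr does expand TensorTuple and List[Tensor] into individual Tensors; see this PR: #5803

Also, regarding "obtain the shape and dtype of every node from the graph": repr only has graph-level and module-level entries, not op-level ones. Does that not matter either?

The main goal here is to get the information for every node. Previously this was available through job.helper, but helper now seems to be None. The input handling here does not consider the None case; when converting to TVM, the conversion fails immediately if there is no input. Op-level node information is extracted during conversion from what is parsed out of repr(graph), so this should have no impact.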

strint · 2021-08-24T10:29:17Z

python/oneflow/test/graph/test_convert_dependency.py

+            )
+        )
+        if not graph._is_compiled:
+            _ = graph._compile(flow.rand(shape_input))


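If we later turn _compile into a public interface, say by renaming it to compile, how should that be handled? Prompt the user to match the OneFlow version?

For 0.5.0 and earlier, we treat this test as the interface contract.

If we later turn _compile into a public interface, say by renaming it to compile, how should that be handled? Prompt the user to match the OneFlow version?

Will the change from _compile to compile be finished this week? If it is coming soon, this part and the following part (getting all nodes) can be left out for now, and I can add them back in another PR once graph development is complete.

It will not change in the short term. We will only consider making it a public interface after the various graph training features are stable. You can rely on the current one.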
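For the rename question above, one possible way for a converter to tolerate either name is a small lookup shim. This is only a sketch: the public `compile` method is hypothetical (the thread says it will not change in the short term), and `OldGraph` / `NewGraph` are stand-in classes for illustration, not OneFlow classes.

```python
def compile_graph(graph, *args):
    """Call graph.compile if it exists, otherwise fall back to graph._compile."""
    # "compile" is a hypothetical future public name; "_compile" is the
    # method the test in this PR actually calls.
    compile_fn = getattr(graph, "compile", None)
    if compile_fn is None:
        compile_fn = getattr(graph, "_compile")
    return compile_fn(*args)


class OldGraph:
    """Stand-in for a graph that only has the private method."""

    def _compile(self, x):
        return ("private", x)


class NewGraph:
    """Stand-in for a graph after a hypothetical rename."""

    def compile(self, x):
        return ("public", x)


old_result = compile_graph(OldGraph(), 1)
new_result = compile_graph(NewGraph(), 1)
```

A shim like this keeps the converter working across a rename, at the cost of hiding the version mismatch that an explicit version check would report.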

strint · 2021-08-24T10:41:57Z

python/oneflow/test/graph/test_convert_dependency.py

+                if size_attr[-2] == ",":
+                    size_attr = size_attr.replace(",", "")
+                if type_attr[-1] == ",":
+                    type_str = type_attr.replace(",", "")


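This check is a bit weak: it only guarantees that there is some content. It would be better to check that the content is correct.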
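A stronger check along these lines could parse the `size=(...)` text into a real tuple and compare it against an expected value. The sketch below is not the code in this PR; the helper name `extract_shape_and_dtype` is made up for illustration, and the sample entry is the BUFFER line quoted elsewhere in this thread.

```python
import ast
import re

P_SIZE = re.compile(r"size=(\(.*?\))", re.S)
P_DTYPE = re.compile(r"dtype=([\w.]+)")


def extract_shape_and_dtype(entry):
    """Return (shape tuple, dtype string) parsed from one repr entry."""
    size_match = P_SIZE.search(entry)
    dtype_match = P_DTYPE.search(entry)
    if size_match is None or dtype_match is None:
        raise ValueError("entry has no size or dtype: %r" % entry)
    # literal_eval turns "(64,)" into (64,) and "(1, 4)" into (1, 4),
    # so no manual comma stripping is needed.
    shape = ast.literal_eval(size_match.group(1))
    if isinstance(shape, int):  # "(64)" evaluates to a plain int
        shape = (shape,)
    return tuple(shape), dtype_match.group(1)


entry = "(BUFFER:m.dummy_buff:tensor(..., size=(1, 4), dtype=oneflow.float32)): ()"
shape, dtype = extract_shape_and_dtype(entry)
```

With the shape available as a tuple, the test can assert exact values rather than only that something was matched.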

strint · 2021-08-24T10:44:50Z

python/oneflow/test/graph/test_convert_dependency.py

@@ -0,0 +1,105 @@
+"""


test_xx_convert_dependency.py

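It would be better to make the xx part explicit.

Or call it test_api_dependency_on_graph.py

OK. This script is used by both the TVM conversion and the ONNX conversion, so I did not distinguish between them at first.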

hhhfccz · 2021-08-25T06:17:41Z

@strint

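Renamed the test script
Added a float32 check on the extracted dtype (I have not yet come up with a good way to check the shape contents)
Added a check on the number of AlexNet params; 16 params are extracted here
One possible issue: during the TVM conversion, converting the batchnorm operator uses BUFFER, but the current test model is AlexNet, which does not involve this. How will nodes previously marked as BUFFER be handled in Graph going forward? Will they be marked as PARAMETER?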

strint · 2021-08-25T08:42:49Z

@strint

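Renamed the test script

OK

Added a float32 check on the extracted dtype (I have not yet come up with a good way to check the shape contents)

Could you pick one tensor and hard-code a check for it?

Added a check on the number of AlexNet params; 16 params are extracted here

OK

One possible issue: during the TVM conversion, converting the batchnorm operator uses BUFFER, but the current test model is AlexNet, which does not involve this. How will nodes previously marked as BUFFER be handled in Graph going forward? Will they be marked as PARAMETER?

You can construct your own module and register one in it: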

self.register_buffer("dummy_buff", flow.Tensor(1, 4))  # e.g. register a buffer yourself; this verifies the buffer and also lets you check the tensor shape

See: oneflow/python/oneflow/test/graph/test_graph.py
In repr it looks like this:

(BUFFER:m.dummy_buff:tensor(..., size=(1, 4), dtype=oneflow.float32)): ()

The bn in resnet50:

    (MODULE:resnet50.bn1:BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)): (
      (PARAMETER:resnet50.bn1.weight:tensor(...,
             placement=oneflow.placement(device_type="cuda", machine_device_ids={0 : [0, 1]}, hierarchy=(2,)),
             sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32,
             requires_grad=True)): ()
      (PARAMETER:resnet50.bn1.bias:tensor(...,
             placement=oneflow.placement(device_type="cuda", machine_device_ids={0 : [0, 1]}, hierarchy=(2,)),
             sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32,
             requires_grad=True)): ()
      (BUFFER:resnet50.bn1.running_mean:tensor(...,
             placement=oneflow.placement(device_type="cuda", machine_device_ids={0 : [0, 1]}, hierarchy=(2,)),
             sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32)): ()
      (BUFFER:resnet50.bn1.running_var:tensor(...,
             placement=oneflow.placement(device_type="cuda", machine_device_ids={0 : [0, 1]}, hierarchy=(2,)),
             sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32)): ()
    )
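Entries like the ones above span several lines, so a parser has to match across line breaks. The following is a minimal sketch of one way to pull the type, name, shape, and dtype out of such text. It is not the code in this PR: `parse_nodes` and the `ENTRY` pattern are illustrative names, and the sample string is the excerpt above with the placement lines dropped to keep it short.

```python
import re

# Shortened excerpt of the repr(graph) text quoted in this thread.
GRAPH_REPR = """\
(MODULE:resnet50.bn1:BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)): (
  (PARAMETER:resnet50.bn1.weight:tensor(...,
         sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32,
         requires_grad=True)): ()
  (BUFFER:resnet50.bn1.running_mean:tensor(...,
         sbp=(oneflow.sbp.broadcast,), size=(64,), dtype=oneflow.float32)): ()
  (BUFFER:m.dummy_buff:tensor(..., size=(1, 4), dtype=oneflow.float32)): ()
)
"""

NODE_TYPES = ["INPUT", "PARAMETER", "BUFFER", "OUTPUT"]

# One entry looks like "(TYPE:name:tensor(... size=(...) ... dtype=xxx".
# re.S lets ".*?" cross line breaks inside a single entry.
ENTRY = re.compile(
    r"\((?P<type>" + "|".join(NODE_TYPES) + r"):(?P<name>[^:]+):tensor\("
    r".*?size=\((?P<size>[^)]*)\)"
    r".*?dtype=(?P<dtype>[\w.]+)",
    re.S,
)


def parse_nodes(graph_repr):
    """Return one dict per INPUT/PARAMETER/BUFFER/OUTPUT entry."""
    nodes = []
    for match in ENTRY.finditer(graph_repr):
        dims = match.group("size").split(",")
        shape = tuple(int(d) for d in dims if d.strip())
        nodes.append(
            {
                "type": match.group("type"),
                "name": match.group("name"),
                "shape": shape,
                "dtype": match.group("dtype"),
            }
        )
    return nodes


nodes = parse_nodes(GRAPH_REPR)
buffer_names = [n["name"] for n in nodes if n["type"] == "BUFFER"]
```

The pattern assumes every entry carries both `size=` and `dtype=`; an entry missing either one would cause the lazy match to run into the next entry, so a real converter would need a stricter boundary.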

hhhfccz · 2021-08-26T04:30:27Z

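Thanks for the suggestions @strint

Added a check for buffers
Added checks on the first-layer conv2d.weights of AlexNet and on the last-layer linear.weights
After getting the nodes, added checks on the extracted node attributes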

strint

lgtm

BBuf · 2021-08-26T05:40:10Z

Could you change the name: test_api_dependency_on_graph.py -> test_tvm_fronted_api_dependency_on_graph.py

hhhfccz · 2021-08-26T05:46:17Z

Could you change the name: test_api_dependency_on_graph.py -> test_tvm_fronted_api_dependency_on_graph.py

OK, renamed.

github-actions · 2021-08-29T17:13:52Z

CI failed, removing label automerge

github-actions · 2021-08-29T18:48:30Z

CI failed, removing label automerge

github-actions · 2021-08-30T00:22:24Z

CI failed, removing label automerge

github-actions · 2021-09-06T15:17:30Z

CI failed, removing label automerge

github-actions · 2021-09-07T02:59:32Z

Speed stats:

GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 128.5ms (= 6423.1ms / 50, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 141.4ms (= 7070.6ms / 50, input_shape=[16, 3, 224, 224])
Relative speed: 1.10 (= 141.4ms / 128.5ms)

OneFlow resnet50 time: 74.7ms (= 3734.7ms / 50, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.3ms (= 4164.5ms / 50, input_shape=[8, 3, 224, 224])
Relative speed: 1.12 (= 83.3ms / 74.7ms)

OneFlow resnet50 time: 47.5ms (= 2374.0ms / 50, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.9ms (= 3046.1ms / 50, input_shape=[4, 3, 224, 224])
Relative speed: 1.28 (= 60.9ms / 47.5ms)

OneFlow resnet50 time: 39.1ms (= 1955.9ms / 50, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 50.0ms (= 2501.4ms / 50, input_shape=[2, 3, 224, 224])
Relative speed: 1.28 (= 50.0ms / 39.1ms)

OneFlow resnet50 time: 34.4ms (= 1720.7ms / 50, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 44.7ms (= 2232.5ms / 50, input_shape=[1, 3, 224, 224])
Relative speed: 1.30 (= 44.7ms / 34.4ms)

OneFlow resnet50 time: 152.6ms (= 7628.3ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.7ms (= 8134.4ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
Relative speed: 1.07 (= 162.7ms / 152.6ms)

OneFlow resnet50 time: 100.9ms (= 5047.5ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.8ms (= 5192.2ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
Relative speed: 1.03 (= 103.8ms / 100.9ms)

OneFlow resnet50 time: 78.0ms (= 3899.2ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 81.4ms (= 4069.3ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
Relative speed: 1.04 (= 81.4ms / 78.0ms)

OneFlow resnet50 time: 68.7ms (= 3436.5ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.0ms (= 3601.1ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
Relative speed: 1.05 (= 72.0ms / 68.7ms)

OneFlow resnet50 time: 68.0ms (= 3399.7ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 61.5ms (= 3076.3ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
Relative speed: 0.90 (= 61.5ms / 68.0ms)

add: test convert dependency

00ad17c

BBuf reviewed Aug 24, 2021

BBuf requested a review from strint August 24, 2021 09:33

strint reviewed Aug 24, 2021

fix: change name, add dtype test and num_of_params test

278fc2d

add: test of nodes and buffer

8e4da74

strint approved these changes Aug 26, 2021

change name

fcfdbc1

Merge branch 'master' into convert_dependency

69981bf

BBuf added api automerge enhancement good for pr interface labels Aug 26, 2021

BBuf requested a review from oneflow-ci-bot August 26, 2021 09:27

oneflow-ci-bot and others added 2 commits August 26, 2021 09:29

auto format by CI

dc5cf85

Merge branch 'master' into convert_dependency

e41ffa7

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 26, 2021 11:52

BBuf added the eager label Aug 26, 2021

Merge branch 'master' into convert_dependency

1dfa591

github-actions bot removed the automerge label Aug 29, 2021

oneflow-ci-bot removed their request for review August 29, 2021 17:15

hhhfccz added the automerge label Aug 29, 2021

github-actions bot removed the automerge label Aug 30, 2021

Update test_tvm_frontend_dependency_on_graph.py

78392f6

hhhfccz added the automerge label Sep 6, 2021

Merge branch 'master' into convert_dependency

5a0ad08

oneflow-ci-bot self-requested a review September 6, 2021 13:52

auto format by CI

6466edf

oneflow-ci-bot removed their request for review September 6, 2021 14:17

Merge branch 'master' into convert_dependency

5be487d

oneflow-ci-bot self-requested a review September 6, 2021 14:17

github-actions bot removed the automerge label Sep 6, 2021

oneflow-ci-bot removed their request for review September 6, 2021 15:18

Update test_tvm_frontend_dependency_on_graph.py

45bef32

hhhfccz added the automerge label Sep 6, 2021

Merge branch 'master' into convert_dependency

794cd96

oneflow-ci-bot self-requested a review September 6, 2021 16:14

oneflow-ci-bot and others added 2 commits September 6, 2021 16:15

auto format by CI

2026e5d

Merge branch 'master' into convert_dependency

7124bba

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 7, 2021 01:49

oneflow-ci-bot removed their request for review September 7, 2021 03:02

oneflow-ci-bot merged commit 4b3dc88 into master Sep 7, 2021

oneflow-ci-bot deleted the convert_dependency branch September 7, 2021 03:02
