
Fea/nn graph/forward graph #5516

Merged
merged 42 commits into master from fea/nn_graph/forward_graph on Jul 21, 2021

Conversation

@strint (Contributor) commented Jul 15, 2021

Forward job/graph (a toy sketch of these steps follows the list):

  • add input
  • add variable/buffer
  • add output
  • add user op
  • split graph/block/optimizer into 3 files
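
To make the four steps concrete, here is a toy, self-contained sketch of recording input/variable/user/output ops in the order they appear in the job proto below. This is not OneFlow's implementation; every name in the snippet is made up for illustration.

# Toy sketch only: not OneFlow internals; all names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class OpRecord:
    name: str
    kind: str                       # "input" | "variable" | "user" | "output"
    inputs: List[str] = field(default_factory=list)

class ToyForwardGraph:
    def __init__(self):
        self.ops: List[OpRecord] = []

    def add_input(self, name):
        self.ops.append(OpRecord(name, "input"))
        return f"{name}/out"                    # logical blob name

    def add_variable(self, name):               # covers Parameter and Buffer
        self.ops.append(OpRecord(name, "variable"))
        return f"{name}/out"

    def add_user_op(self, name, op_type, *in_lbns):
        self.ops.append(OpRecord(name, "user:" + op_type, list(in_lbns)))
        return f"{name}/out_0"

    def add_output(self, name, in_lbn):
        self.ops.append(OpRecord(name, "output", [in_lbn]))

# Replay the first few ops of the proto below:
g = ToyForwardGraph()
x = g.add_input("input_0")
w = g.add_variable("m.layer.weight")
h = g.add_user_op("m.layer-matmul_0", "matmul", x, w)
h = g.add_user_op("m.layer.relu-relu_1", "relu", h)
g.add_output("output_0", h)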

Test example (a hypothetical sketch of CustomModule/CustomGraph follows this list):

  • multiple inputs and multiple outputs
  • with Parameter and Buffer
  • with three kinds of user op: relu/matmul/flatten
  • both the Tensor and the Module moved to cuda via to()
  • nested Modules
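
CustomModule and CustomGraph come from the PR's test and are not shown in this description. A hypothetical reconstruction, inferred from the repr and proto output below (and assuming the nn.Graph base class with a build method), might look like:

import oneflow as flow
import oneflow.nn as nn

# Hypothetical reconstruction, inferred from the repr/proto below;
# the PR's actual test code may differ.
class SubModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(flow.Tensor(6, 6))  # m.layer.weight
        self.relu = nn.ReLU()                          # m.layer.relu

    def forward(self, x, y):
        x = flow.matmul(x, self.weight)   # m.layer-matmul_0
        x = self.relu(x)                  # m.layer.relu-relu_1
        y = self.relu(y)                  # m.layer.relu-relu_2
        return x, y

class CustomModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = SubModule()
        self.register_buffer("dummy_buff", flow.Tensor(6, 8))  # m.dummy_buff

    def forward(self, x, y):
        x, y = self.layer(x, y)
        x = flow.flatten(x, start_dim=1)     # m-flatten_3
        x = flow.matmul(x, self.dummy_buff)  # m-matmul_4
        return x, y

class CustomGraph(nn.Graph):
    def __init__(self, module):
        super().__init__()
        self.m = module

    def build(self, x, y):
        return self.m(x, y)
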
m = CustomModule()
m.to("cuda")
g = CustomGraph(m)

x = flow.Tensor(6, 6)
flow.nn.init.uniform_(x, a=-1.0, b=1.0)
x = x.to("cuda")

y = flow.Tensor(10, 10)
flow.nn.init.uniform_(y, a=-1.0, b=1.0)
y = y.to("cuda")

z, a = g._compile(x, y)  # internal compile entry used by this PR's test; builds the graph and returns the outputs

print("graph repr: ", repr(g))
print("graph proto: ", g._graph_proto)

Output:
graph repr:

(CustomGraph_0:CustomGraph:GRAPH): (
  (m:CustomModule:MODULE): (
    (m.layer:SubModule:MODULE): (
      (m.layer.relu:ReLU:MODULE): ()
      (m.layer.weight:Parameter:PARAMETER): ()
    )
    (m.dummy_buff:Tensor:BUFFER): ()
  )
)

graph proto:

net {
  op {
    name: "input_0"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427461631
    input_conf {
      out: "out"
      blob_conf {
        shape {
          dim: 6
          dim: 6
        }
        data_type: kFloat
        is_dynamic: true
        parallel_distribution {
          sbp_parallel {
            broadcast_parallel {
            }
          }
        }
      }
    }
  }
  op {
    name: "input_1"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427461631
    input_conf {
      out: "out"
      blob_conf {
        shape {
          dim: 10
          dim: 10
        }
        data_type: kFloat
        is_dynamic: true
        parallel_distribution {
          sbp_parallel {
            broadcast_parallel {
            }
          }
        }
      }
    }
  }
  op {
    name: "m.layer.weight"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427473919
    variable_conf {
      out: "out"
      shape {
        dim: 6
        dim: 6
      }
      data_type: kFloat
      initializer {
        empty_conf {
        }
      }
    }
  }
  op {
    name: "m.layer-matmul_0"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427469823
    user_conf {
      op_type_name: "matmul"
      input {
        key: "a"
        value {
          s: "input_0/out"
        }
      }
      input {
        key: "b"
        value {
          s: "m.layer.weight/out"
        }
      }
      output {
        key: "out"
        value {
          s: "m.layer-matmul_0/out_0"
        }
      }
      attr {
        key: "alpha"
        value {
          at_double: 1.0
        }
      }
      attr {
        key: "transpose_a"
        value {
          at_bool: false
        }
      }
      attr {
        key: "transpose_b"
        value {
          at_bool: false
        }
      }
    }
  }
  op {
    name: "m.layer.relu-relu_1"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427478015
    user_conf {
      op_type_name: "relu"
      input {
        key: "in"
        value {
          s: "m.layer-matmul_0/out_0"
        }
      }
      output {
        key: "out"
        value {
          s: "m.layer.relu-relu_1/out_0"
        }
      }
    }
  }
  op {
    name: "m.layer.relu-relu_2"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427478015
    user_conf {
      op_type_name: "relu"
      input {
        key: "in"
        value {
          s: "input_1/out"
        }
      }
      output {
        key: "out"
        value {
          s: "m.layer.relu-relu_2/out_0"
        }
      }
    }
  }
  op {
    name: "m-flatten_3"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427465727
    user_conf {
      op_type_name: "flatten"
      input {
        key: "in"
        value {
          s: "m.layer.relu-relu_1/out_0"
        }
      }
      output {
        key: "out"
        value {
          s: "m-flatten_3/out_0"
        }
      }
      attr {
        key: "end_dim"
        value {
          at_int32: -1
        }
      }
      attr {
        key: "start_dim"
        value {
          at_int32: 1
        }
      }
    }
  }
  op {
    name: "m.dummy_buff"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427482111
    variable_conf {
      out: "out"
      shape {
        dim: 6
        dim: 8
      }
      data_type: kFloat
      initializer {
        empty_conf {
        }
      }
      trainable: false
    }
  }
  op {
    name: "m-matmul_4"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427465727
    user_conf {
      op_type_name: "matmul"
      input {
        key: "a"
        value {
          s: "m-flatten_3/out_0"
        }
      }
      input {
        key: "b"
        value {
          s: "m.dummy_buff/out"
        }
      }
      output {
        key: "out"
        value {
          s: "m-matmul_4/out_0"
        }
      }
      attr {
        key: "alpha"
        value {
          at_double: 1.0
        }
      }
      attr {
        key: "transpose_a"
        value {
          at_bool: false
        }
      }
      attr {
        key: "transpose_b"
        value {
          at_bool: false
        }
      }
    }
  }
  op {
    name: "output_0"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427461631
    output_conf {
      in: "m-matmul_4/out_0"
      out: "out"
      blob_conf {
        shape {
          dim: 6
          dim: 8
        }
        data_type: kFloat
        is_dynamic: false
        parallel_distribution {
          sbp_parallel {
            broadcast_parallel {
            }
          }
        }
      }
    }
  }
  op {
    name: "output_1"
    device_tag: "gpu"
    scope_symbol_id: 4611686018427461631
    output_conf {
      in: "m.layer.relu-relu_2/out_0"
      out: "out"
      blob_conf {
        shape {
          dim: 10
          dim: 10
        }
        data_type: kFloat
        is_dynamic: false
        parallel_distribution {
          sbp_parallel {
            broadcast_parallel {
            }
          }
        }
      }
    }
  }
}
placement {
  placement_group {
    op_set {
      op_name: "input_0"
      op_name: "input_1"
      op_name: "m.layer.weight"
      op_name: "m.layer-matmul_0"
      op_name: "m.layer.relu-relu_1"
      op_name: "m.layer.relu-relu_2"
      op_name: "m-flatten_3"
      op_name: "m.dummy_buff"
      op_name: "m-matmul_4"
      op_name: "output_0"
      op_name: "output_1"
    }
    parallel_conf {
      device_name: "0:0-3"
      device_tag: "gpu"
      hierarchy {
        dim: 4
      }
    }
  }
  blob_placement_group {
    lbi {
      op_name: "input_0"
      blob_name: "out"
    }
    lbi {
      op_name: "input_1"
      blob_name: "out"
    }
    lbi {
      op_name: "m.layer.weight"
      blob_name: "out"
    }
    lbi {
      op_name: "m.layer-matmul_0"
      blob_name: "out_0"
    }
    lbi {
      op_name: "m.layer.relu-relu_1"
      blob_name: "out_0"
    }
    lbi {
      op_name: "m.layer.relu-relu_2"
      blob_name: "out_0"
    }
    lbi {
      op_name: "m-flatten_3"
      blob_name: "out_0"
    }
    lbi {
      op_name: "m.dummy_buff"
      blob_name: "out"
    }
    lbi {
      op_name: "m-matmul_4"
      blob_name: "out_0"
    }
    lbi {
      op_name: "output_0"
      blob_name: "out"
    }
    lbi {
      op_name: "output_1"
      blob_name: "out"
    }
    parallel_conf {
      device_name: "0:0-3"
      device_tag: "gpu"
      hierarchy {
        dim: 4
      }
    }
  }
}
job_conf {
  job_name: "CustomGraph_0"
  predict_conf {
  }
}
job_parallel_view_conf {
}

@strint marked this pull request as ready for review July 16, 2021 08:26
@strint requested review from chengtbf and leaves-zwx July 16, 2021 08:26
@strint requested a review from oneflow-ci-bot July 21, 2021 09:44
@chengtbf mentioned this pull request Jul 21, 2021
@strint requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 21, 2021 11:10
@github-actions

CI failed, removing label automerge

@oneflow-ci-bot removed their request for review July 21, 2021 12:17
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 140.4ms (= 4212.8ms / 30, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.1ms (= 3784.3ms / 30, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 140.4ms / 126.1ms)

PyTorch resnet50 time: 85.1ms (= 2554.0ms / 30, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 73.6ms (= 2207.0ms / 30, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.16 (= 85.1ms / 73.6ms)

PyTorch resnet50 time: 62.4ms (= 1870.7ms / 30, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.9ms (= 1467.6ms / 30, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.27 (= 62.4ms / 48.9ms)

PyTorch resnet50 time: 49.1ms (= 1473.9ms / 30, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 46.2ms (= 1386.8ms / 30, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.06 (= 49.1ms / 46.2ms)

PyTorch resnet50 time: 43.3ms (= 1299.6ms / 30, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 47.3ms (= 1419.5ms / 30, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.92 (= 43.3ms / 47.3ms)

@oneflow-ci-bot removed their request for review July 21, 2021 15:11
@oneflow-ci-bot merged commit ab8aab8 into master Jul 21, 2021
@oneflow-ci-bot deleted the fea/nn_graph/forward_graph branch July 21, 2021 15:12