
nsys report for the one-yolov5/classify/train.py script [2023-03-29] #123

Open
@ccssu


Introduction

I ran one-yolov5/classify/train.py under nsys and collected two reports, one for one-yolo and one for torch-yolo.

one-yolo_profile:
03-29-07-10profile.zip

torch-yolo_profile:
torch_03-29-08-37profile.zip

one-yolo test results

tloss = (tloss * i + loss.item()) / (i + 1) # update mean losses

                          one-yolo    torch-yolo
Time on the tloss line    99 ms       14 ms
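
For readers unfamiliar with the measured line: it keeps a running mean of the per-batch loss, and `loss.item()` returns the loss as a Python float. The sketch below applies the same update rule to plain floats (the values are made up for illustration) to show that it is equivalent to the ordinary arithmetic mean of the losses seen so far.

```python
def running_mean(values):
    """Apply tloss = (tloss * i + x) / (i + 1) over a sequence of floats."""
    tloss = 0.0
    for i, x in enumerate(values):
        # After this step, tloss is the mean of values[0..i].
        tloss = (tloss * i + x) / (i + 1)
    return tloss


# Made-up per-batch losses, for illustration only.
batch_losses = [2.0, 1.0, 0.5, 0.5]
mean_via_update = running_mean(batch_losses)
mean_direct = sum(batch_losses) / len(batch_losses)
print(mean_via_update, mean_direct)
```

The update rule itself is cheap arithmetic, so any large cost on this line most likely comes from obtaining the float through `loss.item()` rather than from the averaging.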

Notes:

  • flow.__version__='0.9.1.dev20230327+cu117'
  • torch.__version__='1.13.0+cu117'
  • Both runs train in float32.
  • Both launch commands use batch-size=256, epochs=6, and the yolov5s-cls model.
  • Machine: A100

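Conclusion: in the nsys analysis, the tloss line is clearly slower in one-yolo than in torch-yolo (99 ms vs. 14 ms, roughly 7x). Optimizing this line should give a large speedup.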
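
One possible way to reduce the cost of this line, offered as a sketch rather than as the fix adopted for this issue: assuming the time is dominated by the host-device synchronization that each `loss.item()` call forces, the loss can be accumulated as a tensor and converted to a Python float only once. `CountingScalar` below is a hypothetical stand-in for a scalar GPU tensor (it is not a oneflow or torch class) whose only purpose is to count `.item()` calls; both function names are also made up for illustration.

```python
class CountingScalar:
    """Hypothetical stand-in for a scalar GPU tensor; counts .item() calls."""

    syncs = 0  # how many times a value was pulled back to the host

    def __init__(self, value):
        self.value = float(value)

    def __add__(self, other):
        # Stays "on device": no .item() call is involved.
        return CountingScalar(self.value + other.value)

    def item(self):
        CountingScalar.syncs += 1
        return self.value


def mean_loss_per_step(losses):
    """Pattern from train.py: one .item() call per batch."""
    tloss = 0.0
    for i, loss in enumerate(losses):
        tloss = (tloss * i + loss.item()) / (i + 1)
    return tloss


def mean_loss_deferred(losses):
    """Alternative: sum tensor-side, call .item() once at the end."""
    total, n = None, 0
    for loss in losses:
        total = loss if total is None else total + loss
        n += 1
    return total.item() / n


# Made-up loss values, for illustration only.
losses = [CountingScalar(v) for v in (1.0, 2.0, 3.0, 6.0)]

CountingScalar.syncs = 0
per_step = mean_loss_per_step(losses)
per_step_syncs = CountingScalar.syncs

CountingScalar.syncs = 0
deferred = mean_loss_deferred(losses)
deferred_syncs = CountingScalar.syncs

print(per_step, per_step_syncs, deferred, deferred_syncs)
```

With real tensors the same pattern would keep `total` on the device. The trade-off is that the mean is no longer available as a float after every batch, so a per-iteration progress display would have to update less often. Whether this closes the 99 ms gap would need to be confirmed with another nsys run.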

one-yolov5 project data

Project: https://github.com/Oneflow-Inc/one-yolov5
Dataset path: @oneflow-25:/data/home/fengwen/imagenette160
Weights path: @oneflow-25:/data/home/fengwen/weight_v1_2_0

If running nsys fails with the following error:
The target application terminated. One or more process it created re-parented.
Waiting for termination of re-parented processes.
Use the `--wait` option to modify this behavior.

comment out the check_git_status() line in train.py.

one-yolo detailed test data

one-yolov5 launch command
DATESTR=$(date +"%m-%d-%H-%M")
set -e
cd ~/one-yolov5
# py-spy record -o profile.svg --native --
# Build the nsys command; the report is written to runs/<timestamp>profile
run_cmd="/usr/local/cuda/bin/nsys profile -o runs/${DATESTR}profile python \
    classify/train.py \
    --model runs/yolov5s-cls.pt \
    --data ../datasets/imagenette160 \
    --img 224 \
    --batch 256 \
    --epochs 6 \
    --project One-YOLOv5_v_1_2_0_train \
    --name yolov5n-default \
    --multi_tensor_optimizer \
    --lr0 0.1 --optimizer SGD"

echo "${run_cmd}"
eval "${run_cmd}"

one-yolo_profile
03-29-07-10profile.zip

image

torch-yolo_profile
torch_03-29-08-37profile.zip
image

Fix

Work in progress...

Resources

image
