Feat graph compile progress bar #9537
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9537/
Makes sense.
return Maybe<void>::Ok();
}

const static thread_local uint64_t progress_total_num = 60;
Was the 60 here counted by hand? If stages are added or removed later, won't this need to be updated again?
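"Was the 60 here counted by hand? If stages are added or removed later, won't this need to be updated again?"
It was counted by CostCounter.
Yes, it does need to be updated. The passes, plan, init, for loops, and so on in here form irregular logic whose step count is only known at run time and cannot be determined at compile time.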
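The trade-off described above can be sketched with a small counter whose total is fixed up front. This is only an illustration under stated assumptions: `ProgressCounter`, `Step`, and `Render` are hypothetical names, not OneFlow's `CostCounter` or the code in this PR. If the real number of steps drifts away from the hard-coded total, the bar either saturates early or never reaches 100%, which is why the constant has to be maintained by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not OneFlow's CostCounter): tracks finished steps
// against a total that has to be supplied up front.
class ProgressCounter {
 public:
  explicit ProgressCounter(uint64_t total) : total_(total) { assert(total > 0); }

  // Record one finished step and remember its name for display.
  // The count is clamped so the bar never exceeds 100% when the real
  // number of steps is larger than the hard-coded total.
  void Step(const std::string& name) {
    if (done_ < total_) { ++done_; }
    current_ = name;
  }

  uint64_t done() const { return done_; }

  // Render a fixed-width bar followed by "done/total" and the latest step name.
  std::string Render(std::size_t width) const {
    const std::size_t filled = static_cast<std::size_t>(done_ * width / total_);
    return "[" + std::string(filled, '#') + std::string(width - filled, '-') + "] "
           + std::to_string(done_) + "/" + std::to_string(total_) + " " + current_;
  }

 private:
  uint64_t total_;
  uint64_t done_ = 0;
  std::string current_;
};
```

Because the latest step name is always part of the rendered line, a hang still shows which step was running even when the total is out of date.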
JUST(JobPass4Name("DumpBlobParallelConfPass")(job, &job_pass_ctx));
compile_tc->Count("[GraphCompile]" + job_name + " DumpBlobParallelConfPass", 1);
compile_tc->Count("[GraphCompile]" + job_name + " DumpBlobParallelConfPass", 1, true);
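Could we avoid showing every pass? For some passes, users will not know what they do even if they see them. What users care about is which stage compilation is currently in: the Graph pass optimization stage (which has several milestone events, such as graph construction finishing, autograd backward-graph expansion, amp, zero, mlir, Checkpointing, pipeline, and so on), the physical graph compilation stage (Compiler task node build, memory reuse), the runtime init stage, and so on.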
Of course, this can be left as a TODO item.
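One way the coarse-grained display suggested above might look is sketched here. The three phases come from the review comment, but `CompilePhase`, `PhaseLabel`, and `PhaseReporter` are hypothetical names invented for illustration; nothing like them is confirmed to exist in OneFlow. The reporter produces new text only when the phase changes.

```cpp
#include <cassert>
#include <string>

// Hypothetical coarse phases, following the grouping in the review comment.
enum class CompilePhase { kGraphPass, kPhysicalGraphCompile, kRuntimeInit };

// Map a phase to a user-facing label.
std::string PhaseLabel(CompilePhase phase) {
  switch (phase) {
    case CompilePhase::kGraphPass: return "graph pass optimization";
    case CompilePhase::kPhysicalGraphCompile: return "physical graph compilation";
    case CompilePhase::kRuntimeInit: return "runtime init";
  }
  assert(false && "unknown CompilePhase");
  return "unknown";
}

// Hypothetical reporter: returns the line to display when the phase changes,
// and an empty string while compilation stays in the same phase.
class PhaseReporter {
 public:
  std::string OnStep(CompilePhase phase) {
    if (has_phase_ && phase == current_) { return ""; }
    has_phase_ = true;
    current_ = phase;
    return "[GraphCompile] " + PhaseLabel(phase);
  }

 private:
  bool has_phase_ = false;
  CompilePhase current_ = CompilePhase::kGraphPass;
};
```

This hides individual pass names, so it gives up the "which pass is stuck" information that the per-pass display provides.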
No sleep is needed. The goal here is not to let users read each step, but to show which step is running when compilation hangs.
Showing every pass is a good thing. Many passes flashing by makes compilation look fast, which is good.
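"No sleep is needed. The goal here is not to let users read each step, but to show which step is running when compilation hangs."
The sleep was only used for the screenshot that demonstrates the effect; the code to be merged does not contain it.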
"Showing every pass is a good thing. Many passes flashing by makes compilation look fast, which is good."
OK.
By default, graph prints no information. Enabling debug(0) or setting the ONEFLOW_NNGRAPH_ENABLE_PROGRESS_BAR environment variable shows a compilation progress bar on rank 0. Or
Feature request: #9217
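The enabling rule in the description (debug(0) or the ONEFLOW_NNGRAPH_ENABLE_PROGRESS_BAR environment variable, shown on rank 0 only) can be sketched as below. `ParseBoolEnv` and `ShouldShowProgressBar` are hypothetical helpers, and the rule for which strings count as "off" is an assumption; the PR's actual parsing is not shown on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical parsing rule (an assumption, not taken from the PR):
// unset, empty, "0", and "false" spellings mean off; anything else means on.
bool ParseBoolEnv(const char* value) {
  if (value == nullptr) { return false; }
  const std::string v(value);
  return !(v.empty() || v == "0" || v == "false" || v == "False" || v == "FALSE");
}

// Hypothetical helper: show the bar only on rank 0, and only when either
// debug output is enabled or the environment variable is set.
bool ShouldShowProgressBar(int64_t rank, bool debug_enabled) {
  const bool env_on = ParseBoolEnv(std::getenv("ONEFLOW_NNGRAPH_ENABLE_PROGRESS_BAR"));
  return rank == 0 && (debug_enabled || env_on);
}
```

Restricting output to rank 0 keeps a multi-process job from printing one bar per process.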