【Hackathon 5th No.58】 A physics-informed deep neural network for surrogate modeling in classical elasto-plasticity #606

Merged
merged 2 commits into from
Nov 13, 2023

Conversation

Contributor

@co63oc co63oc commented Oct 26, 2023

PR types

Others

PR changes

Others

Describe

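Migrated from the original PR #558.
Changes:
Restructured the Data class
Removed the use of global
The Solver's plot_loss_history works on a single data series and does not support plotting several series, so plot_loss_history is not used
Removed irepeat
Fixed a dataset reading error by adding the num_workers=0 setting
Switched to hydra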

PaddlePaddle/Paddle#57262

Training accuracy

| | total | $EPNN^e$ | ${EPNN^{e}}^{p}$ | $EPNN^\sigma$ |
| --- | --- | --- | --- | --- |
| paper | 3.71 | 0.58 | 2.99 | 0.14 |
| ppsci | 4.07 | 0.75 | 3.16 | 0.14 |
| diff | 9% | 29% | 5% | 0% |

torch
image

paddle
image

The dataset is available at https://github.com/meghbali/ANNElastoplasticity/tree/main/Datasets/WG
dstate-16-plas.dat
dstress-16-plas.dat

Changes made:
Restructured the Data class
Removed the use of global
Removed irepeat
Fixed a dataset reading error by adding the num_workers=0 setting
Switched to hydra

paddle-bot bot commented Oct 26, 2023

Thanks for your contribution!

@lijialin03
Contributor

Could the code include an evaluation step? If not, the md file needs to be changed, because the yaml file has no EVAL section.

Contributor Author

co63oc commented Oct 30, 2023

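> Could the code include an evaluation step? If not, the md file needs to be changed, because the yaml file has no EVAL section.

eval_during_train=True is set, so a separate evaluation step has no effect. I have updated the md file.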

@lijialin03
Contributor

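Thanks for the hard work!

A few things still need to be added or changed before merging:

  1. Every variable in the code, intermediate or not, needs a meaningful name. Avoid names like xx11 and xx12, to improve readability.
  2. Display the losses separately (that is, set up separate constraints), and state clearly what each loss represents.
  3. Move the eval-related parts into evaluation.
  4. Following the original paper, make the meaning of each curve explicit when plotting, and rename the figures/variables accordingly.
  5. Following the standard on the official website, provide some evaluation metrics (numeric values) from the original code/paper and compare them with the reproduced results (put this in a comment, not in the epnn.md document). The code must match these values in CI testing.

Thanks.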

Contributor Author

co63oc commented Oct 31, 2023

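> A few things still need to be added or changed before merging:

I don't quite understand. Reproducing a model consists of the model, training it, and collecting data for plots. My problem is that I don't understand which data should be collected.

1 Variable names have been changed.
2 The paper has two kinds of computed quantities: one is the loss, the other is the error. Training uses both the loss and the error, and eval uses the error. The result figure currently has three subplots: the training loss, the training error, and the error computed on the validation set. Does "display the losses separately" mean generating the three subplots as separate figures, or removing some of them?
3 eval is called after each round of training to compute the loss and error. If it is moved into evaluation, would it only use the pretrained model to produce the result data?
4 These are the names used in the paper: one is the loss, the other is the error. Training uses the loss and eval uses the error.
5 Do you mean a comment on the PR or a comment in the code? The metrics are already in the PR comment.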

Contributor

lijialin03 commented Nov 1, 2023

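> A few things still need to be added or changed before merging:

> I don't quite understand. Reproducing a model consists of the model, training it, and collecting data for plots. My problem is that I don't understand which data should be collected.

  2. The three losses/errors need to be output separately, in the output format below, so that they also line up with the figures in the paper.
    image
  3. Right, so the current code does not need to change. In addition, evaluation needs code that loads the pretrained model and generates the errors; you can use the bracket example as a reference.
  4. Yes. As the reviewer I already understand it, but a new user who has not read our discussion may find it hard to follow, so the meaning needs to be made explicit. Compare with the figure below from the paper, which shows clearly that these are the sigma/epsilon/xx results on the training and validation sets respectively.
    image
  5. As I understand it, the metrics in this PR's initial comment are the losses after 100 epochs, that is, the result of backward alignment. First, backward alignment cannot serve as the final metric; it is only a necessary step in aligning a reproduction. Second, the loss is not a metric either, because anything that involves randomness may depend on other factors (such as the CUDA version) even with a fixed random seed; see https://pytorch.org/docs/stable/notes/randomness.html. The usual metric is therefore something like the l2_error between the predictions and the ground truth (check the paper to decide).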
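
As a rough illustration of the kind of metric meant here, the sketch below computes a relative L2 error between predictions and reference values in plain Python. The function name `relative_l2_error` and the sample values are assumptions made for illustration, and the paper may define its error differently, so check it before adopting this form.

```python
import math


def relative_l2_error(pred, true):
    """Return ||pred - true||_2 / ||true||_2 for two equal-length sequences."""
    if len(pred) != len(true):
        raise ValueError("pred and true must have the same length")
    diff_norm = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    true_norm = math.sqrt(sum(t ** 2 for t in true))
    if true_norm == 0.0:
        raise ValueError("reference values have zero norm")
    return diff_norm / true_norm


# Hypothetical values, only to show the call.
reference = [3.0, 4.0]
prediction = [3.0, 4.5]
print(f"relative L2 error: {relative_l2_error(prediction, reference):.4f}")
```

A value like this is computed from the final predictions on a held-out set, so it is less tied to the exact optimization trajectory than a loss read off after a fixed number of epochs.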

Contributor Author

co63oc commented Nov 1, 2023

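  2. The loss function can only return a single value, so the split values are now shown through print output.
  3. Added the evaluation code.
  4. Updated the field labels in the figures.
  5. Changed the output to display the Error values.
    image
    The metric computation takes only one item from its output_dict input, so the metric has been removed.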
image

The output figure is close to Figure 25 in the paper (which uses 1e6 epochs)
image

Contributor

@lijialin03 lijialin03 left a comment

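The metrics need to be quantified, for example as Eval error:

| | total | $EPNN^e$ | ${EPNN^{e}}^{p}$ | $EPNN^\sigma$ |
| --- | --- | --- | --- | --- |
| paper | xx | xx | xx | xx |
| ppsci | xx | xx | xx | xx |
| diff | xx% | xx% | xx% | xx% |

So please run the original code and record the relevant metrics. Thanks.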

Contributor Author

co63oc commented Nov 2, 2023

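Run metrics

| | total | $EPNN^e$ | ${EPNN^{e}}^{p}$ | $EPNN^\sigma$ |
| --- | --- | --- | --- | --- |
| paper | 3.71 | 0.58 | 2.99 | 0.14 |
| ppsci | 4.07 | 0.75 | 3.16 | 0.14 |
| diff | 9% | 29% | 5% | 0% |

Some of the data in the original repository is of type float64.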

@lijialin03
Contributor

OK, thanks. The 22% figure was miscalculated, though; it should be about 5%.
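
The percentages in the diff row can be rechecked with a few lines of Python. The sketch below assumes the diff is the relative difference (ppsci - paper) / paper, truncated to a whole percent. That convention is inferred from the 9%, 29%, and 0% entries and is not stated anywhere in this thread. The keys `e`, `ep`, and `sigma` are shorthand for the table's column headers.

```python
# Values copied from the "paper" and "ppsci" rows of the table.
paper = {"total": 3.71, "e": 0.58, "ep": 2.99, "sigma": 0.14}
ppsci = {"total": 4.07, "e": 0.75, "ep": 3.16, "sigma": 0.14}


def relative_diff_percent(reference, reproduced):
    """Relative difference of reproduced vs. reference, in percent."""
    return (reproduced - reference) / reference * 100.0


diffs = {name: relative_diff_percent(paper[name], ppsci[name]) for name in paper}
for name, value in diffs.items():
    # int() truncates toward zero, the assumed convention of the table.
    print(f"{name}: {value:.2f}% -> {int(value)}%")
```

Under this convention the third column comes out at about 5.7%, consistent with the estimate of roughly 5%.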

Contributor Author

co63oc commented Nov 3, 2023

Fixed.

Collaborator

@HydrogenSulfate HydrogenSulfate left a comment

TODO: revise and improve the code and documentation

@HydrogenSulfate HydrogenSulfate merged commit 27aceee into PaddlePaddle:develop Nov 13, 2023
3 checks passed
@co63oc co63oc mentioned this pull request Nov 13, 2023
Contributor Author

co63oc commented Nov 13, 2023

> TODO: revise and improve the code and documentation

The revisions are in PR #636

huohuohuohuohuo123 pushed a commit to huohuohuohuohuo123/PaddleScience that referenced this pull request Aug 12, 2024
4 participants