【Hackathon 5th No.69】 Classification large model: human-centric vision task SOLIDER #2995

Merged: 11 commits, Oct 18, 2023

@Yang-Changhui
Contributor

Yang-Changhui commented Oct 8, 2023

swin_tiny_patch4_window7_224:
image
swin_small_patch4_window7_224:
image
swin_base_patch4_window7_224:
image
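The error for swin_small_patch4_window7_224 is slightly larger. The problem is in the final nn.LayerNorm: the weights are identical, but the numerical precision may differ, which pushes the error above 1e-6.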
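As a rough illustration of the kind of check being described, the sketch below compares two forward outputs against a 1e-6 tolerance and shows that the same normalization computed in float32 and float64 does not match bit for bit. It is not the verification script used in this PR: `forward_diff` and `layer_norm` are hypothetical helpers written in plain NumPy, and the random input only stands in for real model features.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Plain NumPy layer normalization over the last axis (no affine weights)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def forward_diff(reference, candidate, atol=1e-6):
    """Return (max_abs_diff, mean_abs_diff, passed) for two output arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError("outputs must have the same shape")
    abs_diff = np.abs(reference - candidate)
    max_diff = float(abs_diff.max())
    return max_diff, float(abs_diff.mean()), max_diff <= atol


# Random stand-in for a batch of features; real checks would use model outputs.
features = np.random.default_rng(0).standard_normal((2, 768))

# Same math, different precision: the results are close but not identical.
max_diff, mean_diff, passed = forward_diff(
    layer_norm(features), layer_norm(features.astype(np.float32)))
print(f"max diff = {max_diff:.3e}, mean diff = {mean_diff:.3e}, passed = {passed}")
```

Whether a given model passes a 1e-6 threshold then depends on how such small per-layer differences accumulate, which is consistent with the observation that identical weights can still produce slightly different outputs.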

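The only change to the original code is in SwinTransformerBlock, where <= was changed to <. I tried making the change in swin_transformer-variant instead, but it had no effect, so this part is modified in the original code.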
image

@paddle-bot

paddle-bot bot commented Oct 8, 2023

Thanks for your contribution!

@cuicheng01
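Collaborator

> The only change to the original code is in SwinTransformerBlock, where <= was changed to <. I tried making the change in swin_transformer-variant instead, but it had no effect, so this part is modified in the original code. image

Why does this need to be changed?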

@Yang-Changhui
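Contributor Author

> The only change to the original code is in SwinTransformerBlock, where <= was changed to <. I tried making the change in swin_transformer-variant instead, but it had no effect, so this part is modified in the original code. image
>
> Why does this need to be changed?

Without the change, in the last call to SwinTransformerBlock, min(self.input_resolution) equals self.window_size, so attn_mask = None. In SOLIDER, however, attn_mask is not None in that last call. With the change, the two behave consistently.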
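The effect of the comparison operator can be sketched in isolation. This is a simplified illustration, not the PaddleClas code: `resolve_window` and its `strict` flag are hypothetical names, and it assumes, as in common Swin implementations, that the attention mask is only built when `shift_size > 0`.

```python
def resolve_window(input_resolution, window_size, shift_size, strict=False):
    """Return (window_size, shift_size, needs_attn_mask) for one block.

    strict=False mimics the `<=` comparison, strict=True mimics `<`.
    """
    smallest = min(input_resolution)
    clamp = smallest < window_size if strict else smallest <= window_size
    if clamp:
        # The window covers the whole feature map: use one window, no shift.
        shift_size = 0
        window_size = smallest
    assert 0 <= shift_size < window_size, "shift_size must be in [0, window_size)"
    # Assumption: a mask is computed only for shifted windows.
    return window_size, shift_size, shift_size > 0


# Last stage of a 224x224 input with patch size 4: the feature map is 7x7,
# the window size is 7, and shifted blocks request shift_size = 7 // 2 = 3.
print(resolve_window((7, 7), 7, 3, strict=False))  # (7, 0, False)
print(resolve_window((7, 7), 7, 3, strict=True))   # (7, 3, True)
```

With `<=`, the 7x7 case resets the shift to zero, so no mask is needed. With `<`, the shift of 3 survives and a mask is required, which matches the behavior described for SOLIDER.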

@Yang-Changhui
Contributor Author

@cuicheng01 Hi, are there any other changes I need to make?

@cuicheng01
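Collaborator

> @cuicheng01 Hi, are there any other changes I need to make?

I'd still recommend making the change in the variant, to avoid introducing possible errors into the original network.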

@Yang-Changhui
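Contributor Author

> @cuicheng01 Hi, are there any other changes I need to make?
>
> I'd still recommend making the change in the variant, to avoid introducing possible errors into the original network.

In that case the variant code may end up fairly redundant. Is that acceptable?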

@Yang-Changhui
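Contributor Author

@cuicheng01 Hi, could I instead turn this condition in the original code into a function? That would make it easy to override through inheritance without affecting the original code, as shown below: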
image
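Since the screenshot is not reproduced here, the idea of moving the condition into an overridable method can be sketched as follows. The class and method names (`WindowBlock`, `SoliderWindowBlock`, `should_clamp_window`) are hypothetical and only model the window setup, not the real SwinTransformerBlock.

```python
class WindowBlock:
    """Hypothetical stand-in for the window setup of a Swin block."""

    def __init__(self, input_resolution, window_size=7, shift_size=0):
        self.input_resolution = input_resolution
        self.window_size = window_size
        self.shift_size = shift_size
        # The comparison lives in a method so subclasses can override it.
        if self.should_clamp_window():
            self.shift_size = 0
            self.window_size = min(self.input_resolution)
        assert 0 <= self.shift_size < self.window_size

    def should_clamp_window(self):
        # Default behavior: clamp when the window is at least as large as the map.
        return min(self.input_resolution) <= self.window_size


class SoliderWindowBlock(WindowBlock):
    """Variant that keeps the shift when the window exactly fits."""

    def should_clamp_window(self):
        return min(self.input_resolution) < self.window_size


print(WindowBlock((7, 7), 7, 3).shift_size)         # 0
print(SoliderWindowBlock((7, 7), 7, 3).shift_size)  # 3
```

The base class keeps its default behavior, and the variant changes only the one comparison, which is the low-intrusion outcome the reviewers asked for.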

@Yang-Changhui
Contributor Author

@cuicheng01 Hi, do you have time to review this? Is there anything else that needs to be changed?

@shiyutang
Collaborator

Hi, could you provide the alignment verification records, including the forward-pass alignment results for all three models?

@Yang-Changhui
Contributor Author

Yang-Changhui commented Oct 12, 2023

@shiyutang Hi, here are the alignment records for the three models. The links are explained in #2992.
solider_base_forward_diff.log
solider_small_forward_diff.log
solider_tiny_forward_diff.log

Comment on lines 21 to 32
| Person Re-identification (mAP/R1) w/o re-ranking | Market1501 | 91.6/96.1 | 93.3/96.6 | 93.9/96.9 |
| | MSMT17 | 67.4/85.9 | 76.9/90.8 | 77.1/90.7 |
| Person Re-identification (mAP/R1) with re-ranking | Market1501 | 95.3/96.6 | 95.4/96.4 | 95.6/96.7 |
| | MSMT17 | 81.5/89.2 | 86.5/91.7 | 86.5/91.7 |
| Attribute Recognition (mA) | PETA_ZS | 74.37 | 76.21 | 76.43 |
| | RAP_ZS | 74.23 | 75.95 | 76.42 |
| | PA100K | 84.14 | 86.25 | 86.37 |
| Person Search (mAP/R1) | CUHK-SYSU | 94.9/95.7 | 95.5/95.8 | 94.9/95.5 |
| | PRW | 56.8/86.8 | 59.8/86.7 | 59.7/86.8 |
| Pedestrian Detection (MR-2) | CityPersons | 10.3/40.8 | 10.0/39.2 | 9.7/39.4 |
| Human Parsing (mIOU) | LIP | 57.52 | 60.21 | 60.50 |
| Pose Estimation (AP/AR) | COCO | 74.4/79.6 | 76.3/81.3 | 76.6/81.5 |
Collaborator

Were these results verified with the reproduced models? If not, I suggest deleting them and adding the alignment logs instead.

Comment on lines 15 to 16
# TODO (littletomatodonkey), uncomment the line will cause failure of jit.save
# assert [H, W] == self.img_size[:2], "Input image size ({H}*{W}) doesn't match model ({}*{}).".format(H, W, self.img_size[0], self.img_size[1])
Collaborator

Unnecessary comments can be deleted.

Collaborator

@shiyutang shiyutang left a comment

swin transformer - small cannot be downloaded. image

@Yang-Changhui
Contributor Author

@shiyutang Hi, it has been updated.

import numpy as np
import paddle
import paddle.nn as nn
from ppcls.arch.backbone.legendary_models.swin_transformer import SwinTransformer, _load_pretrained, \
Collaborator

@shiyutang shiyutang Oct 13, 2023

This import code causes a CI error. Please change it to a relative-path import, following this example: https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/arch/backbone/variant_models/pp_lcnet_variant.py#L4

Collaborator

After the change, verify it with the following commands:

cd PaddleClas
cd deploy
python python/predict_cls.py -c configs/inference_cls.yaml


| model | weight | log |
| ----------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| swin_tiny_patch4_window7_224 | Link: https://pan.baidu.com/s/1QdUviOSW2RdS3UGGxxHEAA?pwd=qcdd <br/>Extraction code: qcdd | Link: https://pan.baidu.com/s/1W5zUFboMMhXETy4HEWbM3Q?pwd=45nx <br/>Extraction code: 45nx |
@Yang-Changhui
Contributor Author

@shiyutang Hi, everything has been changed and verified.

shiyutang
shiyutang previously approved these changes Oct 13, 2023
Collaborator

@shiyutang shiyutang left a comment

LGTM

# if window size is larger than input resolution, we don't partition windows
self.shift_size = 0
self.window_size = min(self.input_resolution)
assert 0 <= self.shift_size < self.window_size, "shift_size must in 0-window_size"
Collaborator

Leave two blank lines between classes.

use_ssld=False,
use_imagenet22k_pretrained=False,
use_imagenet22kto1k_pretrained=False,
**kwargs):
Collaborator

Parameters such as use_ssld, use_imagenet22k_pretrained, and use_imagenet22kto1k_pretrained do not need to be exposed.

Contributor Author

Hi, these three parameters are exposed to match the parameter setup in swin_transformer, because _load_pretrained needs them passed in. Should I instead inherit directly from classes such as swin_tiny_patch4_window7_224 in swin_transformer?

Collaborator

_load_pretrained can be controlled through **kwargs, so these three parameters do not need to be exposed.

Contributor Author

That works. Do I then need to add initial values for these three parameters to the yml config files? The current config files do not contain them.

Contributor Author

Should I also change the original swin_transformer code to this form? In addition, pretrained=False could also be left unexposed and controlled through **kwargs.

Collaborator

The original source code does not need to change; only the variant code needs to be changed.
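One way to read this suggestion is sketched below: the factory function accepts `**kwargs`, pulls out the loader-only options, and forwards the rest to the model. This is an assumption about the intended pattern, not the merged code. `DummyBackbone`, `load_pretrained_stub`, `build_variant`, and the placeholder URL are hypothetical stand-ins; only the three option names come from the discussion.

```python
LOADER_KEYS = (
    "use_ssld",
    "use_imagenet22k_pretrained",
    "use_imagenet22kto1k_pretrained",
)


class DummyBackbone:
    """Hypothetical model that simply records its configuration."""

    def __init__(self, **config):
        self.config = config
        self.loader_options = None


def load_pretrained_stub(pretrained, model, url, use_ssld=False,
                         use_imagenet22k_pretrained=False,
                         use_imagenet22kto1k_pretrained=False):
    """Stand-in loader: records what it was asked to do instead of downloading."""
    model.loader_options = {
        "pretrained": pretrained,
        "url": url,
        "use_ssld": use_ssld,
        "use_imagenet22k_pretrained": use_imagenet22k_pretrained,
        "use_imagenet22kto1k_pretrained": use_imagenet22kto1k_pretrained,
    }


def build_variant(pretrained=False, **kwargs):
    # Split loader-only options out of kwargs so the model never sees them.
    loader_kwargs = {key: kwargs.pop(key) for key in LOADER_KEYS if key in kwargs}
    model = DummyBackbone(**kwargs)
    load_pretrained_stub(pretrained, model,
                         "https://example.invalid/weights.pdparams",
                         **loader_kwargs)
    return model


model = build_variant(embed_dim=96, use_ssld=True)
print(model.config)                      # {'embed_dim': 96}
print(model.loader_options["use_ssld"])  # True
```

With this shape, the config files only need to mention the three options when a non-default value is wanted, since the loader's own defaults apply otherwise.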

@cuicheng01
Collaborator

Please also provide the verification code, for both torch and paddle; a cloud-drive link is fine. Once verification passes, this PR can be merged.

@Yang-Changhui
Contributor Author

Link: https://pan.baidu.com/s/1SFxcaklLGAAki6AjZf_lSQ?pwd=sk36
Extraction code: sk36

PatchEmbed, BasicLayer, SwinTransformerBlock

MODEL_URLS_SOLIDER = {
"SwinTransformer_tiny_patch4_window7_224_SOLIDER":
Collaborator

I noticed the name in this link is not quite right. I have already changed SOILDER to SOLIDER, so please update it here.

## 2. Alignment logs and models

| model | weight | log |
| ----------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
Collaborator

I noticed the name in this link is not quite right. I have already changed SOILDER to SOLIDER, so please update it here.

@@ -91,6 +91,9 @@
from .variant_models.pp_lcnetv2_variant import PPLCNetV2_base_ShiTu
from .variant_models.efficientnet_variant import EfficientNetB3_watermark
from .variant_models.foundation_vit_variant import CLIP_large_patch14_224_aesthetic
from .variant_models.swin_transformer_variant import SwinTransformer_tiny_patch4_window7_224_SOLIDER
Collaborator

Please import all of these in a single statement.

ppcls/arch/backbone/variant_models/__init__.py (outdated, resolved)
@Yang-Changhui
Contributor Author

@cuicheng01 Hi, the changes are done. Please review when you have time.

@cuicheng01 cuicheng01 merged commit c446df9 into PaddlePaddle:develop Oct 18, 2023
2 checks passed
psky1111 pushed a commit to psky1111/PaddleClas that referenced this pull request Oct 27, 2023
* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider
psky1111 added a commit to psky1111/PaddleClas that referenced this pull request Oct 27, 2023
fix time cost problem

Update swin_transformer.py

fix the speed and memory problem

reduce the unnecessary calculation when patch matches resolution

fix conflict

remove check resolution function

Revert "fix conflict"

This reverts commit d7a7dad.

fix conflict

remove the conflict checkpoint function

【Hackathon 5th No.69】 分类大模型--人体视觉任务SOLIDER (PaddlePaddle#2995)

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

update doc about PPHGNetV2 (PaddlePaddle#3002)

fix clip patch embedding resolution problem

support non 224 resolution

integrate the pading function to one

adjust function name

fix the resolution problem for clip-vision transformer part and swim transformer

fix the resolution problem for clip-vision transformer part and swim transformer
psky1111 added a commit to psky1111/PaddleClas that referenced this pull request Oct 27, 2023
Update .gitignore

fix time cost problem

Update swin_transformer.py

fix the speed and memory problem

reduce the unnecessary calculation when patch matches resolution

fix conflict

remove check resolution function

Revert "fix conflict"

This reverts commit d7a7dad.

fix conflict

remove the conflict checkpoint function

【Hackathon 5th No.69】 分类大模型--人体视觉任务SOLIDER (PaddlePaddle#2995)

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

update doc about PPHGNetV2 (PaddlePaddle#3002)

fix clip patch embedding resolution problem

support non 224 resolution

integrate the pading function to one

adjust function name

fix the resolution problem for clip-vision transformer part and swim transformer

fix the resolution problem for clip-vision transformer part and swim transformer
cuicheng01 pushed a commit that referenced this pull request Oct 31, 2023
* Update foundation_vit.py

Update .gitignore

fix time cost problem

Update swin_transformer.py

fix the speed and memory problem

reduce the unnecessary calculation when patch matches resolution

fix conflict

remove check resolution function

Revert "fix conflict"

This reverts commit d7a7dad.

fix conflict

remove the conflict checkpoint function

【Hackathon 5th No.69】 分类大模型--人体视觉任务SOLIDER (#2995)

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

* add_solider

update doc about PPHGNetV2 (#3002)

fix clip patch embedding resolution problem

support non 224 resolution

integrate the pading function to one

adjust function name

fix the resolution problem for clip-vision transformer part and swim transformer

fix the resolution problem for clip-vision transformer part and swim transformer

* fix cache problem

using the huggingface plan and drop the cache

* Revert "fix cache problem"

This reverts commit 8f7ab55.

* fix resolution problem

* update big model backbone

* Revert "update big model backbone"

This reverts commit 04a39f7.
@shiyutang shiyutang added the Contributor PR is merged add this when a contributor's PR is merged. label Nov 8, 2023
Labels
Contributor PR is merged add this when a contributor's PR is merged. contributor
3 participants