
feat: add model script, training recipe and pretrained weight of halonet #720

Merged (15 commits, Aug 8, 2023)

Conversation

rabbit-fgh (Contributor)
Thank you for your contribution to the MindCV repo.
Before submitting this PR, please make sure:

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

## Introduction

Researchers from Google Research and UC Berkeley have developed a new family of self-attention models that can outperform standard baselines and even high-performance convolutional models.[[1](#references)]

Blocked self-attention: the input image is divided into multiple blocks, and self-attention is applied within each block. However, attending only to the information inside a block inevitably loses context. Therefore, before computing self-attention, a haloing operation is performed on each block: each block is padded with a ring of neighboring pixels from the original feature map, so that the receptive field of each block becomes appropriately larger and covers more information.
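The blocked attention with haloing described above can be sketched in NumPy. The block size and halo width below are illustrative, not the paper's settings:

```python
import numpy as np

def extract_halo_blocks(x, block=4, halo=1):
    """Split a (H, W) feature map into non-overlapping query blocks and,
    for each block, return a larger key/value window that includes a
    `halo` ring of context. Out-of-image positions are zero-padded."""
    H, W = x.shape
    padded = np.pad(x, halo, mode="constant")  # zero halo ring around the image
    windows = []
    for i in range(0, H, block):
        for j in range(0, W, block):
            # each query block of size `block` attends to a window
            # of size block + 2*halo centered on it
            windows.append(padded[i:i + block + 2 * halo,
                                  j:j + block + 2 * halo])
    return np.stack(windows)

x = np.arange(64, dtype=float).reshape(8, 8)
w = extract_halo_blocks(x, block=4, halo=1)
print(w.shape)  # (4, 6, 6): 4 blocks, each a 6x6 haloed window
```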
Collaborator

The line break doesn't seem to have worked here.


| Model | Context | Top-1 (%) | Top-5 (%) | Params (M) | Recipe | Download |
| ---------- | -------- | --------- | --------- | ---------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| halonet50t | D910X8-G | 79.53 | 94.79 | 28.59 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/halonet/halonet_50t_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/halonet/halonet_50t-533da6be.ckpt) |
Collaborator

Rename this to halonet_50t.


| Model | Context | Top-1 (%) | Top-5 (%) | Params (M) | Recipe | Download |
| ---------- | -------- | --------- | --------- | ---------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| halonet50t | D910X8-G | 79.53 | 94.79 | 28.59 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/halonet/halonet_50t_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/halonet/halonet_50t-533da6be.ckpt) |
Collaborator

Is the parameter count really 28.59M? It probably isn't 28.59. Check the model parameter info printed in the training log.

```yaml
# lr scheduler
scheduler: 'warmup_cosine_decay'
min_lr: 0.000006
lr: 0.000234375
```
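For reference, a warmup-plus-cosine-decay schedule of the kind this config names can be sketched as below. The warmup and total step counts are illustrative, and MindCV's actual scheduler implementation may differ in details:

```python
import math

def warmup_cosine_decay(step, warmup_steps, total_steps, lr, min_lr):
    """Linear warmup to `lr`, then cosine decay down to `min_lr`."""
    if step < warmup_steps:
        return lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * progress))

# peak and floor taken from the config above
peak, floor = 0.000234375, 0.000006
print(warmup_cosine_decay(0, 10, 100, peak, floor))    # early in warmup
print(warmup_cosine_decay(100, 10, 100, peak, floor))  # decayed to the floor
```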
Collaborator

This doesn't look like the right lr... shouldn't it be 0.00125? Please confirm, and cross-check the other parameters as well.



```python
default_cfgs = {
    "halonet50t": _cfg(
```
Collaborator

This should be halonet_50t. Refer to edgenext.py, and add input_size=(3, 256, 256) to _cfg.
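The suggested change amounts to keeping one canonical key that matches the factory function name and passing the input size through _cfg. A simplified sketch, in which the _cfg stand-in carries far fewer fields than MindCV's real helper:

```python
def _cfg(url="", **kwargs):
    # simplified stand-in for MindCV's _cfg helper
    return {"url": url, "num_classes": 1000, "input_size": (3, 224, 224), **kwargs}

default_cfgs = {
    # key matches the factory function name halonet_50t
    "halonet_50t": _cfg(input_size=(3, 256, 256)),
}

print(default_cfgs["halonet_50t"]["input_size"])  # (3, 256, 256)
```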

```python
def halonet_50t(pretrained: bool = False, num_classes: int = 1000, in_channels=3, **kwargs):
    """Get HaloNet model.
    Refer to the base class `models.HaloNet` for more details."""
    default_cfg = default_cfgs["halonet"]
```
Collaborator

Align this with the name in default_cfgs above.

self.dim_out_qk,
1,
stride=self.block_stride,

Collaborator

What is this blank line for? If it serves no purpose, remove it.

```python
x = self.conv1(x)
x = self.conv2(x)
x = self.conv3(x)
x = ms.numpy.pad(x, ((0, 0), (0, 0), (1, 0), (1, 0)), constant_values=-32768)
```
Collaborator

Just use the code from your earlier commit here: initialize self.pad in __init__ and call it in construct. Don't hard-code these strange magic numbers.
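For context, the quoted line pads one row and one column on the top and left of the spatial dimensions with a large negative sentinel, so those positions can never win a subsequent max reduction. A NumPy equivalent of what ms.numpy.pad does here, for illustration only:

```python
import numpy as np

x = np.ones((1, 1, 3, 3))
# pad 1 on the top and left of the two spatial dims only, filling with a
# large negative sentinel that acts like -inf for max pooling
y = np.pad(x, ((0, 0), (0, 0), (1, 0), (1, 0)), constant_values=-32768)
print(y.shape)  # (1, 1, 4, 4)
```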

@@ -0,0 +1,81 @@
HaloNet
Collaborator

This is a heading, so it should be capitalized. Take a look at how the other model docs write it.

@Songyuanwei Songyuanwei changed the title halonet pr feat: add model script, training recipe and pretrained weight of halonet [WIP] Aug 1, 2023
@Songyuanwei Songyuanwei changed the title feat: add model script, training recipe and pretrained weight of halonet [WIP] feat: add model script, training recipe and pretrained weight of halonet Aug 3, 2023
```python
x = msnp.tensordot(q, rel_k, axes=1)
x = ops.reshape(x, (-1, W, rel_size))
# pad to shift from relative to absolute indexing
pad = ops.Pad(paddings=((0, 0), (0, 0), (0, 1)))
```
Collaborator

Could this use the functional interface instead?
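For context, the quoted snippet is the start of the standard pad-and-reshape ("skewing") trick that converts relative-position logits of shape (B, W, 2W-1) into absolute logits of shape (B, W, W). A NumPy sketch of the complete trick:

```python
import numpy as np

def rel_to_abs(x):
    """Convert (B, W, 2W-1) relative-position logits to (B, W, W) absolute
    ones: out[b, i, j] = x[b, i, j - i + W - 1]."""
    B, W, rel = x.shape
    assert rel == 2 * W - 1
    x = np.pad(x, ((0, 0), (0, 0), (0, 1)))  # (B, W, 2W): one zero per row
    x = x.reshape(B, W * 2 * W)              # flatten; zeros skew each row
    x = np.pad(x, ((0, 0), (0, W - 1)))      # total length (W+1)*(2W-1)
    x = x.reshape(B, W + 1, 2 * W - 1)
    return x[:, :W, W - 1:]                  # absolute-indexed (B, W, W)

x = np.arange(2 * 3 * 5, dtype=float).reshape(2, 3, 5)  # W=3, rel_size=5
print(rel_to_abs(x).shape)  # (2, 3, 3)
```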

@geniuspatrick geniuspatrick merged commit 94ef850 into mindspore-lab:main Aug 8, 2023
5 checks passed