
Add PSENet det model #290


Merged
merged 2 commits into mindspore-lab:main on May 25, 2023

Conversation

VictorHe-1 (Collaborator)

Add the PSENet detection model. The best F-score is 0.82, compared with 0.80 in the model zoo and 0.82 in PaddleOCR.

@VictorHe-1 force-pushed the main branch 2 times, most recently from e5ce17e to 579ea7a on May 17, 2023 02:39
Collaborator

Please add Chinese and English READMEs in the same directory as the yaml file; you can refer to the other models.

Collaborator Author

OK.

return data


class DetResizeForTest(object):
Collaborator

Replace this operation with the DetResize op added in #294 and set the args in the yaml config as follows, which should be equivalent to this DetResizeForTest op:

DetResize:
  target_limit_side: 1472
  limit_type: min

Please verify it by running the evaluation script
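For reference, a typical evaluation run in this repository would look something like `python tools/eval.py --config configs/det/psenet/pse_r152_icdar15.yaml`; the exact script name and config path are assumptions based on the yaml and directory names mentioned elsewhere in this PR.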

Collaborator Author

Already replaced and verified.

@VictorHe-1 force-pushed the main branch 4 times, most recently from 072ac8a to fa92f71 on May 24, 2023 07:51
shape_list[batch_idx])
result.append((np.array(boxes), np.array(scores).astype(np.float32)))

return result
Collaborator

The postprocessing output format should uniformly be a dict; for the specific field naming, please refer to #309.

Collaborator Author

OK, changed.

@VictorHe-1 force-pushed the main branch 3 times, most recently from c20fd61 to d5f7947 on May 24, 2023 08:50
@@ -0,0 +1,29 @@
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
Collaborator

Remove the license information.

Collaborator Author

ok

| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|-----------|-------------|--------------|----------------|------------|---------------|-------------|--------------|----------------|------------|--------------|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
| PSENet (PaddleOCR) | - | ResNet50_vd | ImageNet | 79.53% | 85.81% | 82.55% | - | - | - | - |
Collaborator

Remove the PaddleOCR comparison; the PaddleOCR comparisons in the other models' docs will also be removed later.

| **模型** | **环境配置** | **骨干网络** | **预训练数据集** | **Recall** | **Precision** | **F-score** | **训练时间** | **吞吐量** | **配置文件** | **模型权重下载** |
|----------|--------------|--------------|------------------|------------|---------------|-------------|--------------|------------|--------------|------------------|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
| PSENet (PaddleOCR) | - | ResNet50_vd | ImageNet | 79.53% | 85.81% | 82.55% | - | - | - | - |
Collaborator

Remove the PaddleOCR comparison; the PaddleOCR comparisons in the other models' docs will also be removed later.


| **模型** | **环境配置** | **骨干网络** | **预训练数据集** | **Recall** | **Precision** | **F-score** | **训练时间** | **吞吐量** | **配置文件** | **模型权重下载** |
|----------|--------------|--------------|------------------|------------|---------------|-------------|--------------|------------|--------------|------------------|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
Collaborator

After this PR is merged, export the MindIR file and update the link here.

Collaborator Author

OK.


| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|-----------|-------------|--------------|----------------|------------|---------------|-------------|--------------|----------------|------------|--------------|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
Collaborator

After this PR is merged, export the MindIR file and update the link here.

import mindspore as ms
from mindspore import Tensor
import mindspore.numpy as mnp

__all__ = ['L1BalancedCELoss']
__all__ = ['L1BalancedCELoss', 'DiceLoss', 'PSEDiceLoss']
Collaborator

If DiceLoss does not need to be called externally, it does not have to appear in __all__, so it is not exposed.
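For illustration, assuming only PSEDiceLoss needs to stay public, the export list would then be:

__all__ = ['L1BalancedCELoss', 'PSEDiceLoss']  # DiceLoss stays module-internal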

from .rec_loss import CTCLoss, AttentionLoss

__all__ = ['build_loss']

supported_losses = ['L1BalancedCELoss', 'CTCLoss', 'AttentionLoss']
supported_losses = ['L1BalancedCELoss', 'CTCLoss', 'AttentionLoss', 'DiceLoss', 'PSEDiceLoss']
Collaborator

If DiceLoss does not need to be called externally, it does not have to appear in __all__, so it is not exposed.

Collaborator Author

ok

@@ -1,10 +1,11 @@
import inspect
from .det_loss import L1BalancedCELoss
from .det_loss import L1BalancedCELoss, DiceLoss, PSEDiceLoss
Collaborator

If DiceLoss does not need to be called externally, there is no need to import it here, so it is not exposed.

@@ -308,3 +308,92 @@ def expand_poly(poly, distance: float, joint_type=pyclipper.JT_ROUND) -> List[li
offset = pyclipper.PyclipperOffset()
offset.AddPath(poly, joint_type, pyclipper.ET_CLOSEDPOLYGON)
return offset.Execute(distance)


def dist(a, b):
Collaborator

dist(), perimeter() and shrink(): if these three functions do not need to be called externally, it is recommended to move them into the PSEGtDecode class.
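A minimal sketch of the suggested refactor (the helper bodies below are the textbook definitions, not necessarily the PR's exact implementations):

import numpy as np


class PSEGtDecode:
    """Sketch only: keep the PSE-specific helpers as methods of the decode class."""

    @staticmethod
    def _dist(a, b):
        # Euclidean distance between two 2D points
        return np.sqrt(np.sum((a - b) ** 2))

    @staticmethod
    def _perimeter(bbox):
        # total edge length of a closed polygon given as an (N, 2) array of vertices
        return sum(PSEGtDecode._dist(bbox[i], bbox[(i + 1) % bbox.shape[0]])
                   for i in range(bbox.shape[0]))

    # _shrink(...) would move here the same way and call the helpers above.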

Collaborator

Does the code under postprocess/pse need to be run separately? If so, it should be documented in the README under configs/det/psenet.

Collaborator Author

Yes, it is needed. OK, will document it.
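(For context, this typically means building the pse extension in place before inference or evaluation, e.g. with something like `python setup.py build_ext --inplace` run inside the pse directory; the exact command and path are assumptions and should follow whatever the README ends up documenting. The discussion below also moves this build step behind an already-compiled check.)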

rect = cv2.minAreaRect(points)
bbox = cv2.boxPoints(rect)
else:
raise NotImplementedError
Collaborator

Suggested change
raise NotImplementedError
raise NotImplementedError(f"The value of param 'box_type' can only be 'quad', but got '{self._box_type}'.")

Collaborator Author

Added.

else:
raise NotImplementedError(f"The value of param 'box_type' can only be 'quad', but got '{self._box_type}'.")

if 'polygons' in self._rescale_fields:
Collaborator

The keys in rescale_fields should correspond to the keys of the output dict {'polys': poly_list, 'scores': score_list}. Therefore, 'polygons' here should be changed to 'polys'.

Collaborator Author

OK, changed.

Collaborator

Please explain the purpose of this compilation code here as well.

Collaborator Author

The PSEPostProcess class needs this module, and the pse module depends on the C++ code.

Collaborator (@SamitHuang), May 25, 2023

I don't recommend putting the C++ compilation in __init__. First, users may not have a C++ build environment, and those who don't use PSENet shouldn't need to install or compile it. Second, it would compile once on every startup or `import mindocr`, which is a poor experience.
Suggestion: put it in the __init__ of the PSEPostprocess class, check whether the dependency has already been compiled, and compile only if it hasn't.

Collaborator

[screenshot omitted] As shown above, this causes the import to fail.

Collaborator Author

OK. The pse module can be imported inside the PSEPostprocess class itself, so that it doesn't need to be installed or compiled when PSEPostprocess is not used. The pse module still has to work as a standalone module, so the C++ compilation stays in its __init__; the current __init__ code will add a check for whether the module has already been compiled, and skip the compilation if it has.
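A rough sketch of that compile-if-needed check (the artifact names and build command below are assumptions, not the PR's exact code):

# postprocess/pse/__init__.py -- sketch only
import glob
import os
import subprocess
import sys

_PSE_DIR = os.path.dirname(os.path.abspath(__file__))

# build the C++ extension in place only if no compiled artifact exists yet
if not glob.glob(os.path.join(_PSE_DIR, 'pse*.so')) and not glob.glob(os.path.join(_PSE_DIR, 'pse*.pyd')):
    subprocess.run([sys.executable, 'setup.py', 'build_ext', '--inplace'],
                   cwd=_PSE_DIR, check=True)

PSEPostprocess would then import the pse package only where it is actually used, so users who never instantiate it never trigger the build.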

Collaborator Author

Fixed.

Collaborator (@SamitHuang) left a comment

TODO: replace the C++-based postprocessing with pure Python code?
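For reference, the core of a pure-Python replacement could look roughly like the progressive scale expansion below (a generic NumPy sketch of the PSE algorithm, not this repository's code; names are hypothetical). A plain Python loop like this is of course much slower than the compiled version, which is presumably why this is left as an open TODO.

from collections import deque

import numpy as np


def pse_expand(kernels, labels):
    """Grow text-instance labels from the smallest kernel to the larger ones.

    kernels: (K, H, W) binary kernel masks ordered from smallest to largest.
    labels:  (H, W) int map with connected-component labels of kernels[0] (0 = background).
    """
    height, width = labels.shape
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))

    for kernel in kernels[1:]:
        # BFS: flood the current kernel starting from every already-labeled pixel
        queue = deque(zip(*np.nonzero(labels)))
        while queue:
            y, x = queue.popleft()
            lbl = labels[y, x]
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width \
                        and kernel[ny, nx] and labels[ny, nx] == 0:
                    labels[ny, nx] = lbl
                    queue.append((ny, nx))
    return labels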

@SamitHuang SamitHuang merged commit 7f951ae into mindspore-lab:main May 25, 2023
jianyunchao pushed a commit to jianyunchao/mindocr that referenced this pull request May 25, 2023
Co-authored-by: Samit <285365963@qq.com>
jianyunchao added a commit to jianyunchao/mindocr that referenced this pull request May 25, 2023
Co-authored-by: Samit <285365963@qq.com>
jianyunchao added a commit to jianyunchao/mindocr that referenced this pull request May 25, 2023
Co-authored-by: Samit <285365963@qq.com>
Comment on lines 206 to 205
area = Polygon(bbox).area
peri = perimeter(bbox)
Collaborator

Suggested change
area = Polygon(bbox).area
peri = perimeter(bbox)
poly = Polygon(bbox)
area, peri = poly.area, poly.length

Collaborator Author

ok

Comment on lines 209 to 211
pco = pyclipper.PyclipperOffset()
pco.AddPath(bbox, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)

shrinked_bbox = pco.Execute(-offset) # (N, 2) shape, N maybe larger than or smaller than 4.
Collaborator

Suggested change
pco = pyclipper.PyclipperOffset()
pco.AddPath(bbox, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)
shrinked_bbox = pco.Execute(-offset) # (N, 2) shape, N maybe larger than or smaller than 4.
offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)
shrinked_bbox = expand_poly(bbox, -offset) # (N, 2) shape, N maybe larger than or smaller than 4.

Collaborator Author

Accept


def __call__(self, data):
# img = deepcopy(data)
img = deepcopy(data['image'])
Collaborator

Suggested change
img = deepcopy(data['image'])

There's no need for a deep copy; cv2.resize() does not change the original array.

Collaborator Author

Changed already

Comment on lines +86 to +100
self.reduce_conv_c2 = _conv(in_channels[0], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c2 = _bn(out_channels)
self.reduce_relu_c2 = nn.ReLU()

self.reduce_conv_c3 = _conv(in_channels[1], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c3 = _bn(out_channels)
self.reduce_relu_c3 = nn.ReLU()

self.reduce_conv_c4 = _conv(in_channels[2], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c4 = _bn(out_channels)
self.reduce_relu_c4 = nn.ReLU()

self.reduce_conv_c5 = _conv(in_channels[3], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c5 = _bn(out_channels)
self.reduce_relu_c5 = nn.ReLU()
Collaborator

self.reduce = nn.CellList(
    [nn.SequentialCell([_conv(channels, out_channels, kernel_size=1, has_bias=True),
                        _bn(out_channels), nn.ReLU()])
     for channels in in_channels[::-1]]    # reverse order
)

Comment on lines +103 to +113
self.smooth_conv_p4 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p4 = _bn(out_channels)
self.smooth_relu_p4 = nn.ReLU()

self.smooth_conv_p3 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p3 = _bn(out_channels)
self.smooth_relu_p3 = nn.ReLU()

self.smooth_conv_p2 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p2 = _bn(out_channels)
self.smooth_relu_p2 = nn.ReLU()
Collaborator

self.smooth = nn.CellList(
    [nn.SequentialCell([_conv(out_channels, out_channels, kernel_size=3, has_bias=True),
                        _bn(out_channels), nn.ReLU()])
     for _ in range(3)]
)

p4 = self._resize_bilinear(p4, scale_factor=4)
p5 = self._resize_bilinear(p5, scale_factor=8)

out = self.concat((p2, p3, p4, p5))
Collaborator

features = features[::-1]
out = [self.reduce[0](features[0])]

for reduce, smooth, feature in zip(self.reduce[1:], self.smooth, features[1:]):
    feature = reduce(feature)
    feature = self._resize_bilinear(out[-1], scale_factor=2) + feature
    out.append(smooth(feature))
    
# bilinear scale here with the `for` loop as well

out = self.concat(out[::-1])

I think this way is much cleaner and makes the network architecture easier to read and understand. But it will require you to retrain or convert your pretrained weights.

Collaborator Author

Yes, it looks better and cleaner. I'll change it in the future when I have time to retrain the model.

Collaborator

No need to retrain the model; for simplicity you can just convert the weights (change the name suffix) and save them again.

Here's an example:
from mindspore import load_checkpoint, load_param_into_net, save_checkpoint

from mindocr.models.backbones.mindcv_models import MobileNetV3


original_ckpt_path = 'mobilenet_v3_pp_large_050_best_new2.ckpt'
network = MobileNetV3(arch='large', alpha=0.5, scale_last=False, bottleneck_params={'se_version': 'SqueezeExciteV2', 'always_expand':  True})

params = load_checkpoint(original_ckpt_path)
new_params = {}

feat_id = 3
latest_fid = feat_id
stage_name = 'stages.stage0.0'

cache = {stage_name: feat_id}

for name, val in params.items():
    if 'stem_conv' in name:
        new_name = name.replace('stem_conv', 'features')
    elif name.startswith('classifier'):
        new_name = name
    elif name.startswith(stage_name):
        new_name = name.replace(stage_name, 'features.' + str(feat_id))
    else:
        stage_name = name[:15]
        if stage_name not in cache:
            latest_fid += 1
            cache[stage_name] = latest_fid
        feat_id = cache[stage_name]

        new_name = name.replace(stage_name, 'features.' + str(feat_id))
    new_params[new_name] = val

param_not_load, _ = load_param_into_net(network, new_params)

output_ckpt = ''
save_checkpoint(network, output_ckpt)

'''
batch_size = model_predict.shape[0]
model_predict = self.upsample(model_predict, scale_factor=4)
texts = self.slice(model_predict, (0, 0, 0, 0), (batch_size, 1, 640, 640))
Collaborator

Do not use fixed integers for shapes (i.e. 640x640). What if somebody wants to train on 512x512? Training will simply crash.
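As an illustration, in the style of the other suggestions in this review, the spatial size could be taken from the tensor itself instead of being hard-coded (exact integration left to the author):

# take H and W from the upsampled prediction instead of assuming 640x640
batch_size, _, height, width = model_predict.shape
texts = self.slice(model_predict, (0, 0, 0, 0), (batch_size, 1, height, width))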

Collaborator Author

Ok, working on this issue

pad_mode=pad_mode, weight_init=init_value, has_bias=has_bias)


def _bn(channels, momentum=0.1):
Collaborator

We can move the _conv and _bn functions inside PSEFPN since they are PSENet-specific and can't be reused.

hadipash added a commit to hadipash/mindocr that referenced this pull request Jun 1, 2023
hadipash added a commit to hadipash/mindocr that referenced this pull request Jun 2, 2023