Add PSENet det model #290
e5ce17e to 579ea7a
Add Chinese and English READMEs in the same directory as the yaml file; you can refer to the other models.
OK.
        return data


class DetResizeForTest(object):
Replace this operation with DetResize, added in #294, and set the args in the yaml config as follows, which should be equivalent to this DetResizeForTest op:
DetResize:
  target_limit_side: 1472
  limit_type: min
Please verify it by running the evaluation script.
Already replaced and verified.
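For reference, the resize arithmetic behind such a config can be sketched as follows. This is a minimal sketch under assumptions, not the DetResize code from #294: it assumes that `limit_type: min` means the image is enlarged until its shorter side reaches `target_limit_side`, and that both sides are then rounded to a multiple of 32. The function name `min_side_resize_shape` and the `divisor` parameter are hypothetical.

```python
def min_side_resize_shape(height, width, target_limit_side=1472, divisor=32):
    """Return (new_height, new_width) for a hypothetical 'min'-type resize.

    Assumption: the image is only enlarged, never shrunk, so the scale
    ratio is 1.0 when the shorter side already reaches the target.
    """
    short_side = min(height, width)
    ratio = target_limit_side / short_side if short_side < target_limit_side else 1.0

    def snap(length):
        # Round to the nearest multiple of `divisor`, but never below it.
        return max(int(round(length * ratio / divisor)) * divisor, divisor)

    return snap(height), snap(width)
```

Comparing the shapes produced by the old and new ops on a few images is a quick sanity check before running the full evaluation script.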
072ac8a to fa92f71
shape_list[batch_idx])
result.append((np.array(boxes), np.array(scores).astype(np.float32)))

return result
Use a dictionary as the unified output format for postprocessing; for the specific field names, please refer to #309.
OK, changed.
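A minimal sketch of the requested change, assuming the per-image results are still collected as `(boxes, scores)` tuples before being returned. The key names `polys` and `scores` are the ones quoted elsewhere in this review; the helper name `pack_det_results` is hypothetical, and #309 remains the authority on the exact fields.

```python
import numpy as np


def pack_det_results(per_image_results):
    """Convert [(boxes, scores), ...] into {'polys': [...], 'scores': [...]}.

    Hypothetical helper: one list entry per image in the batch.
    """
    polys, scores = [], []
    for boxes, box_scores in per_image_results:
        polys.append(np.asarray(boxes))
        scores.append(np.asarray(box_scores, dtype=np.float32))
    return {"polys": polys, "scores": scores}
```

Returning a dictionary lets downstream code look fields up by name instead of relying on tuple order.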
c20fd61 to d5f7947
mindocr/postprocess/pse/__init__.py
Outdated
@@ -0,0 +1,29 @@
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
Remove the license information.
ok
configs/det/psenet/README.md
Outdated
| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|---|---|---|---|---|---|---|---|---|---|---|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
| PSENet (PaddleOCR) | - | ResNet50_vd | ImageNet | 79.53% | 85.81% | 82.55% | - | - | - | - |
Remove the PaddleOCR comparison; the PaddleOCR comparisons in the other models' documentation will also be removed later.
configs/det/psenet/README_CN.md
Outdated
| **模型** | **环境配置** | **骨干网络** | **预训练数据集** | **Recall** | **Precision** | **F-score** | **训练时间** | **吞吐量** | **配置文件** | **模型权重下载** |
|---|---|---|---|---|---|---|---|---|---|---|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
| PSENet (PaddleOCR) | - | ResNet50_vd | ImageNet | 79.53% | 85.81% | 82.55% | - | - | - | - |
Remove the PaddleOCR comparison; the PaddleOCR comparisons in the other models' documentation will also be removed later.
| **模型** | **环境配置** | **骨干网络** | **预训练数据集** | **Recall** | **Precision** | **F-score** | **训练时间** | **吞吐量** | **配置文件** | **模型权重下载** |
|---|---|---|---|---|---|---|---|---|---|---|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
After this PR is merged, export the MindIR file and update the link here.
OK.
| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|---|---|---|---|---|---|---|---|---|---|---|
| PSENet | D910x8-MS2.0-G | ResNet-152 | ImageNet | 79.39% | 84.83% | 82.02% | 138 s/epoch | 7.57 img/s | [yaml](pse_r152_icdar15.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/psenet/psenet_resnet152_ic15-6058a798.ckpt) |
After this PR is merged, export the MindIR file and update the link here.
mindocr/losses/det_loss.py
Outdated
import mindspore as ms
from mindspore import Tensor
import mindspore.numpy as mnp

- __all__ = ['L1BalancedCELoss']
+ __all__ = ['L1BalancedCELoss', 'DiceLoss', 'PSEDiceLoss']
If DiceLoss does not need to be called externally, it can be left out of __all__ so that it is not exposed.
mindocr/losses/builder.py
Outdated
from .rec_loss import CTCLoss, AttentionLoss

__all__ = ['build_loss']

- supported_losses = ['L1BalancedCELoss', 'CTCLoss', 'AttentionLoss']
+ supported_losses = ['L1BalancedCELoss', 'CTCLoss', 'AttentionLoss', 'DiceLoss', 'PSEDiceLoss']
If DiceLoss does not need to be called externally, it can be left out of __all__ so that it is not exposed.
ok
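The effect of leaving a name out of `__all__` can be illustrated with a small, self-contained sketch. It builds a stand-in module in memory (the module name `demo_losses` and the class bodies are hypothetical, not the MindOCR code) and shows that a star import only picks up the names listed in `__all__`, while the helper class is still usable inside the module.

```python
import sys
import types

# Source of a stand-in module; `demo_losses` is a hypothetical name.
source = '''
__all__ = ["PSEDiceLoss"]

class DiceLoss:
    """Internal helper, intentionally left out of __all__."""

class PSEDiceLoss:
    """Public loss that uses DiceLoss internally."""
    def __init__(self):
        self.dice = DiceLoss()
'''
module = types.ModuleType("demo_losses")
exec(source, module.__dict__)
sys.modules["demo_losses"] = module

# Star-import into a fresh namespace and list what arrived.
namespace = {}
exec("from demo_losses import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['PSEDiceLoss']
```

Note that `__all__` only governs star imports; an explicit `from demo_losses import DiceLoss` would still work, which is why the review also asks to drop the import in the builder.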
mindocr/losses/builder.py
Outdated
@@ -1,10 +1,11 @@
import inspect
- from .det_loss import L1BalancedCELoss
+ from .det_loss import L1BalancedCELoss, DiceLoss, PSEDiceLoss
If DiceLoss does not need to be called externally, there is no need to import it here or expose it.
@@ -308,3 +308,92 @@ def expand_poly(poly, distance: float, joint_type=pyclipper.JT_ROUND) -> List[li
    offset = pyclipper.PyclipperOffset()
    offset.AddPath(poly, joint_type, pyclipper.ET_CLOSEDPOLYGON)
    return offset.Execute(distance)


def dist(a, b):
dist(), perimeter(), shrink(): if these three functions do not need to be called externally, it is recommended to move them into the class PSEGtDecode.
mindocr/postprocess/pse/README.md
Outdated
Does the code in postprocess/pse need to be run separately? If so, this should be explained in the README under configs/det/psenet.
Yes, it does. OK.
    rect = cv2.minAreaRect(points)
    bbox = cv2.boxPoints(rect)
else:
    raise NotImplementedError
Suggested change:
- raise NotImplementedError
+ raise NotImplementedError(f"The value of param 'box_type' can only be 'quad', but got '{self._box_type}'.")
Added.
else:
    raise NotImplementedError(f"The value of param 'box_type' can only be 'quad', but got '{self._box_type}'.")

if 'polygons' in self._rescale_fields:
The keys in rescale_fields should correspond to the keys of the output result {'polys': poly_list, 'scores': score_list}. Therefore, 'polygons' here should be changed to 'polys'.
OK, changed.
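Why the key names have to match can be shown with a small sketch. It assumes that rescaling means dividing each point by the resize ratios used in preprocessing; the helper name `rescale_fields` and its arguments are hypothetical and not the PSEPostprocess implementation.

```python
import numpy as np


def rescale_fields(result, fields, scale_w, scale_h):
    """Rescale the listed fields of `result` back to the original image size.

    Hypothetical helper: every name in `fields` must be a key of `result`,
    which is why 'polygons' fails against a result keyed by 'polys'.
    """
    unknown = [name for name in fields if name not in result]
    if unknown:
        raise KeyError(f"rescale fields {unknown} not found in result keys {list(result)}")

    scale = np.array([scale_w, scale_h], dtype=np.float32)
    rescaled = dict(result)
    for name in fields:
        # Each entry is an (N, 2) array of x, y points; divide by the resize ratio.
        rescaled[name] = [np.asarray(poly, dtype=np.float32) / scale for poly in result[name]]
    return rescaled
```

Failing loudly on an unknown field name surfaces a mismatch such as 'polygons' versus 'polys' immediately, instead of silently skipping the rescale.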
mindocr/postprocess/pse/README.md
Outdated
Please explain the purpose of this compilation code here as well.
The PSEPostProcess class needs this module, and the pse module depends on the C++ code.
It is not recommended to put the C++ compilation in __init__. First, users do not necessarily have a C++ build environment, and if they do not use PSENet there is no need to install or compile anything. Second, compiling every time mindocr is started or imported makes for a poor experience.
Suggestion: put it in the __init__ of the PSEPostprocess class, check whether the dependency has already been compiled, and compile it only if it has not.
OK. The pse module can be imported separately inside the PSEPostprocess class, so that it does not need to be installed or compiled when PSEPostprocess is not used. The pse module needs to be usable on its own, that is, as a module that works independently, so the C++ compilation still needs to be added in its __init__. The current __init__ code will add a check for whether the module has already been compiled and will skip compilation if it has.
Modified.
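The "compile only if missing, and only when the postprocess class is constructed" pattern discussed above could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: `ensure_extension`, `build_fn`, and `LazyPostprocess` are hypothetical names, and `build_fn` stands for whatever actually compiles the C++ code (for example, running `make` in the extension directory).

```python
import importlib


def ensure_extension(module_name, build_fn):
    """Import `module_name`, calling `build_fn()` first only if it is missing.

    `build_fn` is a hypothetical callable that compiles the extension.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        build_fn()
        importlib.invalidate_caches()  # let the import system see new files
        return importlib.import_module(module_name)


class LazyPostprocess:
    """Stand-in for a postprocess class that needs a compiled dependency."""

    def __init__(self, module_name, build_fn):
        # The build check runs on construction, not on package import.
        self._ext = ensure_extension(module_name, build_fn)
```

With this layout, importing the package never triggers a build, and users who do not instantiate the class do not need a C++ toolchain.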
TODO: replace c++ based postprocessing with pure python code?
Co-authored-by: Samit <285365963@qq.com>
area = Polygon(bbox).area
peri = perimeter(bbox)
Suggested change:
- area = Polygon(bbox).area
- peri = perimeter(bbox)
+ poly = Polygon(bbox)
+ area, peri = poly.area, poly.length
ok
pco = pyclipper.PyclipperOffset()
pco.AddPath(bbox, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)

shrinked_bbox = pco.Execute(-offset)  # (N, 2) shape, N maybe larger than or smaller than 4.
Suggested change:
- pco = pyclipper.PyclipperOffset()
- pco.AddPath(bbox, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
- offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)
- shrinked_bbox = pco.Execute(-offset)  # (N, 2) shape, N maybe larger than or smaller than 4.
+ offset = min((int)(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)
+ shrinked_bbox = expand_poly(bbox, -offset)  # (N, 2) shape, N maybe larger than or smaller than 4.
Accept
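To make the offset formula in these suggestions concrete, here is a small worked sketch in plain Python, so it runs without shapely or pyclipper. The area and perimeter are computed by hand (shapely's `Polygon.area` and `Polygon.length` return the same quantities); the function names and the default `max_shr=20` are assumptions for illustration.

```python
import math


def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths, including the closing edge."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def shrink_offset(points, rate, max_shr=20):
    """Offset from the formula in the diff; max_shr=20 is an assumed default."""
    area = polygon_area(points)
    peri = polygon_perimeter(points)
    return min(int(area * (1 - rate) / (peri + 0.001) + 0.5), max_shr)


box = [(0, 0), (100, 0), (100, 20), (0, 20)]  # 100 x 20 rectangle
print(shrink_offset(box, rate=0.4))  # 5
```

For the 100 x 20 rectangle, the area is 2000 and the perimeter is 240, so the offset is int(2000 * 0.6 / 240.001 + 0.5) = 5 pixels; the `max_shr` cap only applies to much larger boxes.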
def __call__(self, data):
    # img = deepcopy(data)
    img = deepcopy(data['image'])
img = deepcopy(data['image'])
There's no need for a deep copy; cv2.resize() does not modify the original array.
Changed already
self.reduce_conv_c2 = _conv(in_channels[0], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c2 = _bn(out_channels)
self.reduce_relu_c2 = nn.ReLU()

self.reduce_conv_c3 = _conv(in_channels[1], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c3 = _bn(out_channels)
self.reduce_relu_c3 = nn.ReLU()

self.reduce_conv_c4 = _conv(in_channels[2], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c4 = _bn(out_channels)
self.reduce_relu_c4 = nn.ReLU()

self.reduce_conv_c5 = _conv(in_channels[3], out_channels, kernel_size=1, has_bias=True)
self.reduce_bn_c5 = _bn(out_channels)
self.reduce_relu_c5 = nn.ReLU()
self.reduce = nn.CellList(
    [nn.SequentialCell([_conv(channels, out_channels, kernel_size=1, has_bias=True),
                        _bn(out_channels), nn.ReLU()])
     for channels in in_channels[::-1]]  # reverse order
)
self.smooth_conv_p4 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p4 = _bn(out_channels)
self.smooth_relu_p4 = nn.ReLU()

self.smooth_conv_p3 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p3 = _bn(out_channels)
self.smooth_relu_p3 = nn.ReLU()

self.smooth_conv_p2 = _conv(out_channels, out_channels, kernel_size=3, has_bias=True)
self.smooth_bn_p2 = _bn(out_channels)
self.smooth_relu_p2 = nn.ReLU()
self.smooth = nn.CellList(
    [nn.SequentialCell([_conv(out_channels, out_channels, kernel_size=3, has_bias=True),
                        _bn(out_channels), nn.ReLU()])
     for _ in range(3)]
)
p4 = self._resize_bilinear(p4, scale_factor=4)
p5 = self._resize_bilinear(p5, scale_factor=8)

out = self.concat((p2, p3, p4, p5))
features = features[::-1]
out = [self.reduce[0](features[0])]
for reduce, smooth, feature in zip(self.reduce[1:], self.smooth, features[1:]):
    feature = reduce(feature)
    feature = self._resize_bilinear(out[-1], scale_factor=2) + feature
    out.append(smooth(feature))
# bilinear scale here with the `for` loop as well
out = self.concat(out[::-1])
I think this way is much cleaner and makes the network architecture easier to read and understand. However, it will require you to retrain or convert your pretrained weights.
Yes, it looks better and cleaner. I'll change to this in the future when I have time to retrain the model.
There's no need to retrain the model. For simplicity, you can just convert the weights (change the name suffix) and save them again.
Here's an example:
from mindspore import load_checkpoint, load_param_into_net, save_checkpoint
from mindocr.models.backbones.mindcv_models import MobileNetV3

original_ckpt_path = 'mobilenet_v3_pp_large_050_best_new2.ckpt'
network = MobileNetV3(arch='large', alpha=0.5, scale_last=False, bottleneck_params={'se_version': 'SqueezeExciteV2', 'always_expand': True})
params = load_checkpoint(original_ckpt_path)

# Map every old parameter name to its name in the new network layout.
new_params = {}
feat_id = 3
latest_fid = feat_id
stage_name = 'stages.stage0.0'
cache = {stage_name: feat_id}
for name, val in params.items():
    if 'stem_conv' in name:
        new_name = name.replace('stem_conv', 'features')
    elif name.startswith('classifier'):
        new_name = name
    elif name.startswith(stage_name):
        new_name = name.replace(stage_name, 'features.' + str(feat_id))
    else:
        # A new stage prefix: assign it the next feature index (or reuse a cached one).
        stage_name = name[:15]
        if stage_name not in cache:
            latest_fid += 1
            cache[stage_name] = latest_fid
        feat_id = cache[stage_name]
        new_name = name.replace(stage_name, 'features.' + str(feat_id))
    new_params[new_name] = val

# Load the renamed weights into the network and save them as a new checkpoint.
param_not_load, _ = load_param_into_net(network, new_params)
output_ckpt = ''  # set the output checkpoint path here
save_checkpoint(network, output_ckpt)
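Applied to the FPN refactor discussed in this thread, the renaming step might look like the sketch below. It is only a sketch under assumptions: the old parameter names are guessed from the attribute names in the diff (for example `reduce_conv_c5.weight`), the new names assume that a CellList of SequentialCell yields `reduce.<i>.<j>.<param>`, and the index mapping follows the reversed order in the suggested code (c5 becomes `reduce.0`, p4 becomes `smooth.0`). The real names should be checked against an actual checkpoint before use.

```python
import re

# Hypothetical old-style names: reduce_conv_c5.weight, smooth_bn_p3.gamma, ...
_PATTERN = re.compile(r"(reduce|smooth)_(conv|bn)_[cp](\d)")
_TOP_LEVEL = {"reduce": 5, "smooth": 4}   # pyramid level that maps to list index 0
_LAYER_INDEX = {"conv": 0, "bn": 1}       # position inside the SequentialCell


def rename_fpn_param(name):
    """Map one old parameter name to the CellList/SequentialCell layout."""
    def _sub(match):
        branch, layer, level = match.group(1), match.group(2), int(match.group(3))
        return f"{branch}.{_TOP_LEVEL[branch] - level}.{_LAYER_INDEX[layer]}"
    return _PATTERN.sub(_sub, name)


def convert_fpn_params(params):
    """Return a new dict with renamed keys; values are left untouched."""
    return {rename_fpn_param(name): value for name, value in params.items()}
```

The converted dictionary would then be passed to `load_param_into_net` and saved again, as in the MobileNetV3 example above; names that do not match the pattern pass through unchanged.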
'''
batch_size = model_predict.shape[0]
model_predict = self.upsample(model_predict, scale_factor=4)
texts = self.slice(model_predict, (0, 0, 0, 0), (batch_size, 1, 640, 640))
Do not use fixed integers for shapes (e.g. 640x640). What if somebody wants to train on 512x512? Training will simply crash.
Ok, working on this issue
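One shape-agnostic way to take the same slice is plain indexing, which never mentions the spatial size. The sketch below uses NumPy as a stand-in for the framework tensor; the function name is hypothetical, and treating the first channel as the text map and the rest as kernel maps is an assumption based on the `texts` slice in the diff.

```python
import numpy as np


def split_text_and_kernels(model_predict):
    """Split (N, C, H, W) predictions into the text map and kernel maps.

    NumPy stands in for the framework tensor here; H and W are never
    written out, so any input resolution works.
    """
    texts = model_predict[:, 0:1, :, :]    # first channel, channel axis kept
    kernels = model_predict[:, 1:, :, :]   # remaining channels
    return texts, kernels
```

If an explicit size is still needed elsewhere in the loss, it can be read from the input's shape at call time rather than hard-coded.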
pad_mode=pad_mode, weight_init=init_value, has_bias=has_bias)


def _bn(channels, momentum=0.1):
The _conv and _bn functions can be moved inside PSEFPN, since they are PSENet-specific and can't be reused.
Add the PSENet detection model. The best F-score is 0.82, compared with 0.80 for modelzoo and 0.82 for PaddleOCR.