Data pipeline refactoring #300

Closed
2 changes: 1 addition & 1 deletion .flake8
@@ -21,7 +21,7 @@ per-file-ignores =
tools/infer/text/predict_rec.py:E402
tools/dataset_converters/convert.py:F401,F403
deploy/models_utils/auto_scaling/converter.py:E402
mindocr/data/transforms/transforms_factory.py:F401,F403
mindocr/data/transforms/transforms_factory.py:F401,F403,F405
tests/*:E402
tests/ut/dataset_convert/test_dataset_converter.py:E402
tests/ut/test_datasets.py:F401,E402
49 changes: 19 additions & 30 deletions configs/det/dbnet/db_mobilenetv3_icdar15.yaml
@@ -32,7 +32,7 @@ model:

postprocess:
name: DBPostprocess
box_type: quad # whether to output a polygon or a box
box_type: quad # whether to output a polygon or a box
binary_thresh: 0.3 # binarization threshold
box_thresh: 0.6 # box score threshold
max_candidates: 1000
@@ -79,13 +79,8 @@ train:
label_file: ic15/det/train/det_gt.txt
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- RandomHorizontalFlip:
p: 0.5
- RandomRotate:
@@ -108,15 +103,15 @@ train:
shrink_ratio: 0.4
thresh_min: 0.3
thresh_max: 0.7
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- Normalize:
mean: &mean [ 123.675, 116.28, 103.53 ]
std: &std [ 58.395, 57.12, 57.375 ]
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask']
# output_columns: ['image'] # for debug op performance
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ]
net_input_column_index: [0] # input indices for network forward func in output_columns
label_column_index: [1, 2, 3, 4] # input indices marked as label

@@ -136,22 +131,16 @@ eval:
label_file: ic15/det/test/det_gt.txt
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
- GridResize:
factor: 32
# GridResize already sets the evaluation size to [ 736, 1280 ].
# Uncomment ScalePadImage block for other resolutions.
# - ScalePadImage:
# target_size: [ 736, 1280 ] # h, w
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- DetResize:
target_size: [ 736, 1280 ]
keep_ratio: False
force_divisible: True # GridResize 32
- Normalize:
mean: *mean
std: *std
- HWC2CHW:
# the order of the dataloader list, matching the network input and the labels for evaluation
output_columns: [ 'image', 'polys', 'ignore_tags', 'shape_list' ]
net_input_column_index: [0] # input indices for network forward func in output_columns
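Review note: the hunks above replace `mean: imagenet` / `std: imagenet` with explicit per-channel numbers. The sketch below is a quick sanity check on those numbers. It assumes the `imagenet` keyword previously stood for the commonly used ImageNet statistics on a 0 to 1 scale, multiplied by 255; that mapping is an assumption and is not shown anywhere in this diff. The helper names are hypothetical.

```python
import math

# Commonly used ImageNet channel statistics for RGB images scaled to [0, 1].
# Assumption: this is what the old `imagenet` keyword resolved to.
IMAGENET_MEAN_01 = [0.485, 0.456, 0.406]
IMAGENET_STD_01 = [0.229, 0.224, 0.225]

# Explicit values taken from the new `Normalize` entries in this diff.
CONFIG_MEAN = [123.675, 116.28, 103.53]
CONFIG_STD = [58.395, 57.12, 57.375]


def scale_to_255(values):
    """Rescale statistics from the [0, 1] range to the [0, 255] range."""
    return [v * 255.0 for v in values]


def all_close(a, b, tol=1e-6):
    """Element-wise comparison that tolerates floating-point rounding."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


mean_matches = all_close(scale_to_255(IMAGENET_MEAN_01), CONFIG_MEAN)
std_matches = all_close(scale_to_255(IMAGENET_STD_01), CONFIG_STD)
print(mean_matches, std_matches)
```

If both checks pass, the explicit lists are numerically equivalent to the usual ImageNet statistics for uint8-range images, so the change in notation alone should not alter normalization results under that assumption.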
49 changes: 21 additions & 28 deletions configs/det/dbnet/db_r18_icdar15.yaml
@@ -26,7 +26,7 @@ model:

postprocess:
name: DBPostprocess
box_type: quad # whether to output a polygon or a box
box_type: quad # whether to output a polygon or a box
binary_thresh: 0.3 # binarization threshold
box_thresh: 0.6 # box score threshold
max_candidates: 1000
@@ -68,18 +68,14 @@ train:
dataset_sink_mode: True
dataset:
type: DetDataset
mindrecord: True
dataset_root: /data/ocr_datasets
data_dir: ic15/det/train/ch4_training_images
label_file: ic15/det/train/det_gt.txt
data_dir: ic15/det/MR/train
label_file: ''
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- RandomHorizontalFlip:
p: 0.5
- RandomRotate:
@@ -102,15 +98,15 @@ train:
shrink_ratio: 0.4
thresh_min: 0.3
thresh_max: 0.7
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- Normalize:
mean: &mean [ 123.675, 116.28, 103.53 ]
std: &std [ 58.395, 57.12, 57.375 ]
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ] #'img_path']
# output_columns: ['image'] # for debug op performance
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ]
net_input_column_index: [0] # input indices for network forward func in output_columns
label_column_index: [1, 2, 3, 4] # input indices marked as label

@@ -125,24 +121,21 @@ eval:
dataset_sink_mode: False
dataset:
type: DetDataset
mindrecord: True
dataset_root: /data/ocr_datasets
data_dir: ic15/det/test/ch4_test_images
label_file: ic15/det/test/det_gt.txt
data_dir: ic15/det/MR/test
label_file: ''
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
- DetResize:
target_size: [ 736, 1280 ]
keep_ratio: False
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- Normalize:
mean: *mean
std: *std
- HWC2CHW:
# the order of the dataloader list, matching the network input and the labels for evaluation
output_columns: [ 'image', 'polys', 'ignore_tags', 'shape_list']
net_input_column_index: [0] # input indices for network forward func in output_columns
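Review note: the config comments describe `net_input_column_index` and `label_column_index` as indices into `output_columns`. A minimal sketch of that relationship follows. The function `split_columns` and the placeholder batch are hypothetical and only illustrate the indexing; they are not the project's loader code.

```python
def split_columns(batch, net_input_column_index, label_column_index):
    """Pick network inputs and labels out of one batch by column position."""
    inputs = tuple(batch[i] for i in net_input_column_index)
    labels = tuple(batch[i] for i in label_column_index)
    return inputs, labels


# Column order copied from the train section of the DBNet configs in this diff.
output_columns = ['image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask']

# Placeholder strings stand in for the real arrays of one batch.
batch = tuple(f"<{name}>" for name in output_columns)

inputs, labels = split_columns(
    batch,
    net_input_column_index=[0],
    label_column_index=[1, 2, 3, 4],
)
print(inputs)
print(labels)
```

Because the indices are positional, reordering `output_columns` without updating the two index lists would silently feed the wrong columns to the network or the loss.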
57 changes: 22 additions & 35 deletions configs/det/dbnet/db_r50_icdar15.yaml
@@ -25,7 +25,7 @@ model:

postprocess:
name: DBPostprocess
box_type: quad # whether to output a polygon or a box
box_type: quad # whether to output a polygon or a box
binary_thresh: 0.3 # binarization threshold
box_thresh: 0.6 # box score threshold
max_candidates: 1000
@@ -67,18 +67,14 @@ train:
dataset_sink_mode: True
dataset:
type: DetDataset
mindrecord: True
dataset_root: /data/ocr_datasets
data_dir: ic15/det/train/ch4_training_images
label_file: ic15/det/train/det_gt.txt
data_dir: ic15/det/MR/train
label_file: ''
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- RandomHorizontalFlip:
p: 0.5
- RandomRotate:
@@ -101,15 +97,15 @@ train:
shrink_ratio: 0.4
thresh_min: 0.3
thresh_max: 0.7
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- Normalize:
mean: &mean [ 123.675, 116.28, 103.53 ]
std: &std [ 58.395, 57.12, 57.375 ]
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ] #'img_path']
# output_columns: ['image'] # for debug op performance
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ]
net_input_column_index: [0] # input indices for network forward func in output_columns
label_column_index: [1, 2, 3, 4] # input indices marked as label

@@ -124,31 +120,22 @@ eval:
dataset_sink_mode: False
dataset:
type: DetDataset
mindrecord: True
dataset_root: /data/ocr_datasets
data_dir: ic15/det/test/ch4_test_images
label_file: ic15/det/test/det_gt.txt
data_dir: ic15/det/MR/test
label_file: ''
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- Decode:
- DetLabelEncode:
#- GridResize:
# factor: 32
# GridResize already sets the evaluation size to [ 736, 1280 ].
# Uncomment ScalePadImage block for other resolutions.
# - ScalePadImage:
# target_size: [ 736, 1280 ] # h, w
- DetResize:
target_size: [ 736, 1280 ]
keep_ratio: False
force_divisable: True # GridResize 32
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
force_divisible: True # GridResize 32
- Normalize:
mean: *mean
std: *std
- HWC2CHW:
# the order of the dataloader list, matching the network input and the labels for evaluation
output_columns: [ 'image', 'polys', 'ignore_tags', 'shape_list' ]
net_input_column_index: [0] # input indices for network forward func in output_columns
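Review note: the eval hunk above sets `force_divisible: True` with the comment `# GridResize 32`. The sketch below shows one way such a flag could behave, assuming it rounds each side up to the next multiple of 32. The rounding direction, the divisor default, and both function names are assumptions for illustration; the actual `DetResize` implementation is not part of this diff.

```python
def round_up_to_multiple(size, divisor=32):
    """Round a positive integer up to the nearest multiple of `divisor`."""
    return -(-size // divisor) * divisor  # ceiling division without floats


def resize_target(target_size, force_divisible=True, divisor=32):
    """Return the [height, width] a resize step would aim for (hypothetical)."""
    height, width = target_size
    if force_divisible:
        height = round_up_to_multiple(height, divisor)
        width = round_up_to_multiple(width, divisor)
    return [height, width]


# [736, 1280] is already a multiple of 32 on both sides, so it is unchanged.
dbnet_size = resize_target([736, 1280], force_divisible=True)

# [720, 1280] with the flag off is left as-is; with the flag on, 720 becomes 736.
east_size_off = resize_target([720, 1280], force_divisible=False)
east_size_on = resize_target([720, 1280], force_divisible=True)
print(dbnet_size, east_size_off, east_size_on)
```

Under these assumptions the flag is a no-op for the DBNet target size of `[ 736, 1280 ]`, since 736 = 23 x 32 and 1280 = 40 x 32.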
20 changes: 8 additions & 12 deletions configs/det/dbnet/db_r50_synthtext.yaml
@@ -60,12 +60,7 @@ train:
label_file: SynthText/gt_processed.mat
sample_ratio: 1.0
transform_pipeline:
- DecodeImage:
img_mode: RGB
to_float32: False
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- Decode:
- RandomHorizontalFlip:
p: 0.5
- RandomRotate:
@@ -88,12 +83,13 @@ train:
shrink_ratio: 0.4
thresh_min: 0.3
thresh_max: 0.7
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
mean: imagenet
std: imagenet
- ToCHWImage:
- RandomColorAdjust:
brightness: 0.1255 # 32.0 / 255
saturation: 0.5
- Normalize:
mean: [ 123.675, 116.28, 103.53 ]
std: [ 58.395, 57.12, 57.375 ]
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visualize
output_columns: [ 'image', 'binary_map', 'mask', 'thresh_map', 'thresh_mask' ] #'img_path']
# output_columns: ['image'] # for debug op performance
2 changes: 1 addition & 1 deletion configs/det/east/east_r50_icdar15.yaml
@@ -111,7 +111,7 @@ eval:
- DetLabelEncode:
- DetResize:
target_size: [720, 1280]
force_divisable: False
force_divisible: False
- NormalizeImage:
bgr_to_rgb: False
is_hwc: True
4 changes: 2 additions & 2 deletions configs/rec/crnn/crnn_icdar15.yaml
@@ -106,7 +106,7 @@ train:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
net_input_column_index: [0] # input indices for network forward func in output_columns
@@ -147,7 +147,7 @@ eval:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_padded', 'text_length'] # TODO return text string padding w/ fixed length, and a scaler to indicate the length
net_input_column_index: [0] # input indices for network forward func in output_columns
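Review note: this file renames `ToCHWImage` to `HWC2CHW`. As the new name suggests, the step reorders an image from height-width-channel layout to channel-height-width layout. A dependency-free sketch of that reordering on nested lists is below; `hwc_to_chw` is a hypothetical helper, and the real transform presumably operates on arrays rather than lists.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [H][W][C] to [C][H][W]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A tiny image with H=1, W=2, C=3: two pixels, each with three channel values.
image_hwc = [
    [[1, 2, 3], [4, 5, 6]],
]

image_chw = hwc_to_chw(image_hwc)
print(image_chw)  # [[[1, 4]], [[2, 5]], [[3, 6]]]
```

Each output plane holds one channel for every pixel, which is the layout the network forward function expects according to the step's position at the end of the pipeline.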
4 changes: 2 additions & 2 deletions configs/rec/crnn/crnn_resnet34.yaml
@@ -97,7 +97,7 @@ train:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
net_input_column_index: [0] # input indices for network forward func in output_columns
@@ -140,7 +140,7 @@ eval:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_padded', 'text_length'] # TODO return text string padding w/ fixed length, and a scaler to indicate the length
net_input_column_index: [0] # input indices for network forward func in output_columns
4 changes: 2 additions & 2 deletions configs/rec/crnn/crnn_vgg7.yaml
@@ -98,7 +98,7 @@ train:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_seq'] #, 'length'] #'img_path']
net_input_column_index: [0] # input indices for network forward func in output_columns
@@ -141,7 +141,7 @@ eval:
is_hwc: True
mean : [127.0, 127.0, 127.0]
std : [127.0, 127.0, 127.0]
- ToCHWImage:
- HWC2CHW:
# the order of the dataloader list, matching the network input and the input labels for the loss function, and optional data for debug/visaulize
output_columns: ['image', 'text_padded', 'text_length'] # TODO return text string padding w/ fixed length, and a scaler to indicate the length
net_input_column_index: [0] # input indices for network forward func in output_columns