[Docs] Translate data transform docs.
mzr1996 committed Nov 17, 2022
1 parent e98d262 commit 59f7414
Showing 7 changed files with 179 additions and 26 deletions.
156 changes: 156 additions & 0 deletions docs/en/advanced_tutorials/data_transform.md
@@ -0,0 +1,156 @@
# Data transform

In the OpenMMLab repositories, dataset construction and data preparation are decoupled from each other.
Usually, the dataset construction only parses the dataset and records the basic information of each sample,
while the data preparation is performed by a series of data transforms, such as data loading, preprocessing,
and formatting based on the basic information of the samples.

## To use Data Transforms

In MMEngine, we use various callable data transform classes to manipulate data. These classes accept several
configuration parameters at instantiation and then process the input data dictionary when called. All data
transforms accept a dictionary as input and output the processed data as a dictionary. A simple example is as
follows:

```{note}
MMEngine does not implement the data transforms itself. You can find the base data transform class
and many other data transforms in MMCV, so you need to install MMCV before following this tutorial; see the
{external+mmcv:doc}`MMCV installation guide <get_started/installation>`.
```

```python
>>> import numpy as np
>>> from mmcv.transforms import Resize
>>>
>>> transform = Resize(scale=(224, 224))
>>> data_dict = {'img': np.random.rand(256, 256, 3)}
>>> data_dict = transform(data_dict)
>>> print(data_dict['img'].shape)
(224, 224, 3)
```

## To use in Config Files

In config files, we can compose multiple data transforms as a list, called a data pipeline. The data
pipeline is then passed as an argument to the dataset.
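
To make the idea of a pipeline concrete, here is a minimal pure-Python sketch of an ordered list of callables applied to a data dictionary. `AddOne`, `Double`, and `run_pipeline` are hypothetical names used only for illustration; they are not part of MMEngine or MMCV, which provide their own composition utilities.

```python
from typing import Callable, Dict, List

Transform = Callable[[Dict], Dict]


class AddOne:
    """Hypothetical transform: adds 1 to the value stored under `key`."""

    def __init__(self, key: str):
        self.key = key

    def __call__(self, results: Dict) -> Dict:
        results[self.key] = results[self.key] + 1
        return results


class Double:
    """Hypothetical transform: doubles the value stored under `key`."""

    def __init__(self, key: str):
        self.key = key

    def __call__(self, results: Dict) -> Dict:
        results[self.key] = results[self.key] * 2
        return results


def run_pipeline(pipeline: List[Transform], results: Dict) -> Dict:
    # Apply the transforms in order; each one receives the previous output.
    for transform in pipeline:
        results = transform(results)
    return results


pipeline = [AddOne('value'), Double('value')]
output = run_pipeline(pipeline, {'value': 3})
print(output)  # {'value': 8}
```

The order of the list matters: swapping the two transforms would give `{'value': 7}` instead.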

Usually, a data pipeline consists of the following parts:

1. Data loading: use [`LoadImageFromFile`](mmcv.transforms.LoadImageFromFile) to load image files.
2. Label loading: use [`LoadAnnotations`](mmcv.transforms.LoadAnnotations) to load the bboxes, semantic segmentation and keypoint annotations.
3. Data processing and augmentation, such as [`RandomResize`](mmcv.transforms.RandomResize).
4. Data formatting: different tasks use different data transforms, and the data transform for a specific
   task is implemented in the corresponding repository. For example, the data formatting transform for the image
   classification task is `PackClsInputs`, which is in MMClassification.

Taking the classification task as an example, the figure below shows a typical data pipeline. For
each sample, the basic information stored in the dataset is a dictionary, shown on the far left of the
figure. Every blue block after it represents a data transform, and each data transform adds new fields (marked in green) or updates existing fields (marked in orange) in the data dictionary.

<div align=center>
<img src="https://user-images.githubusercontent.com/26739999/187157681-ac4dcac8-3543-4bfe-ab30-9aa9e56d4900.jpg" width="90%"/>
</div>
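
The distinction between added and updated fields can be sketched in plain Python. The helper `apply_and_diff` and the stand-in transforms `fake_load` and `fake_crop` are hypothetical, and the field names (`img_path`, `gt_label`, `img`, `img_shape`) are illustrative assumptions rather than a guaranteed match for the figure.

```python
from typing import Callable, Dict, Set, Tuple


def apply_and_diff(transform: Callable[[Dict], Dict],
                   results: Dict) -> Tuple[Dict, Set[str], Set[str]]:
    """Run one transform and report which fields were added or updated."""
    before = dict(results)            # shallow copy of the input dictionary
    after = transform(dict(results))  # run the transform on another copy
    added = set(after) - set(before)
    updated = {k for k in before if k in after and after[k] != before[k]}
    return after, added, updated


def fake_load(results: Dict) -> Dict:
    # Hypothetical loader: adds the image and its shape.
    results['img'] = [[0, 0], [0, 0]]
    results['img_shape'] = (2, 2)
    return results


def fake_crop(results: Dict) -> Dict:
    # Hypothetical crop: overwrites the image and its shape.
    results['img'] = [[0]]
    results['img_shape'] = (1, 1)
    return results


sample = {'img_path': 'a.jpg', 'gt_label': 1}
sample, added, updated = apply_and_diff(fake_load, sample)
print(sorted(added), sorted(updated))    # ['img', 'img_shape'] []
sample, added2, updated2 = apply_and_diff(fake_crop, sample)
print(sorted(added2), sorted(updated2))  # [] ['img', 'img_shape']
```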

To use the above data pipeline in a config file, use the settings below:

```python
test_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='ImageNet',
        data_root='data/imagenet',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='Resize', scale=256, keep_ratio=True),
            dict(type='CenterCrop', crop_size=224),
            dict(type='PackClsInputs'),
        ]
    )
)
```
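
To illustrate how a list of `dict(type=...)` entries can become transform objects, here is a minimal registry sketch in plain Python. `TRANSFORM_REGISTRY`, `register`, `build`, and `Scale` are hypothetical names for this illustration; MMEngine's actual registry is more elaborate, and this is not its implementation.

```python
from typing import Dict

# Hypothetical registry: maps a class name to the class itself.
TRANSFORM_REGISTRY: Dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator that records the class under its own name."""
    TRANSFORM_REGISTRY[cls.__name__] = cls
    return cls


def build(cfg: dict):
    """Instantiate the class named by `type`, passing the other keys as kwargs."""
    args = dict(cfg)  # copy so that the config itself is not modified
    cls = TRANSFORM_REGISTRY[args.pop('type')]
    return cls(**args)


@register
class Scale:
    """Hypothetical transform: multiplies `value` by a constant factor."""

    def __init__(self, factor: float):
        self.factor = factor

    def __call__(self, results: dict) -> dict:
        results['value'] = results['value'] * self.factor
        return results


pipeline_cfg = [dict(type='Scale', factor=2), dict(type='Scale', factor=5)]
pipeline = [build(cfg) for cfg in pipeline_cfg]

results = {'value': 1}
for transform in pipeline:
    results = transform(results)
print(results)  # {'value': 10}
```

Keeping the pipeline as plain dictionaries means the config file stays declarative, and the classes are only looked up when the dataset is built.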

## Common Data Transforms

According to their functionality, the data transform classes can be divided into data loading, data
pre-processing & augmentation, and data formatting.

### Data Loading

To support loading large-scale datasets, we usually do not load all dense data during dataset construction, but
only record the file paths of these data. Therefore, we need to load the data in the data pipeline.

| Data Transforms | Functionality |
| :------------------------------------------------------: | :-----------------------------------------------------------------------------------: |
| [`LoadImageFromFile`](mmcv.transforms.LoadImageFromFile) | Load images according to the path. |
| [`LoadAnnotations`](mmcv.transforms.LoadAnnotations) | Load and format annotations information, including bbox, segmentation map and others. |
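
The deferred loading described at the start of this subsection can be sketched with the standard library alone. `build_dataset` and `load_bytes` are hypothetical helpers, and the temporary file stands in for an image; a real pipeline would use `LoadImageFromFile` instead.

```python
import os
import tempfile


def build_dataset(paths):
    """Dataset construction: record only the path of each sample."""
    return [{'img_path': p} for p in paths]


def load_bytes(results: dict) -> dict:
    """Hypothetical loading transform: read the file when the sample is used."""
    with open(results['img_path'], 'rb') as f:
        results['raw'] = f.read()
    return results


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.bin')
    with open(path, 'wb') as f:
        f.write(b'\x00\x01\x02')

    dataset = build_dataset([path])
    # At this point the dataset holds only the path, not the file contents.
    loaded = load_bytes(dict(dataset[0]))

print(len(loaded['raw']))  # 3
```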

### Data Pre-processing & Augmentation

Data transforms for pre-processing and augmentation usually manipulate the image and annotation data, like
cropping, padding, resizing and others.

| Data Transforms | Functionality |
| :--------------------------------------------------------: | :------------------------------------------------------------: |
| [`Pad`](mmcv.transforms.Pad) | Pad the margin of images. |
| [`CenterCrop`](mmcv.transforms.CenterCrop) | Crop the image and keep the center part. |
| [`Normalize`](mmcv.transforms.Normalize) | Normalize the image pixels. |
| [`Resize`](mmcv.transforms.Resize) | Resize images to the specified scale or ratio. |
| [`RandomResize`](mmcv.transforms.RandomResize) | Resize images to a random scale in the specified range. |
| [`RandomChoiceResize`](mmcv.transforms.RandomChoiceResize) | Resize images to a random scale from several specified scales. |
| [`RandomGrayscale`](mmcv.transforms.RandomGrayscale) | Randomly grayscale images. |
| [`RandomFlip`](mmcv.transforms.RandomFlip) | Randomly flip images. |
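
As a small worked example of what such a transform computes, the sketch below center-crops an image stored as nested lists. It only shows the offset arithmetic; `center_crop` is a hypothetical function, and MMCV's `CenterCrop` may handle rounding and oversized crops differently.

```python
from typing import List


def center_crop(img: List[List[int]], crop_h: int, crop_w: int) -> List[List[int]]:
    """Keep the central crop_h x crop_w window of a 2-D image."""
    h, w = len(img), len(img[0])
    if crop_h > h or crop_w > w:
        raise ValueError('crop size must not exceed the image size')
    top = (h - crop_h) // 2    # rows skipped above the window
    left = (w - crop_w) // 2   # columns skipped left of the window
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


# A 4x4 image whose pixel values are 0..15 in row-major order.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
print(center_crop(img, 2, 2))  # [[5, 6], [9, 10]]
```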

### Data Formatting

Data formatting transforms convert the data to a specified type.

| Data Transforms | Functionality |
| :----------------------------------------------: | :---------------------------------------------------: |
| [`ToTensor`](mmcv.transforms.ToTensor) | Convert the data of specified field to `torch.Tensor` |
| [`ImageToTensor`](mmcv.transforms.ImageToTensor) | Convert images to `torch.Tensor` in PyTorch format. |
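
Assuming that "PyTorch format" here refers to the channel-first (C, H, W) layout, the layout change can be sketched without any dependencies. `hwc_to_chw` is a hypothetical helper working on nested lists; it is not how `ImageToTensor` is implemented.

```python
def hwc_to_chw(img):
    """Reorder a nested-list image from (H, W, C) to (C, H, W)."""
    height, width, channels = len(img), len(img[0]), len(img[0][0])
    return [[[img[y][x][c] for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


# One row, two pixels, three channels: shape (H=1, W=2, C=3).
img = [[[1, 2, 3], [4, 5, 6]]]
print(hwc_to_chw(img))  # [[[1, 4]], [[2, 5]], [[3, 6]]]
```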

## Custom Data Transform Classes

To implement a new data transform class, the class needs to inherit from `BaseTransform` and implement the
`transform` method. Here, we use a simple flip transform (`MyFlip`) as an example:

```python
import random
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS

@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```

Then, we can instantiate a `MyFlip` object and use it to process our data dictionary.

```python
import numpy as np

transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```
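
For readers without MMCV installed, the same pattern can be sketched in plain Python. `SketchBaseTransform` and `ListFlip` are hypothetical names; the sketch assumes that calling a transform simply forwards to its `transform` method, and it flips an image stored as nested lists instead of using `mmcv.imflip`.

```python
from abc import ABC, abstractmethod


class SketchBaseTransform(ABC):
    """Hypothetical base class: calling the object runs `transform`."""

    def __call__(self, results: dict) -> dict:
        return self.transform(results)

    @abstractmethod
    def transform(self, results: dict) -> dict:
        """Process the data dictionary and return it."""


class ListFlip(SketchBaseTransform):
    """Flip a 2-D image stored as nested lists."""

    def __init__(self, direction: str = 'horizontal'):
        if direction not in ('horizontal', 'vertical'):
            raise ValueError(f'unsupported direction: {direction}')
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        if self.direction == 'horizontal':
            results['img'] = [row[::-1] for row in img]  # reverse each row
        else:
            results['img'] = img[::-1]                   # reverse row order
        return results


out = ListFlip('horizontal')({'img': [[1, 2], [3, 4]]})
print(out['img'])  # [[2, 1], [4, 3]]
```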

Or, use it in the data pipeline by modifying our config file:

```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```

Please note that to use the class in a config file, we need to make sure that the `MyFlip` class is imported
at runtime.
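
One way to do this in OpenMMLab-style configs is the `custom_imports` field, which imports the listed modules when the config is loaded. The fragment below is a sketch under the assumption that `MyFlip` lives in a module named `my_project.transforms`; that module path is hypothetical and should be replaced with your own.

```python
# Assumption: `my_project.transforms` is a hypothetical module that defines
# and registers `MyFlip`; replace it with the module path in your project.
custom_imports = dict(
    imports=['my_project.transforms'],
    allow_failed_imports=False,
)

pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='MyFlip', direction='horizontal'),
]
```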
2 changes: 1 addition & 1 deletion docs/en/conf.py
@@ -58,7 +58,7 @@
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable', None),
'torch': ('https://pytorch.org/docs/stable/', None),
- 'mmcv': ('https://mmcv.readthedocs.io/en/dev-2.x/', None),
+ 'mmcv': ('https://mmcv.readthedocs.io/en/2.x/', None),
}

# Add any paths that contain templates here, relative to this directory.
2 changes: 1 addition & 1 deletion docs/en/index.rst
@@ -22,13 +22,13 @@ You can switch between Chinese and English documents in the lower-left corner of
tutorials/evaluation.md
tutorials/optim_wrapper.md
tutorials/param_scheduler.md
- tutorials/data_transform.md

.. toctree::
:maxdepth: 1
:caption: Advanced tutorials

advanced_tutorials/basedataset.md
+ advanced_tutorials/data_transform.md
advanced_tutorials/data_element.md
advanced_tutorials/visualization.md
advanced_tutorials/initialize.md
3 changes: 0 additions & 3 deletions docs/en/tutorials/data_transform.md

This file was deleted.

@@ -7,7 +7,7 @@
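In MMEngine, we use various callable data transform classes to manipulate data. These classes accept several configuration parameters at instantiation and then process the input data dictionary when called. By convention, all data transforms accept a dictionary as input and output the processed data as a dictionary. A simple example is as follows: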

```{note}
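- MMEngine only defines the conventions for data transform classes; the base class and the commonly used data transforms are implemented in MMCV. You therefore need to install MMCV before following this tutorial; see the MMCV [installation guide](https://mmcv.readthedocs.io/en/2.x/get_started/installation.html)
+ MMEngine only defines the conventions for data transform classes; the base class and the commonly used data transforms are implemented in MMCV. You therefore need to install MMCV before following this tutorial; see the {external+mmcv:doc}`MMCV installation guide <get_started/installation>`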
```

```python
@@ -62,34 +62,34 @@ test_dataloader = dict(

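To support loading large-scale datasets, the data itself is usually not loaded when the dataset is initialized; only the corresponding paths are recorded. The actual data therefore has to be loaded in the data pipeline.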

- | Data Transforms | Functionality |
- | :---------------------------------------------------------------------: | :---------------------------------------: |
- | [`LoadImageFromFile`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Load images according to the path. |
- | [`LoadAnnotations`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Load and organize annotation information, such as bboxes and semantic segmentation maps. |
+ | Data Transforms | Functionality |
+ | :------------------------------------------------------: | :---------------------------------------: |
+ | [`LoadImageFromFile`](mmcv.transforms.LoadImageFromFile) | Load images according to the path. |
+ | [`LoadAnnotations`](mmcv.transforms.LoadAnnotations) | Load and organize annotation information, such as bboxes and semantic segmentation maps. |

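### Data Pre-processing & Augmentation

Data pre-processing and augmentation usually transform the image itself, for example by cropping, padding, and resizing.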

- | Data Transforms | Functionality |
- | :----------------------------------------------------------------------: | :--------------------------------: |
- | [`Pad`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Pad the margin of images. |
- | [`CenterCrop`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Crop the image and keep the center part. |
- | [`Normalize`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Normalize the image. |
- | [`Resize`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Resize images to the specified size or ratio. |
- | [`RandomResize`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Resize images to a random size in the specified range. |
- | [`RandomChoiceResize`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Resize images to a size chosen at random from several specified sizes. |
- | [`RandomGrayscale`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Randomly convert images to grayscale. |
- | [`RandomFlip`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Randomly flip images. |
+ | Data Transforms | Functionality |
+ | :--------------------------------------------------------: | :--------------------------------: |
+ | [`Pad`](mmcv.transforms.Pad) | Pad the margin of images. |
+ | [`CenterCrop`](mmcv.transforms.CenterCrop) | Crop the image and keep the center part. |
+ | [`Normalize`](mmcv.transforms.Normalize) | Normalize the image. |
+ | [`Resize`](mmcv.transforms.Resize) | Resize images to the specified size or ratio. |
+ | [`RandomResize`](mmcv.transforms.RandomResize) | Resize images to a random size in the specified range. |
+ | [`RandomChoiceResize`](mmcv.transforms.RandomChoiceResize) | Resize images to a size chosen at random from several specified sizes. |
+ | [`RandomGrayscale`](mmcv.transforms.RandomGrayscale) | Randomly convert images to grayscale. |
+ | [`RandomFlip`](mmcv.transforms.RandomFlip) | Randomly flip images. |

### Data Formatting

Data formatting operations usually convert the data type.

- | Data Transforms | Functionality |
- | :-----------------------------------------------------------------: | :-------------------------------: |
- | [`ToTensor`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Convert the specified data to `torch.Tensor`. |
- | [`ImageToTensor`](https://mmcv.readthedocs.io/en/2.x/api.html#TODO) | Convert images to `torch.Tensor`. |
+ | Data Transforms | Functionality |
+ | :----------------------------------------------: | :-------------------------------: |
+ | [`ToTensor`](mmcv.transforms.ToTensor) | Convert the specified data to `torch.Tensor`. |
+ | [`ImageToTensor`](mmcv.transforms.ImageToTensor) | Convert images to `torch.Tensor`. |

## Custom Data Transform Classes

2 changes: 1 addition & 1 deletion docs/zh_cn/conf.py
@@ -63,7 +63,7 @@
'python': ('https://docs.python.org/3', None),
'numpy': ('https://numpy.org/doc/stable', None),
'torch': ('https://pytorch.org/docs/stable/', None),
- 'mmcv': ('https://mmcv.readthedocs.io/en/2.x/', None),
+ 'mmcv': ('https://mmcv.readthedocs.io/zh_CN/2.x/', None),
}

# Add any paths that contain templates here, relative to this directory.
2 changes: 1 addition & 1 deletion docs/zh_cn/index.rst
@@ -22,13 +22,13 @@
tutorials/evaluation.md
tutorials/optim_wrapper.md
tutorials/param_scheduler.md
- tutorials/data_transform.md

.. toctree::
:maxdepth: 1
:caption: Advanced tutorials

advanced_tutorials/basedataset.md
+ advanced_tutorials/data_transform.md
advanced_tutorials/data_element.md
advanced_tutorials/visualization.md
advanced_tutorials/initialize.md