[Doc][Fix]Refine train, predict and evaluate docs for release 2.7 (#2780)
PaddlePaddle · Nov 29, 2022 · 0af99d5 · 0af99d5
1 parent 7c4f3e1
commit 0af99d5
Showing 37 changed files with 441 additions and 299 deletions.
diff --git a/README_CN.md b/README_CN.md
@@ -44,7 +44,7 @@
 
 ## <img src="./docs/images/feature.png" width="20"/> 特性
 
-* **高精度**：跟踪学术界的前沿分割技术，结合半监督标签知识蒸馏方案([SSLD](https://paddleclas.readthedocs.io/zh_CN/latest/advanced_tutorials/distillation/distillation.html#ssld))训练的骨干网络，提供40+主流分割网络、140+的高质量预训练模型，效果优于其他开源实现。
+* **高精度**：跟踪学术界的前沿分割技术，结合高精度训练的骨干网络，提供40+主流分割网络、140+的高质量预训练模型，效果优于其他开源实现。
 
 * **高性能**：使用多进程异步I/O、多卡并行训练、评估等加速策略，结合飞桨核心框架的显存优化功能，大幅度减少分割模型的训练开销，让开发者更低成本、更高效地完成图像分割训练。
 
@@ -367,15 +367,16 @@
 * [安装说明](./docs/install_cn.md)
 * [快速体验](./docs/quick_start_cn.md)
 * [20分钟快速上手PaddleSeg](./docs/whole_process_cn.md)
+* [模型库](./docs/model_zoo_overview_cn.md)
 
 **基础教程**
 
-* 准备数据
+* 准备数据集
    * [准备公开数据集](./docs/data/pre_data_cn.md)
    * [准备自定义数据集](./docs/data/marker/marker_cn.md)
    * [EISeg 数据标注](./EISeg)
-
-* [模型训练](/docs/train/train_cn.md)
+* [准备配置文件](./docs/config/pre_config_cn.md)
+* [模型训练](./docs/train/train_cn.md)
 * [模型评估](./docs/evaluation/evaluate_cn.md)
 * [模型预测](./docs/predict/predict_cn.md)
 
@@ -387,7 +388,7 @@
     * [Paddle Inference部署(Python)](./docs/deployment/inference/python_inference_cn.md)
     * [Paddle Inference部署(C++)](./docs/deployment/inference/cpp_inference_cn.md)
     * [Paddle Lite部署](./docs/deployment/lite/lite_cn.md)
-    * [Paddle Serving部署](./docs/deployment/serving/serving.md)
+    * [Paddle Serving部署](./docs/deployment/serving/serving_cn.md)
     * [Paddle JS部署](./docs/deployment/web/web_cn.md)
     * [推理Benchmark](./docs/deployment/inference/infer_benchmark_cn.md)
 

diff --git a/README_EN.md b/README_EN.md
@@ -49,7 +49,7 @@ PaddleSeg is an end-to-end high-efficent development toolkit for image segmentat
 
 ## <img src="./docs/images/feature.png" width="20"/> Features
 
-* **High-Performance Model**: Following the state of the art segmentation methods and use the high-performance backbone trained by semi-supervised label knowledge distillation scheme ([SSLD]((https://paddleclas.readthedocs.io/zh_CN/latest/advanced_tutorials/distillation/distillation.html#ssld))), we provide 40+ models and 140+ high-quality pre-training models, which are better than other open-source implementations.
+* **High-Performance Model**: Following state-of-the-art segmentation methods and using high-performance backbones, we provide 40+ models and 140+ high-quality pre-trained models, which outperform other open-source implementations.
 
 * **High Efficiency**: PaddleSeg provides multi-process asynchronous I/O, multi-card parallel training, evaluation, and other acceleration strategies, combined with the memory optimization function of the PaddlePaddle, which can greatly reduce the training overhead of the segmentation model, all this allowing developers to lower cost and more efficiently train image segmentation model.
 
@@ -369,24 +369,25 @@ Note that:
 
 * [Installation](./docs/install.md)
 * [Quick Start](./docs/quick_start.md)
-* [A 20 minutes Blitz to learn PaddleSeg](./docs/whole_process.md)
+* [A 20 minutes Blitz to Learn PaddleSeg](./docs/whole_process.md)
+* [Model Zoo](./docs/model_zoo_overview.md)
 
 **Basic Tutorials**
 
-*  Data Preparation
+* Data Preparation
     * [Prepare Public Dataset](./docs/data/pre_data.md)
     * [Prepare Customized Dataset](./docs/data/marker/marker.md)
     * [Label Data with EISeg](./EISeg)
-
+* [Config Preparation](./docs/config/pre_config.md)
 * [Model Training](/docs/train/train.md)
 * [Model Evaluation](./docs/evaluation/evaluate.md)
-* [Prediction](./docs/predict/predict.md)
+* [Model Prediction](./docs/predict/predict.md)
 
 * Model Export
     * [Export Inference Model](./docs/model_export.md)
     * [Export ONNX Model](./docs/model_export_onnx.md)
 
-*  Model Deploy
+* Model Deploy
     * [Paddle Inference (Python)](./docs/deployment/inference/python_inference.md)
     * [Paddle Inference (C++)](./docs/deployment/inference/cpp_inference.md)
     * [Paddle Lite](./docs/deployment/lite/lite.md)
@@ -414,7 +415,7 @@ Note that:
     * [Create Your Own Model](./docs/design/create/add_new_model.md)
 *  Pull Request
     * [PR Tutorial](./docs/pr/pr/pr.md)
-    * [PR Style](./docs/pr/pr/style_cn.md)
+    * [PR Style](./docs/pr/pr/style.md)
 
 ## <img src="./docs/images/anli.png" width="20"/> Special Features
   * [Interactive Segmentation](./EISeg)

diff --git a/docs/config/pre_config.md b/docs/config/pre_config.md
@@ -0,0 +1,99 @@
+English | [简体中文](pre_config_cn.md)
+# Config Preparation
+
+The config file describes the training dataset, validation dataset, optimizer, loss and model used by PaddleSeg.
+The config files of all SOTA models are saved in `PaddleSeg/configs`.
+You can modify these config files as needed and then train the model.
+
+The config file `PaddleSeg/configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml` is used as the example in this document.
+
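+## Detailed Explanation
+
+PaddleSeg uses the config file to build the dataset, optimizer, model, etc., and then runs model training, evaluation and export.
+
+The hyperparameters include `batch_size` and `iters`: the former is the batch size on a single GPU, and the latter is the number of training iterations (one forward and backward pass over a single batch counts as one iteration).
+
+In each config module, `type` is the class name of the corresponding component, and the other values are the input params of its `__init__` function.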
+
+For the dataset config module, the supported classes in `PaddleSeg/paddleseg/datasets` are registered by `@manager.DATASETS.add_component`.
+
+For the data transforms config module, the supported classes in `PaddleSeg/paddleseg/transforms/transforms.py` are registered by `@manager.TRANSFORMS.add_component`.
+
+For the optimizer config module, all optimizers provided by PaddlePaddle are supported. Please refer to the [document](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/optimizer/Overview_cn.html#api).
+
+For the lr_scheduler config module, all lr_schedulers provided by PaddlePaddle are supported. Please refer to the [document](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/optimizer/Overview_cn.html#about-lr).
+
+For the loss config module, `types` lists the loss classes to use, and `coef` defines the weight of each loss. The number of losses must equal the number of weights. If all losses are the same, a single loss can be listed under `types`. All supported classes in `PaddleSeg/paddleseg/models/losses/` are registered by `@manager.LOSSES.add_component`.
+
+For the model config module, the supported classes in `PaddleSeg/paddleseg/models/` are registered by `@manager.MODELS.add_component`, and the supported backbones in `PaddleSeg/paddleseg/models/backbones` are registered by `@manager.BACKBONES.add_component`.
+
+
+## Config File Demo
+
+```
+batch_size: 4  # batch size on single GPU
+iters: 1000  
+
+train_dataset:
+  type: Dataset
+  dataset_root: data/optic_disc_seg
+  train_path: data/optic_disc_seg/train_list.txt
+  num_classes: 2  # background is also a class
+  mode: train
+  transforms:
+    - type: ResizeStepScaling
+      min_scale_factor: 0.5
+      max_scale_factor: 2.0
+      scale_step_size: 0.25
+    - type: RandomPaddingCrop
+      crop_size: [512, 512]
+    - type: RandomHorizontalFlip
+    - type: RandomDistort
+      brightness_range: 0.5
+      contrast_range: 0.5
+      saturation_range: 0.5
+    - type: Normalize
+
+val_dataset:  
+  type: Dataset
+  dataset_root: data/optic_disc_seg
+  val_path: data/optic_disc_seg/val_list.txt
+  num_classes: 2
+  mode: val
+  transforms:
+    - type: Normalize
+
+optimizer:
+  type: sgd
+  momentum: 0.9
+  weight_decay: 4.0e-5
+
+lr_scheduler:
+  type: PolynomialDecay
+  learning_rate: 0.01
+  power: 0.9
+  end_lr: 0
+
+loss:
+  types:
+    - type: CrossEntropyLoss
+  coef: [1, 1, 1] # total_loss = coef_1 * loss_1 + .... + coef_n * loss_n
+
+model:
+  type: PPLiteSeg  
+  backbone:  
+    type: STDC2
+    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
+
+```
+
+## Others
+
+Note that:
+* For the data transforms of the train and val datasets, PaddleSeg automatically adds an image-reading operation at the beginning and an HWC->CHW transform at the end, so these two operations do not need to be listed in the config.
+* For the config files in `PaddleSeg/configs/quick_start`, the learning_rate corresponds to single-GPU training. For all other config files, the learning_rate corresponds to 4-GPU training.
+
+Besides, one config file can include another config file. For example, in the figure below, `deeplabv3p_resnet50_os8_cityscapes_1024x512_80k.yml` (right) uses `_base_` to include `../_base_/cityscapes.yml` (left).
+If a config value `X` is defined in both config files (`A` includes `B`), the `X` value in `A` overrides the `X` value in `B`.
+
+![](./images/fig3.png)
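
Review note on the `_base_` paragraph in `pre_config.md` above: the override rule may be easier to follow with a small example. The sketch below is not PaddleSeg's loader. `merge_config` is a hypothetical helper, the dict values are made up for illustration, and it assumes nested sections are merged key by key, with the including file (`A`) taking precedence over the included file (`B`).

```python
import copy


def merge_config(base: dict, child: dict) -> dict:
    """Return a new dict in which values from `child` override those in `base`.

    Hypothetical helper for illustration only; nested dicts are merged
    recursively, every other value is replaced outright.
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides define a section: merge it key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalar, list, or a key missing from base: the child wins.
            merged[key] = copy.deepcopy(value)
    return merged


# B: the included (base) config. Values are illustrative.
base = {
    "batch_size": 2,
    "iters": 80000,
    "optimizer": {"type": "sgd", "momentum": 0.9},
}

# A: the including config, which redefines some fields.
child = {
    "batch_size": 4,
    "optimizer": {"weight_decay": 4.0e-5},
}

merged = merge_config(base, child)
print(merged)
```

Here `batch_size` comes from the including config, `iters` is inherited unchanged, and the `optimizer` section ends up with keys from both files.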
diff --git a/docs/config/pre_config_cn.md b/docs/config/pre_config_cn.md
@@ -0,0 +1,105 @@
+简体中文 | [English](pre_config.md)
+
+# 准备配置文件
+
+PaddleSeg的配置文件按照模块化进行定义，包括超参、训练数据集、验证数据集、优化器、损失函数、模型等模块信息。
+
+不同模块信息都对应PaddleSeg中定义的模块类，所以PaddleSeg基于配置文件构建对应的模块，进行模型训练、评估和导出。
+
+PaddleSeg中所有语义分割模型都针对公开数据集，提供了对应的配置文件，保存在`PaddleSeg/configs`目录下。
+
+下面是`PaddleSeg/configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml`配置文件。我们以这个配置文件为例进行详细解读，让大家熟悉修改配置文件的方法。
+
+## 详细解读
+
+超参主要包括batch_size和iters，前者是单卡的batch_size，后者表示训练迭代的轮数（单个batch进行一次前向和反向表示一轮）。
+
+每个模块信息中，`type`字段对应到PaddleSeg代码中的模块类名(python class name)，其他字段对应模块类`__init__`函数的初始化参数。所以大家需要参考PaddleSeg代码中的模块类来修改模块信息。
+
+数据集dataset模块，支持的dataset类在`PaddleSeg/paddleseg/datasets`[目录](../../paddleseg/datasets/)下，使用`@manager.DATASETS.add_component`进行注册。
+
+数据预处理方式transforms模块，支持的transform类在`PaddleSeg/paddleseg/transforms/transforms.py`[文件](../../paddleseg/transforms/transforms.py)中，使用`@manager.TRANSFORMS.add_component`进行注册。
+
+优化器optimizer模块，支持Paddle提供的所有优化器类，具体参考[文档](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/optimizer/Overview_cn.html#api)。
+
+学习率衰减lr_scheduler模块，支持Paddle提供的所有lr_scheduler类，具体参考[文档](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/optimizer/Overview_cn.html#about-lr)。
+
+损失函数Loss模块，在`types`字段下分别定义使用的损失函数类，`coef`字段定义每个损失函数的权重。`types`字段下损失函数个数，应该等于`coef`字段数组的长度。如果所有损失函数相同，可以只定义一个损失函数。支持的损失函数类在`PaddleSeg/paddleseg/models/losses/`[目录](../../paddleseg/models/losses/)下，使用`@manager.LOSSES.add_component`注册。
+
+模型Model模块，支持的model类在`PaddleSeg/paddleseg/models/`[目录](../../paddleseg/models)下，使用`@manager.MODELS.add_component`注册。
+
+模型Model模块，支持的backbone类在`PaddleSeg/paddleseg/models/backbones`[目录](../../paddleseg/models/backbones/)下，使用`@manager.BACKBONES.add_component`注册。
+
+## 配置文件示例
+
+```
+batch_size: 4  #设定batch_size的值即为迭代一次送入网络的图片数量，一般显卡显存越大，batch_size的值可以越大。如果使用多卡训练，总的batch size等于该batch size乘以卡数。
+iters: 1000    #模型训练迭代的轮数
+
+train_dataset:  #训练数据设置
+  type: Dataset #指定加载数据集的类。数据集类的代码在`PaddleSeg/paddleseg/datasets`目录下。
+  dataset_root: data/optic_disc_seg #数据集路径
+  train_path: data/optic_disc_seg/train_list.txt  #数据集中用于训练的标识文件
+  num_classes: 2  #指定类别个数（背景也算为一类）
+  mode: train #表示用于训练
+  transforms: #模型训练的数据预处理方式。
+    - type: ResizeStepScaling #将原始图像和标注图像随机缩放为0.5~2.0倍
+      min_scale_factor: 0.5
+      max_scale_factor: 2.0
+      scale_step_size: 0.25
+    - type: RandomPaddingCrop #从原始图像和标注图像中随机裁剪512x512大小
+      crop_size: [512, 512]
+    - type: RandomHorizontalFlip  #对原始图像和标注图像随机进行水平反转
+    - type: RandomDistort #对原始图像进行亮度、对比度、饱和度随机变动，标注图像不变
+      brightness_range: 0.5
+      contrast_range: 0.5
+      saturation_range: 0.5
+    - type: Normalize #对原始图像进行归一化，标注图像保持不变
+
+val_dataset:  #验证数据设置
+  type: Dataset #指定加载数据集的类。数据集类的代码在`PaddleSeg/paddleseg/datasets`目录下。
+  dataset_root: data/optic_disc_seg #数据集路径
+  val_path: data/optic_disc_seg/val_list.txt  #数据集中用于验证的标识文件
+  num_classes: 2  #指定类别个数（背景也算为一类）
+  mode: val #表示用于验证
+  transforms: #模型验证的数据预处理的方式
+    - type: Normalize #对原始图像进行归一化，标注图像保持不变
+
+optimizer: #设定优化器的类型
+  type: sgd #采用SGD（Stochastic Gradient Descent）随机梯度下降方法为优化器
+  momentum: 0.9 #设置SGD的动量
+  weight_decay: 4.0e-5 #权值衰减，使用的目的是防止过拟合
+
+lr_scheduler: # 学习率的相关设置
+  type: PolynomialDecay # 一种学习率类型。共支持12种策略
+  learning_rate: 0.01 # 初始学习率
+  power: 0.9
+  end_lr: 0
+
+loss: #设定损失函数的类型
+  types:
+    - type: CrossEntropyLoss  #CE损失
+  coef: [1, 1, 1] # PP-LiteSeg有一个主loss和两个辅助loss，coef表示权重，所以 total_loss = coef_1 * loss_1 + .... + coef_n * loss_n
+
+model:  #模型说明
+  type: PPLiteSeg  #设定模型类别
+  backbone:  # 设定模型的backbone，包括名字和预训练权重
+    type: STDC2
+    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
+
+```
+
+## 其他
+
+注意：
+- 对于训练和测试数据集的预处理，PaddleSeg默认会添加读取图像操作、HWC转CHW的操作，所以这两个操作不用添加到transform配置字段中。
+- 只有"PaddleSeg/configs/quick_start"下面配置文件中的学习率为单卡学习率，其他配置文件中均为4卡的学习率。如果大家单卡训练来复现公开数据集上的指标，学习率设置应变成原来的1/4。
+
+
+上面我们介绍的PP-LiteSeg配置文件，所有的配置信息都放置在同一个yml文件中。为了具有更好的复用性，PaddleSeg的配置文件采用了更加解耦的设计，配置文件支持包含复用。
+
+如下图，右侧`deeplabv3p_resnet50_os8_cityscapes_1024x512_80k.yml`配置文件通过`_base_: '../_base_/cityscapes.yml'`来包含左侧`cityscapes.yml`配置文件，其中`_base_: `设置的是被包含配置文件相对于该配置文件的路径。
+
+如果两个配置文件具有相同的字段信息，被包含的配置文件中的字段信息会被覆盖。如下图，1号配置文件可以覆盖2号配置文件的字段信息。
+
+![](./images/fig3.png)
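
Review note on the `loss` module described in both config docs above: the demo config lists a single `CrossEntropyLoss` under `types` but three weights in `coef`, which the docs explain as one main loss plus two auxiliary losses for PP-LiteSeg. The sketch below shows one way to read that rule. `expand_loss_types` and `total_loss` are hypothetical helpers, not PaddleSeg functions, and the per-output loss values are made up.

```python
def expand_loss_types(types, coef):
    """Repeat a single loss entry so that there is one entry per weight.

    Hypothetical helper: mirrors the documented rule that a single loss
    may be listed when all losses are the same.
    """
    if len(types) == 1 and len(coef) > 1:
        return [dict(types[0]) for _ in coef]
    if len(types) != len(coef):
        raise ValueError("The number of losses and weights must be equal.")
    return [dict(t) for t in types]


def total_loss(loss_values, coef):
    """total_loss = coef_1 * loss_1 + ... + coef_n * loss_n"""
    if len(loss_values) != len(coef):
        raise ValueError("The number of losses and weights must be equal.")
    return sum(c * v for c, v in zip(coef, loss_values))


# Same shape as the `loss` section of the demo config.
loss_cfg = {"types": [{"type": "CrossEntropyLoss"}], "coef": [1, 1, 1]}
expanded = expand_loss_types(loss_cfg["types"], loss_cfg["coef"])

# Illustrative loss values for the main output and two auxiliary outputs.
values = [0.5, 0.25, 0.125]
print(len(expanded), total_loss(values, loss_cfg["coef"]))
```

With unequal lengths, for example two entries under `types` and three weights, the helper raises a `ValueError`, which matches the constraint stated in the docs.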
diff --git a/docs/data/custom/data_prepare.md b/docs/data/custom/data_prepare.md
@@ -1,4 +1,4 @@
-English|[简体中文](data_prepare_cn.md)
+English | [简体中文](data_prepare_cn.md)
 # Prepare Custom Dataset Data
 If you want to train on custom dataset, please prepare the dataset using following steps.
 

diff --git a/docs/data/custom/data_prepare_cn.md b/docs/data/custom/data_prepare_cn.md
@@ -1,4 +1,4 @@
-简体中文|[English](data_prepare.md)
+简体中文 | [English](data_prepare.md)
 # 准备自定义数据集
 如果您需要使用自定义数据集进行训练，请按照以下步骤准备数据。
 

diff --git a/docs/data/marker/LabelMe.md b/docs/data/marker/LabelMe.md
@@ -1,4 +1,4 @@
-English|[简体中文](LabelMe_cn.md)
+English | [简体中文](LabelMe_cn.md)
 # LabelMe
 
 If you have not installed it before, please refer to [LabelMe installation](https://paddlex.readthedocs.io/zh_CN/develop/data/annotation/labelme.html)

diff --git a/docs/data/marker/LabelMe_cn.md b/docs/data/marker/LabelMe_cn.md
@@ -1,4 +1,4 @@
-简体中文|[English](LabelMe.md)
+简体中文 | [English](LabelMe.md)
 # LabelMe
 
 如您先前并无安装，那么LabelMe的安装可参考[LabelMe安装和启动](https://paddlex.readthedocs.io/zh_CN/develop/data/annotation/labelme.html)

diff --git a/docs/deployment/inference/infer_benchmark.md b/docs/deployment/inference/infer_benchmark.md
@@ -1,4 +1,4 @@
-English|[简体中文](infer_benchmark_cn.md)
+English | [简体中文](infer_benchmark_cn.md)
 # Inference Benchmark
 
 Test Environment：

diff --git a/docs/deployment/inference/infer_benchmark_cn.md b/docs/deployment/inference/infer_benchmark_cn.md
@@ -1,4 +1,4 @@
-简体中文|[English](infer_benchmark.md)
+简体中文 | [English](infer_benchmark.md)
 # 推理 Benchmark
 
 测试环境：

diff --git a/docs/deployment/inference/python_inference.md b/docs/deployment/inference/python_inference.md
@@ -1,4 +1,4 @@
-English|[简体中文](python_inference_cn.md)
+English | [简体中文](python_inference_cn.md)
 # Paddle Inference Deployment（Python）
 
 ## 1. Description

diff --git a/docs/deployment/inference/python_inference_cn.md b/docs/deployment/inference/python_inference_cn.md
@@ -1,4 +1,4 @@
-简体中文|[English](python_inference.md)
+简体中文 | [English](python_inference.md)
 # Paddle Inference部署（Python）
 
 ## 1. 说明

diff --git a/docs/deployment/lite/lite.md b/docs/deployment/lite/lite.md
@@ -1,3 +1,4 @@
+English | [简体中文](lite_cn.md)
 # Deployment by PaddleLite
 
 ## 1. Introduction

diff --git a/docs/deployment/lite/lite_cn.md b/docs/deployment/lite/lite_cn.md
@@ -1,3 +1,5 @@
+简体中文 | [English](lite.md)
+
 # 移动端Lite部署
 
 ## 1.介绍
@@ -45,11 +47,11 @@ Paddle-Lite的编译目前支持Docker，Linux和Mac OS开发环境，建议使
 
 * 使用预编译版本的预测库，最新的预编译文件参考：[release](https://github.com/PaddlePaddle/Paddle-Lite/releases/)，此demo使用的[版本](https://paddlelite-demo.bj.bcebos.com/libs/android/paddle_lite_libs_v2_8_0.tar.gz)
 
-	解压上面文件，PaddlePredictor.jar位于：java/PaddlePredictor.jar；
+    解压上面文件，PaddlePredictor.jar位于：java/PaddlePredictor.jar；
 
-	arm64-v8a相关so位于：java/libs/arm64-v8a；
+    arm64-v8a相关so位于：java/libs/arm64-v8a；
 
-	armeabi-v7a相关so位于：java/libs/armeabi-v7a；
+    armeabi-v7a相关so位于：java/libs/armeabi-v7a；
 
 * 手动编译Paddle-Lite预测库
 开发环境的准备和编译方法参考：[Paddle-Lite源码编译](https://paddle-lite.readthedocs.io/zh/release-v2.8/source_compile/compile_env.html)。

diff --git a/docs/deployment/serving/serving.md b/docs/deployment/serving/serving.md
@@ -1,4 +1,4 @@
-English|[简体中文](serving_cn.md)
+English | [简体中文](serving_cn.md)
 # Paddle Serving deployment
 
 ## Overview

diff --git a/docs/deployment/serving/serving_cn.md b/docs/deployment/serving/serving_cn.md
@@ -1,4 +1,4 @@
-简体中文|[English](serving.md)
+简体中文 | [English](serving.md)
 # Paddle Serving部署
 
 ## 概述

diff --git a/docs/deployment/slim/distill/distill.md b/docs/deployment/slim/distill/distill.md
@@ -1,3 +1,5 @@
+English | [简体中文](distill_cn.md)
+
 # Model Distillation Tutorial
 
 # 1. Introduction

diff --git a/docs/deployment/slim/distill/distill_cn.md b/docs/deployment/slim/distill/distill_cn.md
@@ -1,3 +1,5 @@
+简体中文 | [English](distill.md)
+
 # 模型蒸馏教程
 
 ## 1 简介

diff --git a/docs/deployment/slim/quant/quant.md b/docs/deployment/slim/quant/quant.md
@@ -1,3 +1,5 @@
+English | [简体中文](quant_cn.md)
+
 # Model Quantization Tutorial