Bobholamovic 2 years ago
parent
commit
e3716ab300

+ 2 - 2
docs/CONTRIBUTING_CN.md

@@ -8,7 +8,7 @@
 
 ### 1 Code Contribution Steps
 
-PaddleRS uses [Git](https://git-scm.com/doc) as its version control tool and is hosted on GitHub. This means that, before contributing code, you need to be familiar with git operations and have some understanding of the GitHub workflow based on [pull requests (PRs)](https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).
+PaddleRS uses [Git](https://git-scm.com/doc) as its version control tool and is hosted on GitHub. This means that, before contributing code, you need to be familiar with Git operations and have some understanding of the GitHub workflow based on [pull requests (PRs)](https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).
 
 The specific steps for contributing code to PaddleRS are as follows:
 
@@ -78,7 +78,7 @@ The code style specifications of PaddleRS are basically the same as the [Google Python style guide](https://zh-
 
 - Parentheses: Parentheses can be used for line continuation, but do not use unnecessary parentheses in `if` conditions.
 
-- Exceptions: When raising and catching exceptions, use the most specific exception type possible, and almost never use the base class `Exception` (unless the purpose is to catch any exception of unrestricted type).
+- Exceptions: When raising and catching exceptions, use the most specific exception type possible, and almost never use the base classes `Exception` and `BaseException` (unless the purpose is to catch any exception of unrestricted type).
 
 - Comments: Write all comments in English. All APIs provided to users must have docstrings, with at least two sections: an API function description and API parameters. Surround a docstring with triple double quotes `"""`. For details on writing docstrings, see the [Code Comment Specifications](dev/docstring_cn.md).
 

+ 22 - 22
docs/CONTRIBUTING_EN.md

@@ -4,22 +4,22 @@
 
 ## Contribute Code
 
-This guide starts with the necessary steps to contribute code to PaddleRS, and then goes into details on self-inspection on newly added files, code style specification, and testing steps.
+This guide starts with the necessary steps to contribute code to PaddleRS, and then goes into detail on the self-check for newly added files, code style specifications, and testing steps.
 
 ### 1 Code Contribution Steps
 
-PaddleRS uses [Git](https://git-scm.com/doc) as a version control tool and is hosted on GitHub. This means that you need to be familiar with git before contributing code. And you need to be familiar with [pull request (PR)](https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests) based on the GitHub workflow.
+PaddleRS uses [Git](https://git-scm.com/doc) as the version control tool and is hosted on GitHub. This means that, before contributing code, you need to be familiar with Git and with GitHub workflows based on [pull requests (PRs)](https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).
 
 The steps to contribute code to PaddleRS are as follows:
 
 1. Fork the official PaddleRS repository on GitHub, clone the code locally, and pull the develop branch.
 2. Write code according to [Dev Guide](dev/dev_guide_en.md) (it is recommended to develop on a new feature branch).
-3. Install pre-commit hooks to perform code style checks before each commit. Refer to [Code style specification](#3-code-style-specification).
-4. Write unit tests for the new code and make sure all the tests are successful. Refer to [Test related steps](#4-test-related-steps).
+3. Install pre-commit hooks to perform code style checks before each commit. Refer to [code style specifications](#3-code-style-specifications).
+4. Write unit tests for the new code and make sure all the tests are successful. Refer to [test related steps](#4-test-related-steps).
 5. Create a new PR for your branch and ensure that the CLA has been signed and the CI/CE checks finish with no errors. After that, a PaddleRS team member will review the code you contributed.
 6. Modify the code according to the review comments and resubmit it until the PR is merged or closed.
 
-If you contribute code that uses a third-party library that PaddleRS does not currently rely on, please explain when you submit your PR. Also, you should explain why this third-party library need to be used.
+If you contribute code that uses a third-party library that PaddleRS does not currently depend on, please explain this when you submit your PR. Also, you should explain why this third-party library needs to be used.
 
 ### 2 Self-Check on Added Files
 
@@ -45,15 +45,15 @@ Copyright information must be added to each new file in PaddleRS, as shown below
 # limitations under the License.
 ```
 
-*Note: The year in copyright information needs to be rewritten according to the current natural year.*
+*Note: The year in the copyright information needs to be replaced by the current calendar year.*
 
-#### 2.2 The Order of Module Import
+#### 2.2 Order of Import Statements
 
-All global import statements must be at the beginning of the module, right after the copyright information. Import packages or modules in the following order:
+All global import statements must be put at the beginning of the module, right after the copyright information. Import packages or modules in the following order:
 
 1. Python standard libraries;
-2. Third-party libraries installed through package managers such as `pip`(note that `paddle` is a third-party library, but `paddlers` is not itself a third-party library);
-3. `paddlers` and `paddlers` subpackages and modules.
+2. Third-party libraries installed through package managers such as `pip` (note that `paddle` is a third-party library, but `paddlers` is not itself a third-party library);
+3. `paddlers` and its subpackages and modules.
 
 There should be a blank line between import statements of different types. The file should not contain import statements for unused packages or modules. In addition, if the lengths of the import statements vary greatly, you are advised to arrange them in ascending order of length. An example is shown below:
 
@@ -68,9 +68,9 @@ import paddlers.transforms as T
 from paddlers.transforms import DecodeImg
 ```
 
-### 3 Code Style Specification
+### 3 Code Style Specifications
 
-PaddleRS' code style specification is basically the same as the [Google Python Style specification](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules/), except that PaddleRS does not enforce type annotation (i.e. type hints, refer to [PEP 483](https://peps.python.org/pep-0483/) and [PEP 484](https://peps.python.org/pep-0484/)). Some of the important code style specifications are:
+The code style guidelines of PaddleRS are basically the same as the [Google Python style rules](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules/), except that PaddleRS does not enforce type annotations (i.e., type hints; please refer to [PEP 483](https://peps.python.org/pep-0483/) and [PEP 484](https://peps.python.org/pep-0484/)). Some of the important code style specifications are:
 
 - Blank lines: Two blank lines between top-level definitions (such as top-level function or class definitions). There should be a blank line between the definitions of different methods within a class. Inside a function, add a blank line where there is a logical break.
 
@@ -78,9 +78,9 @@ PaddleRS' code style specification is basically the same as the [Google Python S
 
 - Parentheses: Parentheses can be used for line continuation, but do not use unnecessary parentheses in `if` conditions.
 
-- Exceptions: Throw and catch exceptions with as specific an exception type as possible, and almost never use the base class `Exception` (unless the purpose is to catch any exception of any type).
+- Exceptions: Throw and catch exceptions with as specific an exception type as possible, and almost never use the base classes `Exception` and `BaseException` (unless the purpose is to catch all exceptions). See the sketch after this list.
 
-- Comments: All comments should be written in English. All apis provided to users must have docstrings added with at least two sections, "API Function Description" and "API Parameters". Surround a docstring with three double quotes `"""`. See the [Code Comment Specification](dev/docstring_en.md) for details on docstring writing.
+- Comments: All comments should be written in English. All APIs provided to users must have docstrings added with at least two parts, function description and function parameters. Surround a docstring with three double quotes `"""`. See the [Code Comment Specification](dev/docstring_en.md) for details on writing docstrings.
 
 - Naming: Variable names of different types apply the following case rules: module name: `module_name`; package name: `package_name`; class name: `ClassName`; method name: `method_name`; function name: `function_name`; name of a global constant (a variable whose value does not change during the running of the program): `GLOBAL_CONSTANT_NAME`; global variable name: `global_var_name`; instance name: `instance_var_name`; function parameter name: `function_param_name`; local variable name: `local_var_name`.
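+
+A small illustration of the exception rule above (the function and message are invented for this example):
+
+```python
+def check_band_index(band_index, num_bands):
+    # Raise the most specific built-in exception that fits the error.
+    if not 0 <= band_index < num_bands:
+        raise IndexError(
+            f"band_index should be in [0, {num_bands}), got {band_index}")
+
+try:
+    check_band_index(4, num_bands=3)
+except IndexError as e:  # Catch the specific type, not bare `Exception`.
+    print(e)
+```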
 
@@ -91,30 +91,30 @@ To ensure code quality, the contributor is required to add unit test cases for t
 #### 4.1 Unit Tests for Models
 
 1. Find the test case definition file corresponding to the task of the model in `tests/rs_models/`. For example, the change detection task corresponds to `tests/rs_models/test_cd_models.py`.
-2. Define a test class for the new model that inherits from `Test{task name}Model` and sets its `MODEL_CLASS` property to the new model, following the example already in the file.
+2. Define a test class for the new model that inherits from `Test{task name}Model` and sets its `MODEL_CLASS` property to the new model, following the existing examples in the file.
 3. Override the new test class's `test_specs()` method. This method sets `self.specs` to a list of dictionaries, whose key-value pairs are used as arguments to the model constructor. That is, each item in `self.specs` corresponds to a group of test cases, each of which tests the model constructed with a particular set of parameters. A minimal sketch is given below.
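+
+A hedged sketch for a hypothetical change detection model `FooNet`, assuming (per the `Test{task name}Model` pattern) that the base class is named `TestCDModel`; follow the real classes in the file:
+
+```python
+# Hypothetical example; mimic the existing classes in tests/rs_models/test_cd_models.py.
+class TestFooNetModel(TestCDModel):
+    MODEL_CLASS = paddlers.rs_models.cd.FooNet  # `FooNet` is made up here
+
+    def test_specs(self):
+        # Each dict holds one set of constructor arguments; each entry
+        # yields a group of test cases for the model built with them.
+        self.specs = [
+            dict(in_channels=3, num_classes=2),
+            dict(in_channels=6, num_classes=2),
+        ]
+```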
 
-#### 4.2 Unit Tests for Data Preprocessing/Data Augmentation Functions and Operators
+#### 4.2 Unit Tests for Data Preprocessing / Data Augmentation Functions and Operators
 
-- If you contribute an data preprocessing/augmentation operator (inherited from `paddlers.transforms.operators.Transform`), all the necessary input parameters to construct the operator have default values, and the operator can handle any task and arbitrary number of bands, then you need to add a new method to the `TestTransform` class in the `tests/transforms/test_operators.py`, mimicking the `test_Resize()` or `test_RandomFlipOrRotate()` methods.
-- If you contribute an operator that only supports processing for a specific task or has requirements for the number of bands in the input data, bind the operator with `_InputFilter` in `OP2FILTER`.
-- If you contribute a data preprocessing/data augmentation function (i.e. `paddlers/transforms/functions.py`), add a test class in `tests/transforms/test_functions.py` mimicking the existing example.
+- If you are contributing a data preprocessing / augmentation operator (inheriting from `paddlers.transforms.operators.Transform`) whose required construction parameters all have default values and which can handle any task and an arbitrary number of bands, add a new method to the `TestTransform` class in `tests/transforms/test_operators.py`, mimicking the `test_Resize()` or `test_RandomFlipOrRotate()` methods.
+- If you are contributing an operator that only supports processing for a specific task or has requirements for the number of bands in the input data, bind the operator with `_InputFilter` in `OP2FILTER`.
+- If you are contributing a data preprocessing / data augmentation function (i.e. `paddlers/transforms/functions.py`), add a test class in `tests/transforms/test_functions.py` mimicking the existing example.
 
 #### 4.3 Unit Tests for Tools
 
 1. Create a new file in the `tests/tools/` directory and name it `test_{tool name}.py`.
 2. Write the test case in the newly created script.
 
-#### 4.4 Execute the Test
+#### 4.4 Execute the Tests
 
-After adding the test cases, you need to execute all of the tests in their entirety (because the new code may affect the original code of the project and make some of the functionality not work properly). Enter the following command:
+After adding the test cases, you need to execute all tests in full. Run the following commands:
 
 ```bash
 cd tests
 bash run_tests.sh
 ```
 
-This process can be time-consuming and requires patience. If some of the test cases do not pass, modify them based on the error message until all of them pass.
+This process can be time-consuming and requires patience. If some of the test cases fail, modify them based on the error message until all of them pass.
 
 Run the following script to obtain coverage information:
 

+ 1 - 1
docs/data/dataset_cn.md

@@ -18,4 +18,4 @@ PaddleRS has collected and summarized commonly used **open source** deep learning datasets in the field of remote sensing,
 
 ## Specially Contributed Datasets
 
-* The 2020 road sample dataset of typical Chinese cities, CHN6-CUG, is provided by Professor [Qiqi Zhu](http://grzy.cug.edu.cn/zhuqiqi) of China University of Geosciences. See [here](http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/content/1733.htm) for an introduction and download instructions.
+* The 2020 road sample dataset of typical Chinese cities, CHN6-CUG, is provided by Professor [Qiqi Zhu](http://grzy.cug.edu.cn/zhuqiqi) of China University of Geosciences. See [here](http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/content/1733.htm) for an introduction and download instructions.

+ 3 - 3
docs/data/dataset_en.md

@@ -1,8 +1,8 @@
 [简体中文](dataset_cn.md) | English
 
-# Remote Sensing Open Source Dataset
+# Remote Sensing Open Source Datasets
 
-PaddleRS has collected and summarized the most commonly used **open source** deep learning data sets in the field of remote sensing, providing the following information for each data set: dataset description, image information, annotation information, source address, and AI Studio backup link. According to the task type, these data sets can be divided into **image classification, image segmentation, change detection, object detection, object tracking, multi-label classification, image generation** and other types. Currently, the collected remote sensing datasets include:
+PaddleRS has collected and summarized the most commonly used **open source** datasets in the field of remote sensing, providing the following information for each dataset: dataset description, image information, annotation information, source address, and AI Studio backup link. According to the task type, these datasets can be divided into **image classification, image segmentation, change detection, object detection, object tracking, multi-label classification, image generation**, and other types. Currently, the collected remote sensing datasets include:
 
 * 32 image classification datasets;
 * 40 object detection datasets;
@@ -18,4 +18,4 @@ Visit [Remote Sensing Data Set Summary](./dataset_summary_en.md) for more inform
 
 ## Specially Contributed Datasets
 
-* Sample data of typical urban roads in China of 2020(CHN6-CUG), provided by Professor [Qiqi Zhu](http://grzy.cug.edu.cn/zhuqiqi), China University of Geosciences. Please refer to [here](http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/content/1733.htm) for more information and download information.
+* The CHN6-CUG dataset is provided by Professor [Qiqi Zhu](http://grzy.cug.edu.cn/zhuqiqi) from China University of Geosciences. Please refer to [this website](http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/content/1733.htm) for more information.

+ 1 - 1
docs/data/rs_data_cn.md

@@ -31,7 +31,7 @@ L = gain * DN + bias \\
 \rho = \pi Ld^{2}_{s}/(E_{0}\cos{\theta})
 $$
 
-The electromagnetic spectrum is the result of arranging electromagnetic waves in order of wavelength or frequency, wave number, energy, and so on. In the electromagnetic spectrum, the human eye can perceive only a small range of bands, known as visible light, with wavelengths of 0.38-0.76μm. This is because our vision evolved to be most sensitive where the sun emits the most light, and is broadly limited to the wavelengths that make up what we call red, green, and blue. However, satellite sensors can perceive a much wider range of the electromagnetic spectrum, which allows us to perceive a much wider spectral range with their help.
+The electromagnetic spectrum is the result of arranging electromagnetic waves in order of wavelength or frequency, wave number, energy, and so on. In the electromagnetic spectrum, the human eye can perceive only a small range of bands, known as visible light, with wavelengths of 0.38-0.76μm. This is because our visual system evolved to be most sensitive where the sun emits the most light, and is broadly limited to the wavelengths that make up what we call red, green, and blue. However, satellite sensors can perceive a much wider range of the electromagnetic spectrum, which allows us to perceive a much wider spectral range with their help.
 
 ![band](../images/band.jpg)
 

+ 27 - 27
docs/data/rs_data_en.md

@@ -2,48 +2,48 @@
 
 # Introduction to Remote Sensing Data
 
-## 1 Definition of Remote Sensing and Remote Sensing Images
+## 1 Concepts of Remote Sensing and Remote Sensing Images
 
-In a broad sense, remote sensing refers to "remote perception", that is, the remote detection and perception of objects or natural phenomena without direct contact. Remote sensing in the narrow sense generally refers to electromagnetic wave remote sensing technology, that is, the process of detecting electromagnetic wave reflection characteristics by using sensors on a certain platform (such as aircraft or satellite) and extracting information from it. The image data from this process is known as remote sensing imagery and generally includes satellite and aerial imagery. Remote sensing data are widely used in GIS tasks such as spatial analysis, as well as computer vision (CV) fields including scene classification, image segmentation and object detection.
+In a broad sense, remote sensing refers to *remote perception*, that is, the remote detection and perception of objects or natural phenomena without direct contact. Remote sensing in the narrow sense generally refers to electromagnetic wave remote sensing technology, that is, the process of detecting electromagnetic wave reflection characteristics by using sensors on a certain platform (such as aircraft or satellite) and extracting information from it. The image data from this process is known as remote sensing imagery, which generally includes satellite and aerial imagery. Remote sensing data are widely used in GIS tasks such as spatial analysis, as well as computer vision (CV) fields including scene classification, image segmentation, and object detection.
 
-Compared with aerial images, satellite images cover a wider area, so they are used more widely. Common satellite images may be taken by commercial satellites or come from open databases of agencies such as NASA and ESA.
+Compared with aerial images, satellite images cover a wider area, so they are used more broadly. Satellite images may be taken by commercial satellites or come from open databases of agencies such as NASA and ESA.
 
 ## 2 Characteristics of Remote Sensing Images
 
-Remote sensing technology has the characteristics of macroscopic, multi-band, periodicity and economic. Macroscopic refers to that the higher the remote sensing platform is, the wider the perspective will be, and the wider the ground can be synchronously detected. The multi-band property means that the sensor can detect and record information in different bands such as ultraviolet, visible light, near infrared and microwave. Periodicity means that the remote sensing satellite has the characteristic of acquiring images repeatedly in a certain period, which can carry out repeated observation of the same area in a short time. Economic means that remote sensing technology can be used as a way to obtain large area of surface information without spending too much manpower and material resources.
+Remote sensing technology is macroscopic, multi-band, periodic, and economical. Macroscopic means that the higher the remote sensing platform is, the wider the perspective will be, and the wider the area of ground that can be detected synchronously. The multi-band property means that the sensor can detect and record information in different bands, such as ultraviolet, visible light, near infrared, and microwave. Periodic means that a remote sensing satellite acquires images repeatedly in a certain period, which allows repeated observation of the same area within a short time. Economical means that remote sensing technology can obtain information on large areas of the ground surface without spending too much manpower and material resources.
 
-The characteristics of remote sensing technology determine that remote sensing image has the following characteristics:
+These characteristics of remote sensing technology mean that remote sensing images have the following properties:
 
 1. Large scale. A remote sensing image can cover a vast surface area.
-2. Multispectral. Compared with natural images, remote sensing images often have a larger number of bands.
-3. Rich source. Different sensors, different satellites can provide a variety of data sources.
+2. Multi-spectral. Compared with natural images, remote sensing images often have a larger number of bands.
+3. Rich data sources. Different sensors and different satellites can provide a variety of data sources.
 
-## 3 Definition of Raster Image and Imaging Principle of Remote Sensing Image
+## 3 Concept of Raster Image and Imaging Principle of Remote Sensing Image
 
-In order to introduce the imaging principle of remote sensing image, the concept of raster should be introduced first. Raster is a pixel-based data format that effectively represents continuous surfaces. The information in the raster is stored in a grid structure, and each information cell or pixel has the same size and shape, but different values. Digital photographs, orthophoto images and satellite images can all be stored in this format.
+In order to introduce the imaging principle of remote sensing images, the concept of raster should be introduced first. A raster is a pixel-based data format that effectively represents continuous surfaces. The information in a raster is stored in a grid structure, and each information unit, or pixel, has the same size and shape but a different value. Digital photographs, orthophoto images, and satellite images can all be stored in this format.
 
-Raster formats are ideal for analysis that concentrates on spatial and temporal changes because each data value has a grid-based accessible location. This allows us to access the same geographic location in two or more different grids and compare their values.
+Raster formats are ideal for analysis that concentrates on spatial and temporal changes, because each data value has a grid-based accessible location. This allows us to access the same geographic location in two or more different grids and compare their values.
 
-When the earth observation satellite takes a picture, the sensor will record the DN (Digital Number) value of different wavelength electromagnetic wave in the grid pixel. Through DN, the irradiance and reflectance of ground objects can be inversely calculated. The relationship between them is shown in the following formula, where $gain$ and $bias$ refer to the gain and offset of the sensor respectively; $L$ is irradiance, also known as radiant brightness value; $\rho$ is the reflectance of ground objects; $d_{s}$, $E_{0}$ and $\theta$ respectively represent the distance between solar and earth astronomical units, solar irradiance and solar zenith angle.
+When an earth observation satellite takes a picture, the sensor records the DN (Digital Number) values of electromagnetic waves of different wavelengths in the grid pixels. From the DN values, the irradiance and reflectance of ground objects can be inversely calculated. The relationship between them is shown in the following formulas, where $gain$ and $bias$ refer to the gain and offset of the sensor, respectively; $L$ is the irradiance, also known as the radiant brightness value; $\rho$ is the reflectance of ground objects; and $d_{s}$, $E_{0}$, and $\theta$ represent the sun-earth distance in astronomical units, the solar irradiance, and the solar zenith angle, respectively.
 
 $$
 L = gain * DN + bias \\
 \rho = \pi Ld^{2}_{s}/(E_{0}\cos{\theta})
 $$
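+
+For instance, plugging illustrative (not sensor-specific) values $gain = 0.05$, $bias = 1.0$, and $DN = 800$ into the first formula gives
+
+$$
+L = 0.05 \times 800 + 1.0 = 41.0
+$$
+
+in the sensor's radiance units. In practice, the gain and offset of a real sensor are typically published in the product metadata.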
 
-The electromagnetic spectrum is the result of human beings according to the order of wave length or frequency, wave number, energy, etc. The human eye perceives only a small range of wavelengths in the electromagnetic spectrum, known as visible light, in the range of 0.38 to 0.76μm. That's because our vision evolved to be most sensitive where the sun emits the most light, and is broadly limited to the wavelengths that make up what we call red, green, and blue. But satellite sensors can sense a much wider range of the electromagnetic spectrum, which allows us to sense a much wider range of the spectrum with the help of sensors.
+The electromagnetic spectrum is the result of arranging electromagnetic waves in order of wavelength, frequency, wave number, energy, and so on. The human eye perceives only a small range of wavelengths in the electromagnetic spectrum, known as visible light, in the range of 0.38-0.76μm. This is because our visual system evolved to be most sensitive where the sun emits the most light, and is broadly limited to the wavelengths that make up what we call red, green, and blue. In contrast, satellite sensors can sense a much wider range of the electromagnetic spectrum, allowing us to perceive far more of the spectrum with their help.
 
 ![band](../images/band.jpg)
 
-The electromagnetic spectrum is so wide that it is impractical to use a single sensor to collect information at all wavelengths at once. In practice, different sensors give priority to collecting information from different wavelengths of the spectrum. Each part of the spectrum captured and classified by the sensor is classified as an information strip. The tape varies in size and can be compiled into different types of composite images, each emphasizing different physical properties. At the same time, most remote sensing images are 16-bit images, different from the traditional 8-bit images, which can represent finer spectral information.
+The electromagnetic spectrum is so wide that it is not possible to use a single sensor to collect information at all wavelengths at once. In practice, different sensors give priority to collecting information from different wavelengths of the spectrum. Each part of the spectrum captured and classified by a sensor is called a band. Bands vary in width and can be combined into different types of composite images, each emphasizing different physical properties. At the same time, most remote sensing images are 16-bit images, unlike traditional 8-bit images, and can represent finer spectral information.
 
-## 4 Classification of Remote Sensing Images
+## 4 Types of Remote Sensing Images
 
-Remote sensing image has the characteristics of wide coverage area, large number of bands and rich sources, and its classification is also very diverse. For example, remote sensing image can be divided into low resolution remote sensing image, medium resolution remote sensing image and high resolution remote sensing image according to spatial resolution. According to the number of bands, it can be divided into multi-spectral image, hyperspectral image, panchromatic image and other types. This document is intended to provide a quick guide for developers who do not have a background in remote sensing. Therefore, only a few common types of remote sensing images are described.
+Remote sensing images have the characteristics of wide coverage area, large number of bands, and rich sources, and they come in various types. For example, remote sensing images can be divided into low-resolution, medium-resolution, and high-resolution remote sensing images according to spatial resolution. According to the number of bands, they can be divided into multi-spectral images, hyperspectral images, panchromatic images, and other types. This document is intended to provide a quick guide for developers who do not have a background in remote sensing. Therefore, only a few common types of remote sensing images are introduced.
 
 ### 4.1 RGB Image
 
-RGB images are similar to common natural images in daily life. The features displayed in RGB images are also in line with human visual common sense (for example, trees are green, cement is gray, etc.), and the three channels represent red, green and blue respectively. The figure below shows an RGB remote sensing image:
+RGB images are similar to common natural images in daily life. The features displayed in RGB images are also in line with human visual common sense (for example, trees are green, cement is gray, etc.), and the three channels represent red, green, and blue, respectively. The figure below shows an RGB remote sensing image:
 
 ![rgb](../images/rgb.jpg)
 
@@ -51,21 +51,21 @@ Since the data processing pipelines of most CV tasks are designed based on natur
 
 ### 4.2 MSI/HSI Image
 
-MSI (Multispectral Image) and HSI (Hyperspectral Image) usually consist of several to hundreds of bands. The two are distinguished by different spectral resolution (*spectral resolution refers to the value of a specific wavelength range in the electromagnetic spectrum that can be recorded by the sensor; the wider the wavelength range, the lower the spectral resolution*). Usually the spectral resolution in the order of 1/10 of the wavelength is called multispectral. MSI has fewer bands, wider band, and higher spatial resolution, while HSI has more bands, narrower bands, and higher spectral resolution. However, HSI has more bands, narrower bands and higher spectral resolution.
+MSI (multi-spectral image) and HSI (hyperspectral image) usually consist of several to hundreds of bands. The two are distinguished by their spectral resolution (*spectral resolution refers to the specific wavelength range in the electromagnetic spectrum that the sensor can record; the wider the wavelength range, the lower the spectral resolution*). Usually, spectral resolution on the order of 1/10 of the wavelength is called multi-spectral. MSI has fewer, wider bands and higher spatial resolution, while HSI has more, narrower bands and higher spectral resolution.
 
-In practice, some specific bands of MSI/HSI are often selected according to application requirements: for example, the transmittance of mid-infrared band is 60%-70%, including ground object reflection and emission spectrum, which can be used to detect high temperature targets such as fire. The red-edge band (*the point where the reflectance of green plants increases fastest between 0.67 and 0.76μm, and is also the inflection point of the first derivative spectrum in this region*) is a sensitive band indicating the growth status of green plants. It can effectively monitor the growth status of vegetation and be used to study plant nutrients, health monitoring, vegetation identification, physiological and biochemical parameters and other information.
+In practice, some specific bands of MSI/HSI are often selected according to application demands: for example, the transmittance of the mid-infrared band is 60%-70%, and it includes both ground object reflection and emission spectra, so it can be used to detect high-temperature targets such as fire. The red-edge band (*the point where the reflectance of green plants increases fastest between 0.67μm and 0.76μm, which is also the inflection point of the first derivative spectrum in this region*) is a sensitive band indicating the growth status of green plants. It can effectively monitor the growth status of vegetation and be used to study plant nutrients, plant health, vegetation identification, and physiological and biochemical parameters.
 
-The following takes the image of Beijing Daxing Airport taken by Tiangong-1 hyperspectral imager as an example to briefly introduce the concepts of band combination, spectral curve and band selection commonly used in MSI/HSI processing. In the hyperspectral data set of Tiangong-1, bands with low signal-to-noise ratio and information entropy were eliminated based on the evaluation results of band signal-to-noise ratio and information entropy, and some bands were eliminated based on the actual visual results of the image. A total of 54 visible near-infrared spectrum segments, 52 short-wave infrared spectrum segments and the whole chromatographic segment data were retained.
+The following gives an example to briefly introduce the concepts of band combination, spectral curve, and band selection commonly used in MSI/HSI processing, based on a hyperspectral image of Beijing Daxing Airport taken by the Tiangong-1 hyperspectral imager. In this image, bands with a low signal-to-noise ratio or low information entropy were eliminated based on the evaluation results, and some bands were eliminated based on the actual visual quality of the image. A total of 54 visible and near-infrared bands, 52 short-wave infrared bands, and the panchromatic band were retained.
 
 **Band Combination**
 
-Band combination refers to the result obtained by selecting three band data in MSI/HSI to combine and replace the three RGB channels, which is called the color graph (*The result synthesized using the real RGB three bands is called the true color graph, otherwise it is called the false color graph*). The combination of different bands can highlight different features of ground objects. The following figure shows the visual effects of several different combinations:
+Band combination refers to the result obtained by selecting three bands in MSI/HSI to replace the three RGB channels. *If the result is synthesized using the real RGB bands, it is called a true color image. Otherwise, it is called a false color image.* The combination of different bands can highlight different features of ground objects. The following figure shows the visual effects of several different combinations:
 
 ![Figure 3](../images/band_combination.jpg)
 
 **Spectral Curve Interpretation**
 
-Spectral information can often reflect the features of ground objects, and different bands reflect different features of ground objects. Spectral curves can be drawn by taking the wavelength or frequency of electromagnetic wave as the horizontal axis and the reflectance as the vertical axis. Taking the spectral curve of vegetation as an example, as shown in the figure below, the reflectance of vegetation is greater than 40% in the band of 0.8μm, which is significantly greater than that of about 10% in the band of 0.6μm, so more radiation energy is reflected back during imaging. Reflected in the image, the vegetation appears brighter in the 0.8μm image.
+Spectral information can often reflect the features of ground objects, and different bands reflect different features. Spectral curves can be drawn by taking the wavelength or frequency of the electromagnetic wave as the horizontal axis and the reflectance as the vertical axis. Taking the spectral curve of vegetation as an example, as shown in the figure below, the reflectance of vegetation is greater than 40% near 0.8μm, significantly greater than the roughly 10% reflectance near 0.6μm, so more radiation energy is reflected during imaging. As a result, the vegetation appears brighter in the 0.8μm image.
 
 ![band_mean](../images/band_mean.jpg)
 
@@ -75,29 +75,29 @@ MSI/HSI may contain a larger number of bands. For one thing, not all bands are s
 
 ### 4.3 SAR Image
 
-Synthetic Aperture Radar (SAR) refers to active side-looking radar systems. The imaging geometry of SAR belongs to the slant projection type, so SAR image and optical image have great differences in imaging mechanism, geometric features, radiation features and other aspects.
+Synthetic Aperture Radar (SAR) refers to active side-looking radar systems. The imaging geometry of SAR belongs to the slant projection type, so SAR images and optical images have great differences in imaging mechanism, geometric features, radiation features, and other aspects.
 
 The information of different bands in optical images comes from the reflected energy of electromagnetic waves of different wavelengths, while SAR images record echo information of different polarizations (*that is, the vibration direction of electromagnetic wave transmission and reception*) in binary and complex forms. Based on the recorded complex data, the original SAR image can be transformed to extract the corresponding amplitude and phase information. Human beings cannot directly distinguish the phase information, but they can intuitively perceive the amplitude information, and intensity images can be obtained by using the amplitude information, as shown in the figure below:
 
 ![sar](../images/sar.jpg)
 
-Due to the special imaging mechanism of SAR image, its resolution is relatively low, and the signal-to-noise ratio is also low, so the amplitude information contained in SAR image is far from the imaging level of optical image. This is why SAR images are rarely used in the CV field. At present, SAR image is mainly used for settlement detection inversion and 3D reconstruction based on phase information. It is worth mentioning that SAR has its unique advantages in some application scenarios due to its long wavelength and certain cloud and surface penetration ability.
+Due to the special imaging mechanism of SAR images, their resolution is relatively low and the signal-to-noise ratio is also low, so the amplitude information contained in SAR images is far less rich than that of optical images. This is why SAR images are rarely used in the CV field. It is worth mentioning that SAR has unique advantages in some application scenarios, owing to its long wavelength and its ability to penetrate clouds and, to a certain extent, the surface.
 
 ### 4.4 RGBD Image
 
-The difference between RGBD images and RGB images is that there is an extra D channel in RGBD images, namely the depth. Depth images are similar to grayscale images, except that each pixel value represents the actual distance of the sensor from the object. Generally, RGB data and depth data in RGBD images are registered with each other. Depth image provides height information that RGB image does not have, and can distinguish some ground objects with similar spectral characteristics in some downstream tasks.
+The difference between RGBD images and RGB images is that there is an extra D channel in RGBD images, namely the depth. Depth images are similar to grayscale images, except that each pixel value represents the actual distance of the sensor from the object. Generally, RGB data and depth data in RGBD images are registered with each other. Depth images provide height information that RGB images do not have, and can distinguish some ground objects with similar spectral characteristics in downstream tasks.
 
-## 5 The Preprocessing of Remote Sensing Image
+## 5 Preprocessing of Remote Sensing Images
 
 Compared with natural images, the preprocessing of remote sensing images is rather complicated. Specifically, it can be divided into the following steps:
 
 1. **Radiometric Calibration**: The DN values are converted into radiance (radiant brightness values), reflectivity, and other physical quantities.
 2. **Atmospheric Correction**: The radiation error caused by atmospheric influence is eliminated and the real surface reflectance of surface objects is retrieved. This step together with radiometric calibration is called **Radiometric Correction**.
 3. **Orthographic Correction**: Tilt correction and projection difference correction are carried out at the same time, and the image is resampled to an orthophoto.
-4. **Image Registration**: Match and overlay two or more images taken at different times, from different sensors (imaging equipment) or under different conditions (weather, illumination, camera position and angle, etc.).
+4. **Image Registration**: Match two or more images taken at different times, from different sensors (imaging equipment), or under different conditions (weather, illumination, camera position, angle, etc.).
 5. **Image Fusion**: Image data of the same object collected from multiple source channels are synthesized into a high-quality image.
 6. **Image Clipping**: A large remote sensing image is cut into small pieces to extract the region of interest.
-7. **Define Projection**: Define projection information (a geographic coordinate system) on the data.
+7. **Define Projection**: Define a geographic coordinate system on the data.
 
 It should be noted that in practical application, the above steps are not all necessary, and some of them can be performed selectively according to needs.
 

+ 23 - 23
docs/dev/dev_guide_en.md

@@ -4,9 +4,9 @@
 
 ## 0 Catalog
 
-- [Add Remote Sensing Dedicated Model](#1-add-remote-sensing-dedicated-models)
+- [Add Remote Sensing Dedicated Models](#1-add-remote-sensing-dedicated-models)
 
-- [Add Data Preprocessing/Data Augmentation Function or Operator](#2-add-data-preprocessing/data-augmentation-functions-or-operators)
+- [Add Data Preprocessing / Data Augmentation Functions or Operators](#2-add-data-preprocessing--data-augmentation-functions-or-operators)
 
 - [Add Remote Sensing Image Processing Tools](#3-add-remote-sensing-image-processing-tools)
 
@@ -24,16 +24,16 @@ First, find the subdirectory (package) corresponding to the task in `paddlers/rs
 
 Create a new file in the subdirectory and name it `{model name lowercase}.py`. Write the complete model definition in the file.
 
-The new model must be a subclass of `paddle.nn.Layer`. For the tasks of image segmentation, object detection, scene classification and image restoration, relevant specifications formulated in development kit [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection),[PaddleClas](https://github.com/PaddlePaddle/PaddleClas), and [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN) should be followed respectively. **For change detection, scene classification and image segmentation tasks, the `num_classes` argument must be passed in the model construction to specify the number of output categories. For image restoration tasks, the `rs_factor` argument must be passed in during model construction to specify the super resolution scaling ratio (for non-super resolution models, this argument is set to `None`).** For the change detection task, the model definition should follow the same specification as the segmentation model, but with the following differences:
+The new model must be a subclass of `paddle.nn.Layer`. For the tasks of image segmentation, object detection, scene classification, and image restoration, relevant specifications formulated in the development kits [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), and [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN) should be followed respectively. **For change detection, scene classification, and image segmentation tasks, the `num_classes` argument must be passed in during model construction to specify the number of output categories. For image restoration tasks, the `rs_factor` argument must be passed in during model construction to specify the super-resolution scaling factor (for non-super-resolution models, this argument is set to `None`).** For the change detection task, the model definition should follow the same specifications as the segmentation model, but with the following differences:
 
-- The `forward()` method accepts three input parameters, namely `self`, `t1` and `t2`, where `t1` and `t2` represent the input image of the first and second two phases respectively.
-- For a multi-task change detection model (for example, the model outputs both change detection results and building extraction results of two phases), the class attribute `USE_MULTITASK_DECODER` needs to be specified as `True`. Also in the `OUT_TYPES` attribute set the label type for each element in the list of model forward output. Refer to the definition of `ChangeStar` model.
+- The `forward()` method accepts three input parameters, namely `self`, `t1`, and `t2`, where `t1` and `t2` represent the input images of the first and second temporal phases, respectively.
+- For a multi-task change detection model (for example, a model that outputs both the change detection results and the building extraction results of the two temporal phases), the class attribute `USE_MULTITASK_DECODER` needs to be set to `True`. Also, set the `OUT_TYPES` attribute to the label types of the elements in the model's forward output list. See the definition of the `ChangeStar` model as an example.
 
-Note that if a common component exists in a subdirectory. For example, contents in `paddlers/rs_models/cd/layers`, `paddlers/rs_models/cd/backbones` and `paddlers/rs_models/seg/layers` should be reused as much as possible.
+Note that if common components exist in a subdirectory (e.g., contents in `paddlers/rs_models/cd/layers`, `paddlers/rs_models/cd/backbones` and `paddlers/rs_models/seg/layers`), they should be reused as much as possible.
 
 ### 1.2 Add Docstrings
 
-You have to add a docstring to the new model, with the original references and links in it (you don't have to be strict about the reference format, but you want to be as consistent as possible with the other models you already have for the task). For detailed annotation specifications, refer to the [Code Annotation Specification](docstring_en.md). An example is as follows:
+You have to add a docstring to the new model, with the references and links to the original paper (you don't have to be strict about the reference format, but consistency between different models of the same task is encouraged). For detailed annotation specifications, please refer to the [Code Annotation Specifications](docstring_en.md). An example is as follows:
 
 ```python
 """
@@ -61,35 +61,35 @@ Args:
 
 Please follow these steps:
 
-1. In `paddlers/rs_models/{task subdirectories}`'s `__init__.py`, add `from ... import`, you can refer existing examples in the document.
+1. In `__init__.py` of `paddlers/rs_models/{task subdirectories}`, add `from ... import`.
 
 2. Locate the trainer definition file corresponding to the task in the `paddlers/tasks` directory (for example, the change detection task corresponds to `paddlers/tasks/change_detector.py`).
 
-3. Appends the new trainer definition to the end of the file. The trainer inherits from the related base class (such as `BaseChangeDetector`), overriding the `__init__()` method, and overriding other methods as needed. The trainer's `__init__()` method is written with the following requirements:
-    - For tasks such as change detection, scene classification, object detection and image segmentation, the first input parameter of `__init__()` method is `num_classes`, which represents the number of model output classes. For the tasks of change detection, scene classification and image segmentation, the second input parameter is `use_mixed_loss`, indicating whether the user uses the default definition of mixing loss. The third input parameter is `losses`, which represents the loss function used in training. For the image restoration task, the first parameter is `losses`, meaning the same as above; the second parameter is `rs_factor`, which represents the super resolution scaling ratio; the third parameter is `min_max`, which represents the numeric range of the input and output images.
-    - All input parameters of `__init__()` must have default values, and **in this case, the model receives 3-channel RGB input.**
+3. Append the new trainer definition to the end of the file. The trainer inherits from the related base class (such as `BaseChangeDetector`). Override `__init__()` and other methods according to your needs. The trainer's `__init__()` method must meet the following requirements:
+    - For tasks such as change detection, scene classification, object detection, and image segmentation, the first input parameter of the `__init__()` method is `num_classes`, which represents the number of model output classes. For the tasks of change detection, scene classification, and image segmentation, the second input parameter is `use_mixed_loss`, indicating whether to use the default mixed loss. The third input parameter is `losses`, which represents the loss functions used in training. For the image restoration task, the first parameter is `losses`, with the same meaning as above; the second parameter is `rs_factor`, which represents the super-resolution scaling factor; the third parameter is `min_max`, which represents the numeric range of the input and output images.
+    - All input parameters of `__init__()` must have default values, and **in the default case, the model receives 3-channel RGB input.**
     - In `__init__()` you need to update the `params` dictionary, whose key-value pairs will be used as input parameters during model construction.
 
 4. Add the class name of the new trainer to the global variable `__all__`.
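+
+Below is a hedged sketch of a minimal change detection trainer for a hypothetical model `FooNet`, written under the assumption that the base class accepts `model_name` and the other arguments the way the existing trainers in `paddlers/tasks/change_detector.py` do:
+
+```python
+class FooNet(BaseChangeDetector):
+    def __init__(self,
+                 num_classes=2,
+                 use_mixed_loss=False,
+                 losses=None,
+                 in_channels=3,
+                 **params):
+        # All parameters have default values; in the default case the
+        # model receives 3-channel RGB input.
+        params.update({'in_channels': in_channels})
+        super(FooNet, self).__init__(
+            model_name='FooNet',
+            num_classes=num_classes,
+            use_mixed_loss=use_mixed_loss,
+            losses=losses,
+            **params)
+```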
 
-It should be noted that for the image restoration task, the forward and backward logic of the model are implemented in the trainer definition. For GAN and other models that need to use multiple networks, please refer to the following specifications for the preparation of the trainer:
-- Override the `build_net()` method to maintain all networks using the `GANAdapter`. The `GANAdapter` object takes two lists as input when it is constructed: the first list contains all generators, where the first element is the main generator; the second list contains all discriminators.
+It should be noted that for the image restoration task, the forward and backward logic of the model is implemented in the trainer definition. For GANs and other models that need to use multiple networks, please follow the specifications below when writing the trainer:
+- Override the `build_net()` method to maintain all networks using `GANAdapter`. A `GANAdapter` object takes two lists as input when constructed: the first list contains all generators, where the first element is the main generator; the second list contains all discriminators.
 - Override the `default_loss()` method to build the loss function. If more than one loss function is required in the training process, it is recommended to organize them in the form of a dictionary.
-- Override the `default_optimizer()` method to build one or more optimizers. When `build_net()` returns a value of type `GANAdapter`, `parameters` is a dictionary. Where, `parameters['params_g']` is a list containing the state dict of the various generators in order; `parameters['params_d']` is a list that contains the state dict of the individual discriminators in order. If you build more than one optimizer, you should use the `OptimizerAdapter` wrapper on return.
-- Override the `run_gan()` method that accepts four parameters: `net`, `inputs`, `mode`, and `gan_mode` for one of the subtasks in the training process, e.g. forward calculation of generator, forward calculation of discriminator, etc.
-- Rewrite `train_step()` method to write the specific logic of one iteration during model training. The usual approach is to call `run_gan()` multiple times, constructing different `inputs` to work in different `gan_mode` as needed each time, extracting useful fields (e.g. losses) from the `outputs` dictionary returned each time and summarizing them into the final result.
+- Override the `default_optimizer()` method to build one or more optimizers. When `build_net()` returns a value of type `GANAdapter`, `parameters` is a dictionary, where `parameters['params_g']` is a list containing the state dicts of the various generators in order; `parameters['params_d']` is a list that contains the state dicts of the individual discriminators in order. If you build more than one optimizer, you should use the `OptimizerAdapter` wrapper on return.
+- Override the `run_gan()` method, which accepts four parameters (`net`, `inputs`, `mode`, and `gan_mode`) and performs one of the subtasks of the training process, e.g. the forward pass of a generator or of a discriminator.
+- Override the `train_step()` method to define how a single training step goes. Usually, in a training step, we call `run_gan()` multiple times with different `inputs` and `gan_mode` values, extract useful fields (e.g. losses) from the `outputs` dictionary returned each time, and summarize them into the final result.
 
-See `ESRGAN` for specific examples of GAN trainers.
+See `ESRGAN` for a specific example of GAN trainers.
 
-## 2 Add Data Preprocessing/Data Augmentation Functions or Operators
+## 2 Add Data Preprocessing / Data Augmentation Functions or Operators
 
-### 2.1 Add Data Preprocessing/Data Augmentation Functions
+### 2.1 Add Data Preprocessing / Data Augmentation Functions
 
 Define new functions in `paddlers/transforms/functions.py`. If a function needs to be exposed and made available to users, you must add a docstring to it.
 
-### 2.2 Add Data Preprocessing/Data Augmentation Operators
+### 2.2 Add Data Preprocessing / Data Augmentation Operators
 
-Define new operators in `paddlers/transforms/operators.py`, all operators inherit from `paddlers.transforms.Transform`. The operator's `apply()` method receives a dictionary `sample` as input, takes out the related objects stored in it, and makes in-place modifications to the dictionary after processing, and finally returns the modified dictionary. Only in rare cases do we need to override the `apply()` method when defining an operator. In most cases, you just need to override the `apply_im()`, `apply_mask()`, `apply_bbox()`, and `apply_segm()` methods to handle the image, split label, target box, and target polygon, respectively.
+Define new operators in `paddlers/transforms/operators.py`. All operators inherit from `paddlers.transforms.Transform`. The operator's `apply()` method receives a dictionary `sample` as input, fetches objects stored in it, makes in-place modifications to the dictionary after processing, and finally returns the modified dictionary. Only in rare cases do you need to override the `apply()` method when defining an operator. In most cases, you just need to override the `apply_im()`, `apply_mask()`, `apply_bbox()`, and `apply_segm()` methods to handle the images, segmentation labels, bounding boxes, and target polygons, respectively.
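+
+As a hedged illustration, a minimal operator that only touches the image might look like the following (the operator name and behavior are made up for this example):
+
+```python
+class ReverseBands(Transform):
+    """Reverse the band order of the input image(s)."""
+
+    def apply_im(self, image):
+        # `image` is an H x W x C array; flip it along the channel axis.
+        return image[..., ::-1]
+```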
 
 If the operator has a complicated implementation, it is recommended to define functions in `paddlers/transforms/functions.py` and call them in `apply*()` of operators.
 
@@ -97,6 +97,6 @@ After writing the implementation of the operator, **you must write docstring and
 
 ## 3 Add Remote Sensing Image Processing Tools
 
-Remote sensing image processing tools are stored in the `tools/` directory. Each tool should be a relatively independent script, independent of the contents of the `paddlers/` directory, which can be executed by the user without installing PaddleRS.
+Remote sensing image processing tools are stored in the `tools/` directory. Each tool should be a relatively independent script, independent of the contents in the `paddlers/` directory, which can be executed by the user without installing PaddleRS.
 
-When writing the script, use the Python standard library `argparse` to process the command-line arguments entered by the user and execute the specific logic in the `if __name__ == '__main__':` code block. If you have multiple tools that use the same function or class, define these common components in `tools/utils`.
+When writing the script, use the Python standard library `argparse` to process the command-line arguments, and execute the specific logic inside an `if __name__ == '__main__':` block. If multiple tools use the same function or class, please define these common components in `tools/utils`.
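+
+A minimal sketch of such a script (the tool name and arguments are hypothetical):
+
+```python
+import argparse
+
+
+def convert(src_path, dst_path):
+    # Tool-specific logic goes here; keep it independent of paddlers/.
+    print("Converting {} to {}".format(src_path, dst_path))
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description="A hypothetical conversion tool.")
+    parser.add_argument('--src_path', type=str, required=True)
+    parser.add_argument('--dst_path', type=str, required=True)
+    args = parser.parse_args()
+    convert(args.src_path, args.dst_path)
+```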

+ 1 - 1
docs/dev/docstring_cn.md

@@ -27,7 +27,7 @@
 The goal is to let users understand it quickly. This part can be decomposed into 3 sections: functional description + formulae + notes.
 
 - Functional description: Describe the specific function of the function or class. Since users do not necessarily have the relevant background knowledge, necessary details need to be added.
-- (Optional) Formulae: If necessary, give the calculation formulae of the function. Formulae are suggested to be written in LaTex syntax.
+- (Optional) Formulae: If necessary, give the calculation formulae of the function. Formulae are suggested to be written in LaTeX syntax.
 - (Optional) Notes: If special instructions are needed, they can be given in this section.
 
 Example:

+ 34 - 34
docs/dev/docstring_en.md

@@ -1,34 +1,34 @@
 [简体中文](docstring_cn.md) | English
 
-# PaddleRS Specification for Code Annotation
+# PaddleRS Specifications for Code Annotation
 
-## 1 Specification for Docstrings
+## 1 Specifications for Docstrings
 
-The docstring of the function consists of five modules:
+The docstring of the function consists of five parts:
 
 - Function description;
-- Function parameter;
-- (optional) Function return value;
-- (optional) exceptions that the function may throw;
-- (Optional) Use an example.
+- Function parameters;
+- (Optional) Function return value;
+- (Optional) Exceptions that the function may throw;
+- (Optional) Usage example.
 
-Class docstring also consists of five modules:
+The docstring of a class also consists of five parts:
 
-- Class function description;
-- Instantiate parameters required by the class;
-- (optional) The object obtained by instantiating the class;
-- (optional) Exceptions that may be thrown when the class is instantiated;
-- (Optional) Use an example.
+- Class description;
+- Construction parameters of the class;
+- (Optional) Object obtained by instantiating the class;
+- (Optional) Exceptions that may be thrown during class instantiation;
+- (Optional) Usage example.
 
-The specifications for each module are described in detail below.
+The specification for each part is described in detail below.
 
-### 1.1 Description on Functionality of Function/Class Function
+### 1.1 Description of the Functionality of a Function/Class
 
-The goal is for the user to understand it quickly. The module can be disassembled into 3 parts, function description + calculation formula + annotation.
+The goal is to let users understand quickly. This part can be decomposed into 3 sections: major description + formulae + notes.
 
-- Function Description: Describes the specific functions of the function or class. Since the user does not necessarily have the background knowledge, the necessary details need to be added.
-- (Optional) Calculation formula: If necessary, provide the calculation formula of the function. Formulae are suggested to be written in LaTex grammar.
-- (Optional) Note: If special instructions are required, they can be given in this section.
+- Major description: Describe what the function/class is and what it does. Necessary details should be added for users who do not have the relevant background knowledge.
+- (Optional) Formulae: If necessary, provide the mathematical formulae of the function. We suggest writing the formulae in LaTeX syntax.
+- (Optional) Notes: If special instructions are required, they can be given in this section.
 
 Example:
 
@@ -43,18 +43,18 @@ Example:
 """
 ```
 
-### 1.2 Function Arguments/Class Construction Arguments
+### 1.2 Function Parameters / Class Construction Parameters
 
-Explain clearly the **type**, **meaning** and **default value** (if any) for each parameter.
+Explain clearly the **type**, **meaning**, and **default value** (if any) for each parameter.
 
 Note:
 
-- optional parameters to note `optional`, for example: `name (str|None, optinoal)`;
-- If a parameter has a variety of optional type, use `|` to separate;
+- Optional parameters should be marked `optional`, for example: `name (str|None, optional)`.
+- If a parameter has a variety of optional types, use `|` to separate.
 - A space should be left between the parameter name and the type.
-- A list or tuple containing an object of a certain type can be represented by `list[{type name}]` and `tuple[{type name}]`. For example, `list[int]` represents a list containing an element of type int. `tuple[int|float]` equivalent to `tuple[int]| tuple[float]`;
-- When using the description of `list[{type name}]` and `tuple[{type name}]`, the default assumption is that the list or tuple parameters are homogeneous (that is, all elements contained in the list or tuple have the same type). If the list or tuple parameters are heterogeneous, it needs to be explained in the literal description.
-- If the separated type is a simple type such as `int`, `Tensor`, etc., there is no need to add a space before and after the `|`. However, if it is multiple complex types such as `list[int]` and `tuple[float]`, a space should be added before and after the `|`.
+- A list or tuple containing objects of a certain type can be represented by `list[{type name}]` and `tuple[{type name}]`. For example, `list[int]` represents a list containing elements of type `int`. `tuple[int|float]` is equivalent to `tuple[int] | tuple[float]`.
+- When using the description of `list[{type name}]` and `tuple[{type name}]`, a default assumption is that the list or tuple parameters are homogeneous (that is, all elements contained in the list or tuple have the same type). If the list or tuple parameters are heterogeneous, it needs to be explained in the literal description.
+- If the separated type is a simple type such as `int`, `Tensor`, etc., there is no need to add a space before and after the `|`. However, if there are multiple complex types such as `list[int]` and `tuple[float]`, a space should be added before and after the `|`.
 - For parameters that have a default value, please explain why we use that default value, not just what the parameter is and what the default value is.
 
 Example:
@@ -69,7 +69,7 @@ Example:
 """
 ```
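+
+For instance, an `Args:` section that follows all of the above conventions might read (the parameter names are hypothetical):
+
+```python
+"""
+Args:
+    image (numpy.ndarray): Input image.
+    sizes (list[int] | tuple[int]): Candidate output sizes.
+    name (str|None, optional): Name of the output file. Defaults to None,
+        meaning that the name is generated automatically.
+"""
+```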
 
-### 1.3 Return Value/Construct Object
+### 1.3 Return Value/Object
 
 For a function return value, first describe the type of the return value (surrounded by `()`, with the same syntax as the parameter type description), and then explain the meaning of the return value. There is no need to specify the type of the object obtained by instantiating the class.
 
@@ -91,7 +91,7 @@ Example 2:
 """
 ```
 
-Example 3 (In the class definition):
+Example 3 (in class definition):
 
 ```python
 """
@@ -120,7 +120,7 @@ Provide as many examples as possible for various usage scenarios of the function
 
 Requirement: Users can run the script by copying the sample code. Note that the necessary `import` statements need to be added.
 
-Single example:
+Example with a single usage example:
 
 ```python
 """
@@ -145,7 +145,7 @@ Single example:
 """
 ```
 
-Multi-examples:
+Example with multiple usage examples:
 
 ```python
 """
@@ -188,16 +188,16 @@ Multi-examples:
 
 - Wording should be accurate, using vocabulary and expressions common in deep learning.
 - The sentences should be smooth and in line with English grammar.
-- The document should be consistent in the expression of the same thing, for example, avoid using label sometimes and ground truth sometimes.
+- The document should be consistent in the expression of the same thing. For example, avoid using *label* and *ground-truth* to refer to the same thing.
 
 ### 1.7 Other Points to Note
 
-- Different modules are separated by **1** blank lines.
+- Different parts are separated by **1** blank line.
 - Pay attention to capitalization and punctuation rules in accordance with English grammar.
 - Blank lines can be placed appropriately in the content of the code sample for a sense of hierarchy.
 - For the **input parameter name**, **the property or method of the input parameter**, and the **file path** that appear in the comment, surround it with \`.
-- Line breaks and indentation are required between each module's title/subtitle and the concrete content, and **1** blank lines should be inserted between the `Examples:` title and the sample code content.
-- Suspension indentation is required when a single paragraph description spans lines.
+- Line breaks and indentation are required between each part's title/subtitle and the content, and **1** blank line should be inserted between the `Examples:` title and the sample code.
+- Hanging indentation is required when a single paragraph description spans lines.
 
 ## 2 Complete Docstring Example
 

+ 132 - 124
docs/intro/model_cons_params_cn.md

@@ -2,13 +2,13 @@
 
 # PaddleRS模型构造参数
 
-本文档详细介绍PaddleRS各模型训练器的构造参数,包括其参数名、参数类型、参数描述及默认值。
+本文档介绍PaddleRS各模型训练器的构造参数,包括其参数名、参数类型、参数描述及默认值。
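+
+以下为一个简单的构造示意(其中的导入方式参考PaddleRS教程中的常见用法,具体可用的训练器名称与导入路径请以`paddlers.tasks`实际代码为准):
+
+```python
+import paddlers as pdrs
+
+# 构造BIT变化检测训练器,覆盖下文表格中列出的部分默认值
+model = pdrs.tasks.cd.BIT(in_channels=3, num_classes=2, backbone='resnet18')
+```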
 
 ## `BIT`
 
 基于PaddlePaddle实现的BIT模型。
 
-该模型的原始文章见于 H. Chen, et al., "Remote Sensing Image Change Detection With Transformers" (https://arxiv.org/abs/2103.00208)
+该模型的原始文章见于 H. Chen, et al., "Remote Sensing Image Change Detection With Transformers" (https://arxiv.org/abs/2103.00208).
 
 该实现采用预训练编码器,而非原始工作中随机初始化权重。
 
@@ -16,29 +16,28 @@
 |-------------------|------------------------------------------------------------------------|--------------|
 | `in_channels (int)` | 输入图像的通道数                                                               | `3`          |
 | `num_classes (int)`  | 目标类别数量                                                                 | `2`           |
-| `use_mixed_loss (bool, optional)` | 是否使用混合损失函数                                                             | `False`      |
-| `losses (list, optional)` | 损失函数列表                                                                 | `None`       |
-| `att_type (str, optional)` | 空间注意力类型,可选值为`'CBAM'`和`'BAM'`                                           | `'CBAM'`     |
-| `ds_factor (int, optional)` | 下采样因子                                                                  | `1`          |
-| `backbone (str, optional)` | 用作主干网络的 ResNet 型号。目前仅支持`'resnet18'`和`'resnet34'`                       | `'resnet18'` |
-| `n_stages (int, optional)` | 主干网络中使用的 ResNet 阶段数,应为`{3、4、5}`中的值                                     | `4`          |
-| `use_tokenizer (bool, optional)` | 是否使用可学习的 tokenizer                                                     | `True`       |
-| `token_len (int, optional)` | 输入 token 的长度                                                           | `4`          |
-| `pool_mode (str, optional)` | 当`'use_tokenizer'`设置为`False`时,获取输入 token 的池化策略。`'max'`表示全局最大池化,`'avg'`表示全局平均池化 | `'max'`      |
-| `pool_size (int, optional)` | 当`'use_tokenizer'`设置为`False`时,池化后的特征图的高度和宽度                             | `2`          |
-| `enc_with_pos (bool, optional)` | 是否将学习的位置嵌入到编码器的输入特征序列中                                                 | `True`       |
-| `enc_depth (int, optional)` | 编码器中使用的注意力块数                                                           | `1`          |
-| `enc_head_dim (int, optional)` | 每个编码器头的嵌入维度                                                            | `64`         |
-| `dec_depth (int, optional)` | 解码器中使用的注意力模块数量                                                         | `8`          |
-| `dec_head_dim (int, optional)` | 每个解码器头的嵌入维度                                                            | `8`          |
-
+| `use_mixed_loss (bool)` | 是否使用混合损失函数                                                             | `False`      |
+| `losses (list)` | 损失函数列表                                                                 | `None`       |
+| `att_type (str)` | 空间注意力类型,可选值为`'CBAM'`和`'BAM'`                                           | `'CBAM'`     |
+| `ds_factor (int)` | 下采样因子                                                                  | `1`          |
+| `backbone (str)` | 用作主干网络的 ResNet 型号。目前仅支持`'resnet18'`和`'resnet34'`                       | `'resnet18'` |
+| `n_stages (int)` | 主干网络中使用的 ResNet 阶段数,应为`{3、4、5}`中的值                                     | `4`          |
+| `use_tokenizer (bool)` | 是否使用可学习的 tokenizer                                                     | `True`       |
+| `token_len (int)` | 输入 token 的长度                                                           | `4`          |
+| `pool_mode (str)` | 当`'use_tokenizer'`设置为`False`时,获取输入 token 的池化策略。`'max'`表示全局最大池化,`'avg'`表示全局平均池化 | `'max'`      |
+| `pool_size (int)` | 当`'use_tokenizer'`设置为`False`时,池化后的特征图的高度和宽度                             | `2`          |
+| `enc_with_pos (bool)` | 是否将学习的位置嵌入到编码器的输入特征序列中                                                 | `True`       |
+| `enc_depth (int)` | 编码器中使用的注意力块数                                                           | `1`          |
+| `enc_head_dim (int)` | 每个编码器头的嵌入维度                                                            | `64`         |
+| `dec_depth (int)` | 解码器中使用的注意力模块数量                                                         | `8`          |
+| `dec_head_dim (int)` | 每个解码器头的嵌入维度                                                            | `8`          |
 
 
 ## `CDNet`
 
 基于PaddlePaddle的CDNet实现。
 
-该模型的原始文章见于 Pablo F. Alcantarilla, et al., "Street-View Change Detection with Deconvolut ional Networks"(https://link.springer.com/article/10.1007/s10514-018-9734-5)
+该模型的原始文章见于 Pablo F. Alcantarilla, et al., "Street-View Change Detection with Deconvolutional Networks"(https://link.springer.com/article/10.1007/s10514-018-9734-5).
 
 | 参数名                     | 描述                              | 默认值     |
 |-------------------------| --------------------------------- | ---------- |
@@ -52,7 +51,7 @@
 
 基于PaddlePaddle的ChangeFormer实现。
 
-该模型的原始文章见于 Wele Gedara Chaminda Bandara,Vishal M. Patel,“A TRANSFORMER-BASED SIAMESE NETWORK FOR CHANGE DETECTION”(https://arxiv.org/pdf/2201.01293.pdf)
+该模型的原始文章见于 Wele Gedara Chaminda Bandara, Vishal M. Patel, “A TRANSFORMER-BASED SIAMESE NETWORK FOR CHANGE DETECTION”(https://arxiv.org/pdf/2201.01293.pdf).
 
 | 参数名                         | 描述                        | 默认值       |
 |--------------------------------|---------------------------|--------------|
@@ -64,11 +63,11 @@
 | `embed_dim (int)`              | Transformer 编码器的隐藏层维度     | `256`        |
 
 
-## `ChangeStar_FarSeg`
+## `ChangeStar`
 
 基于PaddlePaddle实现的ChangeStar模型,其使用FarSeg编码器。
 
-该模型的原始文章见于 Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery"(https://arxiv.org/abs/2108.07002)
+该模型的原始文章见于 Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery"(https://arxiv.org/abs/2108.07002).
 
 | 参数名                     | 描述                                | 默认值      |
 |-------------------------|-----------------------------------|-------------|
@@ -78,16 +77,14 @@
 | `mid_channels (int)`    | UNet 中间层的通道数                      | `256`       |
 | `inner_channels (int)`  | 注意力模块内部的通道数                       | `16`        |
 | `num_convs (int)`       | UNet 编码器和解码器中卷积层的数量               | `4`         |
-| `scale_factor (float)`  | 上采样因子,用于将低分辨率掩码图像恢复到高分辨率图像大小的放大倍数 | `4.0`       |
-
-
+| `scale_factor (float)`  | 上采样因子,将低分辨率掩码图像恢复到高分辨率图像大小的放大倍数 | `4.0`       |
 
 
 ## `DSAMNet`
 
 基于PaddlePaddle实现的DSAMNet,用于遥感变化检测。
 
-该模型的原始文章见于 Q. Shi, et al., "A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection"(https://ieeexplore.ieee.org/document/9467555)
+该模型的原始文章见于 Q. Shi, et al., "A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection"(https://ieeexplore.ieee.org/document/9467555).
 
 | 参数名                     | 描述                         | 默认值 |
 |-------------------------|----------------------------|--------|
@@ -98,11 +95,12 @@
 | `ca_ratio (int)`        | 通道注意力模块中的通道压缩比 | `8`    |
 | `sa_kernel (int)`       | 空间注意力模块中的卷积核大小 | `7`    |
 
+
 ## `DSIFN`
 
 基于PaddlePaddle的DSIFN实现。
 
-该模型的原始文章见于 The original article refers to C. Zhang, et al., "A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images"(https://www.sciencedirect.com/science/article/pii/S0924271620301532)
+该模型的原始文章见于 C. Zhang, et al., "A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images"(https://www.sciencedirect.com/science/article/pii/S0924271620301532).
 
 | 参数名                   | 描述                   | 默认值 |
 |-------------------------|----------------------|--------|
@@ -111,11 +109,12 @@
 | `losses (list)`          | 损失函数列表             | `None` |
 | `use_dropout (bool)`     | 是否使用 dropout        | `False`|
 
-## `FC-EF`
+
+## `FCEarlyFusion`
 
 基于PaddlePaddle的FC-EF实现。
 
-该模型的原始文章见于 The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462)
+该模型的原始文章见于 Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462).
 
 | 参数名                     | 描述                          | 默认值 |
 |-------------------------|-------------------------------|--------|
@@ -126,12 +125,11 @@
 | `use_dropout (bool)`    | 是否使用 dropout              | `False`|
 
 
-
-## `FC-Siam-conc`
+## `FCSiamConc`
 
 基于PaddlePaddle的FC-Siam-conc实现。
 
-该模型的原始文章见于 Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462)
+该模型的原始文章见于 Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462).
 
 | 参数名                     | 描述                          | 默认值 |
 |-------------------------|-------------------------------|--------|
@@ -142,11 +140,11 @@
 | `use_dropout (bool)`    | 是否使用 dropout               | `False`|
 
 
-## `FC-Siam-diff`
+## `FCSiamDiff`
 
 基于PaddlePaddle的FC-Siam-diff实现。
 
-该模型的原始文章见于 Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462)
+该模型的原始文章见于 Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462).
 
 | 参数名 | 描述          | 默认值 |
 | --- |-------------|  --- |
@@ -161,7 +159,7 @@
 
 基于PaddlePaddle的FCCDN实现。
 
-该模型的原始文章见于 Pan Chen, et al., "FCCDN: Feature Constraint Network for VHR Image Change Detection"(https://arxiv.org/pdf/2105.10860.pdf)
+该模型的原始文章见于 Pan Chen, et al., "FCCDN: Feature Constraint Network for VHR Image Change Detection"(https://arxiv.org/pdf/2105.10860.pdf).
 
 | 参数名                    | 描述         | 默认值 |
 |--------------------------|------------|--------|
@@ -171,11 +169,11 @@
 | `losses (list)`          | 损失函数列表     | `None` |
 
 
-## `P2V-CD`
+## `P2V`
 
 基于PaddlePaddle的P2V-CD实现。
 
-该模型的原始文章见于 M. Lin, et al. "Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images"(https://ieeexplore.ieee.org/document/9975266)
+该模型的原始文章见于 M. Lin, et al. "Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images"(https://ieeexplore.ieee.org/document/9975266).
 
 | 参数名                     | 描述         | 默认值 |
 |-------------------------|------------|--------|
@@ -184,11 +182,13 @@
 | `losses (list)`         | 损失函数列表     | `None` |
 | `in_channels (int)`     | 输入图像的通道数   | `3`    |
 | `video_len (int)`       | 输入视频帧的数量   | `8`    |
+
+
 ## `SNUNet`
 
 基于PaddlePaddle的SNUNet实现。
 
-该模型的原始文章见于 S. Fang, et al., "SNUNet-CD: A Densely Connected Siamese Network for Change Detection of VHR Images" (https://ieeexplore.ieee.org/document/9355573)
+该模型的原始文章见于 S. Fang, et al., "SNUNet-CD: A Densely Connected Siamese Network for Change Detection of VHR Images" (https://ieeexplore.ieee.org/document/9355573).
 
 | 参数名                     | 描述         | 默认值 |
 |-------------------------|------------| --- |
@@ -196,14 +196,14 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数 | `False` |
 | `losses (list)`         | 损失函数列表     | `None` |
 | `in_channels (int)`     | 输入图像的通道数   | `3` |
-| `width (int)`           | 神经网络中的通道数  | `32` |
+| `width (int)`           | 网络中间层特征通道数  | `32` |
 
 
 ## `STANet`
 
 基于PaddlePaddle的STANet实现。
 
-该模型的原始文章见于 H. Chen and Z. Shi, "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection"(https://www.mdpi.com/2072-4292/12/10/1662)
+该模型的原始文章见于 H. Chen and Z. Shi, "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection"(https://www.mdpi.com/2072-4292/12/10/1662).
 
 | 参数名                     | 描述                                                 | 默认值 |
 |-------------------------|----------------------------------------------------| --- |
@@ -211,13 +211,15 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数                                         | `False` |
 | `losses (list)`         | 损失函数列表                                             | `None` |
 | `in_channels (int)`     | 输入图像的通道数                                           | `3` |
-| `att_type (str)`        | 注意力模块的类型,可以是`'BAM'`(波段注意力模块) `'CBAM'`(通道和波段注意力模块) | `'BAM'` |
+| `att_type (str)`        | 注意力模块的类型,可以是`'BAM'`或`'CBAM'` | `'BAM'` |
 | `ds_factor (int)`       | 下采样因子,可以是`1`、`2`或`4`                               | `1` |
+
+
 ## `CondenseNetV2`
 
 基于PaddlePaddle的CondenseNetV2实现。
 
-该模型的原始文章见于Yang L, Jiang H, Cai R, et al. “Condensenet v2: Sparse feature reactivation for deep networks” (https://arxiv.org/abs/2104.04382)
+该模型的原始文章见于 Yang L, Jiang H, Cai R, et al. “Condensenet v2: Sparse feature reactivation for deep networks” (https://arxiv.org/abs/2104.04382).
 
 | 参数名                     | 描述                         | 默认值 |
 |-------------------------|----------------------------| --- |
@@ -225,7 +227,7 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数                 | `False` |
 | `losses (list)`         | 损失函数列表                     | `None` |
 | `in_channels (int)`     | 模型的输入通道数                   | `3` |
-| `arch (str)`            | 模型的架构,可以是`'A'`、`'B'`或`'C'` | `'A'` |
+| `arch (str)`            | 模型使用的具体架构,可以是`'A'`、`'B'`或`'C'` | `'A'` |
 
 
 ## `HRNet`
@@ -238,6 +240,7 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数 | `False` |
 | `losses (list)`         | 损失函数列表     | `None` |
 
+
 ## `MobileNetV3`
 
 基于PaddlePaddle的MobileNetV3实现。
@@ -249,7 +252,7 @@
 | `losses (list)`         | 损失函数列表     | `None` |
 
 
-## `ResNet50-vd`
+## `ResNet50_vd`
 
 基于PaddlePaddle的ResNet50-vd实现。
 
@@ -259,6 +262,7 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数 | `False` |
 | `losses (list)`         | 损失函数列表     | `None` |
 
+
 ## `DRN`
 
 基于PaddlePaddle的DRN实现。
@@ -266,16 +270,17 @@
 | 参数名                     | 描述                                                                                     | 默认值   |
 |-------------------------|----------------------------------------------------------------------------------------|-------|
 | `losses (list)`         | 损失函数列表                                                                                 | `None` |
-| `sr_factor (int)`       | 超分辨率的缩放因子,原始图像的大小将乘以此因子例如,如果原始图像为 `H` x `W`,则输出图像将为 `sr_factor * H` x `sr_factor * W` | `4`   |
-| `min_max (None \| tuple[float, float])` | 图像像素值的最小值和最大值                                                                          | `None` |
-| `scales (tuple[int])` | 缩放因子                                                                                   | `(2, 4)` |
+| `sr_factor (int)`       | 图像超分辨率重建的缩放因子。如果原始图像大小为 `H` x `W`,则输出图像大小将为 `sr_factor * H` x `sr_factor * W` | `4`   |
+| `min_max (None \| tuple[float, float])` | 图像像素值的最小值和最大值。如果未指定,则使用数据类型的默认最小值和最大值                                                                       | `None` |
+| `scales (tuple[int])` | 不同尺度的缩放因子                                                                                   | `(2, 4)` |
 | `n_blocks (int)`           | 残差块的数量                                                                                 | `30`  |
 | `n_feats (int)`            | 残差块中的特征维度                                                                              | `16`  |
 | `n_colors (int)`           | 图像通道数                                                                                  | `3`   |
 | `rgb_range (float)`        | 图像像素值的范围                                                                               | `1.0` |
 | `negval (float)`           | 用于激活函数中的负数值的处理                                                                         | `0.2` |
-| `lq_loss_weight (float)`   | 低质量图像损失的权重,用来控制将低分辨率的输入图像恢复成高分辨率的输出图像的重构损失对于总体损失的影响程度。                                            | `0.1` |
-| `dual_loss_weight (float)` | 双重损失的权重                                                                                | `0.1` |
+| `lq_loss_weight (float)`   | Primal regression loss 的权重                                           | `0.1` |
+| `dual_loss_weight (float)` | Dual regression loss 的权重                                                                                | `0.1` |
+
 
 ## `ESRGAN`
 
@@ -284,13 +289,14 @@
 | 参数名                  | 描述                                                                                     | 默认值 |
 |----------------------|----------------------------------------------------------------------------------------| --- |
 | `losses (list)`      | 损失函数列表                                                                                 | `None` |
-| `sr_factor (int)`    | 超分辨率的缩放因子,原始图像的大小将乘以此因子例如,如果原始图像为 `H` x `W`,则输出图像将为 `sr_factor * H` x `sr_factor * W` | `4` |
-| `min_max (tuple)`    | 输入图像的像素值的最小值和最大值。如果未指定,则使用数据类型的默认最小值和最大值                                              | `None` |
-| `use_gan (bool)`     | 布尔值,指示是否在训练过程中使用 GAN (生成对抗网络)。如果是,将使用 GAN。                                             | `True` |
+| `sr_factor (int)`    | 图像超分辨率重建的缩放因子。如果原始图像大小为 `H` x `W`,则输出图像大小将为 `sr_factor * H` x `sr_factor * W` | `4` |
+| `min_max (tuple)`    | 输入图像的像素值的最小值和最大值。如果未指定,则使用数据类型的默认最小值和最大值                                              | `None` |
+| `use_gan (bool)`     | 是否在训练过程中使用 GAN (生成对抗网络)                                         | `True` |
 | `in_channels (int)`  | 输入图像的通道数                                                                               | `3` |
-| `out_channels (int)` | 输出图像的通道数。                                                                        | `3` |
-| `nf (int)`           | 模型第一层卷积层的滤波器数量。                                                                        | `64` |
-| `nb (int)`           | 模型中残差块的数量。                                                                             | `23` |
+| `out_channels (int)` | 输出图像的通道数                                                                        | `3` |
+| `nf (int)`           | 模型第一层卷积层的滤波器数量                                                                        | `64` |
+| `nb (int)`           | 模型中残差块的数量                                                                             | `23` |
+
 
 ## `LESRCNN`
 
@@ -299,82 +305,84 @@
 | 参数名                  | 描述                                                                                      | 默认值 |
 |----------------------|-----------------------------------------------------------------------------------------| --- |
 | `losses (list)`      | 损失函数列表                                                                                  | `None` |
-| `sr_factor (int)`    | 超分辨率的缩放因子,原始图像的大小将乘以此因子。例如,如果原始图像为 `H` x `W`,则输出图像将为 `sr_factor * H` x `sr_factor * W`。 | `4` |
-| `min_max (tuple)`    | 输入图像的像素值的最小值和最大值。如果未指定,则使用数据类型的默认最小值和最大值。                                               | `None` |
-| `multi_scale (bool)` | 布尔值,指示是否在多个尺度下进行训练。如果是,则在训练过程中使用多个尺度。                                                   | `False` |
-| `group (int)`        | 控制卷积操作的组数。如果设置为 `1`,则为标准卷积;如果设置为输入通道数,则为 DWConv。                                        | `1` |
+| `sr_factor (int)`    | 图像超分辨率重建的缩放因子。如果原始图像大小为 `H` x `W`,则输出图像大小将为 `sr_factor * H` x `sr_factor * W` | `4` |
+| `min_max (tuple)`    | 输入图像的像素值的最小值和最大值。如果未指定,则使用数据类型的默认最小值和最大值                                               | `None` |
+| `multi_scale (bool)` | 是否在多个尺度下进行训练                                                 | `False` |
+| `group (int)`        | 卷积操作的分组数量                                        | `1` |
+
 
-## `Faster R-CNN`
+## `FasterRCNN`
 
 基于PaddlePaddle的Faster R-CNN实现。
 
 | 参数名                           | 描述                                                  | 默认值 |
 |-------------------------------|-----------------------------------------------------| --- |
 | `num_classes (int)`           | 目标类别数量                                              | `80` |
-| `backbone (str)`              | Faster R-CNN的主干网络                                   | `'ResNet50'` |
-| `with_fpn (bool)`             | 布尔值,指示是否使用特征金字塔网络 (FPN)。                            | `True` |
-| `with_dcn (bool)`             | 布尔值,指示是否使用 Deformable Convolutional Networks (DCN)。 | `False` |
-| `aspect_ratios (list)`        | 候选框的宽高比列表。                                          | `[0.5, 1.0, 2.0]` |
-| `anchor_sizes (list)`         | 候选框的大小列表,表示为每个特征图上的基本大小。                            | `[[32], [64], [128], [256], [512]]` |
-| `keep_top_k (int)`            | 在进行 NMS 操作之前,保留的预测框的数量。                             | `100` |
-| `nms_threshold (float)`       | 使用的非极大值抑制 (NMS) 阈值。                                 | `0.5` |
-| `score_threshold (float)`     | 过滤预测框的分数阈值。                                         | `0.05` |
-| `fpn_num_channels (int)`      | FPN 网络中每个金字塔层的通道数。                                  | `256` |
-| `rpn_batch_size_per_im (int)` | RPN 网络中每张图像的正负样本比例。                                 | `256` |
-| `rpn_fg_fraction (float)`     | RPN 网络中前景样本的比例。                                     | `0.5` |
-| `test_pre_nms_top_n (int)`    | 测试时,进行 NMS 操作之前保留的预测框的数量。如果未指定,则使用 `keep_top_k`。    | `None` |
-| `test_post_nms_top_n (int)`   | 测试时,进行 NMS 操作之后保留的预测框的数量。                           | `1000` |
-
-## `PP-YOLO`
+| `backbone (str)`              | 骨干网络名称                                   | `'ResNet50'` |
+| `with_fpn (bool)`             | 是否使用特征金字塔网络 (FPN)                            | `True` |
+| `with_dcn (bool)`             | 是否使用 Deformable Convolutional Networks (DCN) | `False` |
+| `aspect_ratios (list)`        | 候选框的宽高比列表                                          | `[0.5, 1.0, 2.0]` |
+| `anchor_sizes (list)`         | 候选框的大小列表,表示为每个特征图上的基本大小                            | `[[32], [64], [128], [256], [512]]` |
+| `keep_top_k (int)`            | 在进行非极大值抑制(NMS)操作之前,保留的预测框的数量                             | `100` |
+| `nms_threshold (float)`       | 使用的 NMS 阈值                                 | `0.5` |
+| `score_threshold (float)`     | 过滤预测框的分数阈值                                         | `0.05` |
+| `fpn_num_channels (int)`      | FPN 网络中每个金字塔层的通道数                                  | `256` |
+| `rpn_batch_size_per_im (int)` | RPN 网络中每张图像的正负样本比例                                 | `256` |
+| `rpn_fg_fraction (float)`     | RPN 网络中前景样本的比例                                     | `0.5` |
+| `test_pre_nms_top_n (int)`    | 测试时,进行 NMS 操作之前保留的预测框数量。如果未指定,则使用 `keep_top_k`    | `None` |
+| `test_post_nms_top_n (int)`   | 测试时,进行 NMS 操作之后保留的预测框数量                           | `1000` |
+
+
+## `PPYOLO`
 
 基于PaddlePaddle的PP-YOLO实现。
 
 | 参数名                              | 描述                  | 默认值 |
 |----------------------------------|---------------------| --- |
 | `num_classes (int)`              | 目标类别数量              | `80` |
-| `backbone (str)`                 | PPYOLO 的主干网络        | `'ResNet50_vd_dcn'` |
+| `backbone (str)`                 | 骨干网络名称        | `'ResNet50_vd_dcn'` |
 | `anchors (list[list[float]])`    | 预定义锚框的大小            | `None` |
 | `anchor_masks (list[list[int]])` | 预定义锚框的掩码            | `None` |
 | `use_coord_conv (bool)`          | 是否使用坐标卷积            | `True` |
 | `use_iou_aware (bool)`           | 是否使用 IoU 感知         | `True` |
 | `use_spp (bool)`                 | 是否使用空间金字塔池化(SPP)    | `True` |
-| `use_drop_block (bool)`          | 是否使用 DropBlock 正则化  | `True` |
+| `use_drop_block (bool)`          | 是否使用 DropBlock  | `True` |
 | `scale_x_y (float)`              | 对每个预测框进行缩放的参数       | `1.05` |
 | `ignore_threshold (float)`       | IoU 阈值,用于将预测框分配给真实框 | `0.7` |
 | `label_smooth (bool)`            | 是否使用标签平滑            | `False` |
-| `use_iou_loss (bool)`            | 是否使用 IoU Loss       | `True` |
+| `use_iou_loss (bool)`            | 是否使用 IoU loss       | `True` |
 | `use_matrix_nms (bool)`          | 是否使用 Matrix NMS     | `True` |
-| `nms_score_threshold (float)`    | NMS  的分数阈值          | `0.01` |
-| `nms_topk (int)`                 | 在执行 NMS 之前保留的最大检测数  | `-1` |
-| `nms_keep_topk (int)`            | NMS 后保留的最大预测框数     | `100`|
+| `nms_score_threshold (float)`    | NMS 的分数阈值          | `0.01` |
+| `nms_topk (int)`                 | 在执行 NMS 之前保留的最大预测框数  | `-1` |
+| `nms_keep_topk (int)`            | 在执行 NMS 后保留的最大预测框数     | `100`|
 | `nms_iou_threshold (float)`      | NMS IoU 阈值          | `0.45`  |
 
 
-## `PP-YOLO Tiny`
+## `PPYOLOTiny`
 
 基于PaddlePaddle的PP-YOLO Tiny实现。
 
 | 参数名                              | 描述                    | 默认值 |
 |----------------------------------|-----------------------| --- |
 | `num_classes (int)`              | 目标类别数量                | `80` |
-| `backbone (str)`                 | PP-YOLO Tiny的主干网络     | `'MobileNetV3'` |
-| `anchors (list[list[float]])`    | anchor box 大小列表       | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]]` |
-| `anchor_masks (list[list[int]])` | anchor box 掩码         | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
-| `use_iou_aware (bool)`           | 布尔值,指示是否使用 IoU-aware loss | `False` |
-| `use_spp (bool)`                 | 布尔值,指示是否使用 SPP 模块     | `True` |
-| `use_drop_block (bool)`          | 布尔值,指示是否使用 DropBlock 模块 | `True` |
-| `scale_x_y (float)`              | 缩放参数                  | `1.05` |
-| `ignore_threshold (float)`       | 忽略阈值                  | `0.5` |
-| `label_smooth (bool)`            | 布尔值,指示是否使用标签平滑        | `False` |
-| `use_iou_loss (bool)`            | 布尔值,指示是否使用 IoU Loss   | `True` |
-| `use_matrix_nms (bool)`          | 布尔值,指示是否使用 Matrix NMS | `False` |
-| `nms_score_threshold (float)`    | NMS 得分阈值              | `0.005` |
-| `nms_topk (int)`                 | NMS 操作前保留的边界框数        | `1000` |
-| `nms_keep_topk (int)`            | NMS 操作后保留的边界框数        | `100` |
+| `backbone (str)`                 | 骨干网络名称     | `'MobileNetV3'` |
+| `anchors (list[list[float]])`    | 预定义锚框的大小       | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]]` |
+| `anchor_masks (list[list[int]])` | 预定义锚框的掩码         | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
+| `use_iou_aware (bool)`           | 是否使用 IoU 感知 | `False` |
+| `use_spp (bool)`                 | 是否使用空间金字塔池化(SPP)     | `True` |
+| `use_drop_block (bool)`          | 是否使用 DropBlock | `True` |
+| `scale_x_y (float)`              | 对每个预测框进行缩放的参数                  | `1.05` |
+| `ignore_threshold (float)`       | IoU 阈值,用于将预测框分配给真实框                  | `0.5` |
+| `label_smooth (bool)`            | 是否使用标签平滑        | `False` |
+| `use_iou_loss (bool)`            | 是否使用 IoU loss   | `True` |
+| `use_matrix_nms (bool)`          | 是否使用 Matrix NMS | `False` |
+| `nms_score_threshold (float)`    | NMS 的分数阈值              | `0.005` |
+| `nms_topk (int)`                 | 在执行 NMS 之前保留的最大预测框数        | `1000` |
+| `nms_keep_topk (int)`            | 在执行 NMS 之后保留的最大预测框数        | `100` |
 | `nms_iou_threshold (float)`      | NMS IoU 阈值            | `0.45` |
 
 
-## `PP-YOLOv2`
+## `PPYOLOv2`
 
 基于PaddlePaddle的PP-YOLOv2实现。
 
@@ -382,22 +390,23 @@
 | 参数名                              | 描述                  | 默认值 |
 |----------------------------------|---------------------| --- |
 | `num_classes (int)`              | 目标类别数量              | `80` |
-| `backbone (str)`                 | PPYOLO 的骨干网络        | `'ResNet50_vd_dcn'` |
+| `backbone (str)`                 | 骨干网络名称        | `'ResNet50_vd_dcn'` |
 | `anchors (list[list[float]])`    | 预定义锚框的大小            | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
 | `anchor_masks (list[list[int]])` | 预定义锚框的掩码            | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
 | `use_iou_aware (bool)`           | 是否使用 IoU 感知         | `True` |
-| `use_spp (bool)`                 | 是否使用空间金字塔池化( SPP )  | `True` |
-| `use_drop_block (bool)`          | 是否使用 DropBlock 正则化  | `True` |
+| `use_spp (bool)`                 | 是否使用空间金字塔池化(SPP)  | `True` |
+| `use_drop_block (bool)`          | 是否使用 DropBlock  | `True` |
 | `scale_x_y (float)`              | 对每个预测框进行缩放的参数       | `1.05` |
 | `ignore_threshold (float)`       | IoU 阈值,用于将预测框分配给真实框 | `0.7` |
 | `label_smooth (bool)`            | 是否使用标签平滑            | `False` |
-| `use_iou_loss (bool)`            | 是否使用 IoU Loss       | `True` |
+| `use_iou_loss (bool)`            | 是否使用 IoU loss       | `True` |
 | `use_matrix_nms (bool)`          | 是否使用 Matrix NMS     | `True` |
 | `nms_score_threshold (float)`    | NMS 的分数阈值           | `0.01` |
-| `nms_topk (int)`                 | 在执行 NMS 之前保留的最大检测数  | `-1` |
-| `nms_keep_topk (int)`            | NMS 后保留的最大预测框数     | `100`|
+| `nms_topk (int)`                 | 在执行 NMS 之前保留的最大预测框数  | `-1` |
+| `nms_keep_topk (int)`            | 在执行 NMS 后保留的最大预测框数     | `100`|
 | `nms_iou_threshold (float)`      | NMS IoU 阈值          | `0.45`  |
 
+
 ## `YOLOv3`
 
 基于PaddlePaddle的YOLOv3实现。
@@ -405,17 +414,18 @@
 | 参数名 | 描述                            | 默认值 |
 | --- |-------------------------------| --- |
 | `num_classes (int)` | 目标类别数量                        | `80` |
-| `backbone (str)` | YOLOv3的主干网络的名称                | `'MobileNetV1'` |
-| `anchors (list[list[int]])` | 所有锚框的大小                       | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
-| `anchor_masks (list[list[int]])` | 使用哪些锚框来预测目标框                  | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
-| `ignore_threshold (float)` | 预测框和真实框的 IoU 阈值,低于该阈值将被视为背景   | `0.7` |
-| `nms_score_threshold (float)` | 非极大值抑制中,分数阈值,低于该分数的框将被丢弃      | `0.01` |
-| `nms_topk (int)` | 非极大值抑制中,保留的最大得分框数,为`-1`则保留所有框 | `1000` |
-| `nms_keep_topk (int)` | 非极大值抑制中,每个图像保留的最大框数           | `100` |
-| `nms_iou_threshold (float)` | 非极大值抑制中,IoU 阈值,大于该阈值的框将被丢弃    | `0.45` |
-| `label_smooth (bool)` | 是否在计算损失时使用标签平滑                | `False` |
-
-## `BiSeNet V2`
+| `backbone (str)` | 骨干网络名称                | `'MobileNetV1'` |
+| `anchors (list[list[int]])` | 预定义锚框的大小                       | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
+| `anchor_masks (list[list[int]])` | 预定义锚框的掩码                  | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
+| `ignore_threshold (float)` | IoU 阈值,用于将预测框分配给真实框   | `0.7` |
+| `nms_score_threshold (float)` | NMS 的分数阈值      | `0.01` |
+| `nms_topk (int)` | 在执行 NMS 之前保留的最大预测框数  | `1000` |
+| `nms_keep_topk (int)` | 在执行 NMS 之后保留的最大预测框数            | `100` |
+| `nms_iou_threshold (float)` | NMS IoU 阈值    | `0.45` |
+| `label_smooth (bool)` | 是否使用标签平滑                 | `False` |
+
+
+## `BiSeNetV2`
 
 基于PaddlePaddle的BiSeNet V2实现。
 
@@ -428,9 +438,7 @@
 | `align_corners (bool)`  | 是否使用角点对齐方法 | `False`  |
 
 
-
-
-## `DeepLab V3+`
+## `DeepLabV3P`
 
 基于PaddlePaddle的DeepLab V3+实现。
 
@@ -438,15 +446,16 @@
 |----------------------------|---------------------| --- |
 | `in_channels (int)`        | 输入图像的通道数            | `3` |
 | `num_classes (int)`        | 目标类别数量              | `2` |
-| `backbone (str)`           | DeepLab V3+的主干网络         | `ResNet50_vd` |
+| `backbone (str)`           | 骨干网络名称        | `'ResNet50_vd'` |
 | `use_mixed_loss (bool)`    | 是否使用混合损失函数          | `False` |
 | `losses (list)`            | 损失函数列表              | `None` |
 | `output_stride (int)`      | 输出特征图相对于输入特征图的下采样倍率 | `8` |
-| `backbone_indices (tuple)` | 输出主干网络不同阶段的位置索引     | `(0, 3)` |
+| `backbone_indices (tuple)` | 一个索引列表,用于取出骨干网络不同阶段的特征送入解码器     | `(0, 3)` |
 | `aspp_ratios (tuple)`      | 空洞卷积的扩张率            | `(1, 12, 24, 36)` |
 | `aspp_out_channels (int)`  | ASPP 模块输出通道数        | `256` |
 | `align_corners (bool)`     | 是否使用角点对齐方法          | `False` |
 
+
 ## `FactSeg`
 
 基于PaddlePaddle的FactSeg实现。
@@ -461,13 +470,13 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数                     | `False` |
 | `losses (list)`         | 损失函数列表                         | `None` |
 
+
 ## `FarSeg`
 
 基于PaddlePaddle的FarSeg实现。
 
 该模型的原始文章见于 Zheng Z, Zhong Y, Wang J, et al. Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 4096-4105.
 
-
 | 参数名                     | 描述                             | 默认值 |
 |-------------------------|--------------------------------| --- |
 | `in_channels (int)`     | 输入图像的通道数                       | `3` |
@@ -475,7 +484,8 @@
 | `use_mixed_loss (bool)` | 是否使用混合损失函数                     | `False` |
 | `losses (list)`         | 损失函数列表                         | `None` |
 
-## `Fast-SCNN`
+
+## `FastSCNN`
 
 基于PaddlePaddle的Fast-SCNN实现。
 
@@ -496,14 +506,12 @@
 |-------------------------|--------------------------------| --- |
 | `in_channels (int)`     | 输入图像的通道数                       | `3` |
 | `num_classes (int)`     | 目标类别数量                         | `2` |
-| `width (int)`           | 网络的初始通道数                       | `48` |
+| `width (int)`           | 网络的初始特征通道数                       | `48` |
 | `use_mixed_loss (bool)` | 是否使用混合损失函数                     | `False` |
 | `losses (list)`         | 损失函数列表                         | `None` |
 | `align_corners (bool)`  | 是否使用角点对齐方法                     | `False` |
 
 
-
-
 ## `UNet`
 
 基于PaddlePaddle的UNet实现。

+ 120 - 81
docs/intro/model_cons_params_en.md

@@ -2,7 +2,7 @@
 
 # PaddleRS Model Construction Parameters
 
-This document describes the construction parameters of each PaddleRS model trainer in detail, including their parameter names, parameter types, parameter descriptions and default values.
+This document describes the construction parameters of each PaddleRS model trainer, including their parameter names, parameter types, parameter descriptions, and default values.
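+
+A minimal construction sketch is given below. We assume the import style commonly used in PaddleRS tutorials; the exact trainer names and import paths should be checked against `paddlers.tasks`:
+
+```python
+import paddlers as pdrs
+
+# Build a BIT change detection trainer, overriding a few of the defaults
+# listed in the tables below.
+model = pdrs.tasks.cd.BIT(in_channels=3, num_classes=2, backbone='resnet18')
+```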
 
 ## `BIT`
 
@@ -16,21 +16,22 @@ This implementation adopts pretrained encoders, as opposed to the original work
 |-----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------|
 | `in_channels (int)`               | Number of channels of the input image                                                                                                          | `3` |
 | `num_classes (int)`               | Number of target classes                                                                                                                     | `2` |
-| `use_mixed_loss (bool, optional)` | Whether to use mixed loss function                                                                                                 | `False` |
-| `losses (list, optional)`         | List of loss functions                                                                                                                       | `None` |
-| `att_type (str, optional)`        | Spatial attention type, optional values are `'CBAM'` and `'BAM'`                                                                                 | `'CBAM'` |
-| `ds_factor (int, optional)`       | Downsampling factor                                                                                                                          | `1` |
-| `backbone (str, optional)`        | ResNet architecture to use as backbone. Currently only `'resnet18'` and `'resnet34'` are supported                                               | `'resnet18'` |
-| `n_stages (int, optional)`        | Number of ResNet stages used in the backbone, should be a value in `{3, 4, 5}`                                                                 | `4` |
-| `use_tokenizer (bool, optional)`  | Whether to use tokenizer                                                                                                                     | `True` |
-| `token_len (int, optional)`       | Length of input token                                                                                                                        | `4` |
-| `pool_mode (str, optional)`       | Gets the pooling strategy for input tokens when `'use_tokenizer'` is set to False. `'max'` means global max pooling, `'avg'` means global average pooling | `'max'` |
-| `pool_size (int, optional)`       | When `'use_tokenizer'` is set to False, the height and width of the pooled feature map                                                        | `2` |
-| `enc_with_pos (bool, optional)`   | Whether to add learned positional embeddings to the encoder's input feature sequence                                                         | `True` |
-| `enc_depth (int, optional)`       | Number of attention blocks used in encoder                                                                                                   | `1` |
-| `enc_head_dim (int, optional)`    | Embedding dimension of each encoder head                                                                                                     | `64` |
-| `dec_depth (int, optional)`       | Number of attention blocks used in decoder                                                                                                   | `8` |
-| `dec_head_dim (int, optional)`    | Embedding dimension for each decoder head                                                                                                    | `8` |
+| `use_mixed_loss (bool)` | Whether to use mixed loss function                                                                                                 | `False` |
+| `losses (list)`         | List of loss functions                                                                                                                       | `None` |
+| `att_type (str)`        | Spatial attention type. Possible values are `'CBAM'` and `'BAM'`                                                                                 | `'CBAM'` |
+| `ds_factor (int)`       | Downsampling factor                                                                                                                          | `1` |
+| `backbone (str)`        | ResNet architecture to use as backbone. Currently only `'resnet18'` and `'resnet34'` are supported                                               | `'resnet18'` |
+| `n_stages (int)`        | Number of ResNet stages used in the backbone, should be a value in `{3, 4, 5}`                                                                 | `4` |
+| `use_tokenizer (bool)`  | Whether to use a learnable tokenizer                                                                                                                     | `True` |
+| `token_len (int)`       | Length of input token                                                                                                                        | `4` |
+| `pool_mode (str)`       | Pooling strategy used to obtain input tokens when `'use_tokenizer'` is set to `False`. `'max'` means global max pooling, `'avg'` means global average pooling | `'max'` |
+| `pool_size (int)`       | Height and width of the pooled feature map when `'use_tokenizer'` is set to `False`                                                        | `2` |
+| `enc_with_pos (bool)`   | Whether to add learned positional embeddings to the encoder's input feature sequence                                                         | `True` |
+| `enc_depth (int)`       | Number of attention blocks used in encoder                                                                                                   | `1` |
+| `enc_head_dim (int)`    | Embedding dimension of each encoder head                                                                                                     | `64` |
+| `dec_depth (int)`       | Number of attention blocks used in decoder                                                                                                   | `8` |
+| `dec_head_dim (int)`    | Embedding dimension for each decoder head                                                                                                    | `8` |
+
 
 ## `CDNet`
 
@@ -45,6 +46,7 @@ The original article refers to Pablo F. Alcantarilla, et al., "Street-View Chang
 | `losses (list)`         | List of loss functions                                                                             | `None` |
 | `in_channels (int)`     | Number of channels of the input image                                                              | `6` |
 
+
 ## `ChangeFormer`
 
 The ChangeFormer implementation based on PaddlePaddle.
@@ -60,7 +62,8 @@ The original article refers to Wele Gedara Chaminda Bandara, Vishal M. Patel, 
 | `decoder_softmax (bool)` | Whether to use softmax as the last layer activation function of the decoder | `False` |
 | `embed_dim (int)` | Hidden layer dimension of the Transformer encoder                           | `256` |
 
-## `ChangeStar_FarSeg`
+
+## `ChangeStar`
 
 The ChangeStar implementation with a FarSeg encoder based on PaddlePaddle.
 
@@ -76,6 +79,7 @@ The original article refers to Z. Zheng, et al., "Change is Everywhere: Single-T
 | `num_convs (int)`       | Number of convolutional layers in UNet encoder and decoder          | `4` |
 | `scale_factor (float)`  | Upsampling factor to scale the size of the output segmentation mask | `4.0` |
 
+
 ## `DSAMNet`
 
 The DSAMNet implementation based on PaddlePaddle.
@@ -91,6 +95,7 @@ The original article refers to Q. Shi, et al., "A Deeply Supervised Attention Me
 | `ca_ratio (int)` | Channel compression ratio in channel attention module                                                                           | `8` |
 | `sa_kernel (int)` | Kernel size in the spatial attention module                                                                                     | `7` |
 
+
 ## `DSIFN`
 
 The DSIFN implementation based on PaddlePaddle.
@@ -104,7 +109,8 @@ The original article refers to C. Zhang, et al., "A deeply supervised image fusi
 | `losses (list)` | List of loss functions                                                                             | `None` |
 | `use_dropout (bool)` | Whether to use dropout                                                                             | `False`|
 
-## `FC-EF`
+
+## `FCEarlyFusion`
 
 The FC-EF implementation based on PaddlePaddle.
 
@@ -118,7 +124,8 @@ The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional s
 | `in_channels (int)`     | Number of channels of the input image | `6` |
 | `use_dropout (bool)`    | Whether to use dropout                | `False`|
 
-## `FC-Siam-conc`
+
+## `FCSiamConc`
 
 The FC-Siam-conc implementation based on PaddlePaddle.
 
@@ -132,7 +139,8 @@ The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional s
 | `in_channels (int)`     | Number of channels of the input image                                                                                           | `3` |
 | `use_dropout (bool)`    | Whether to use dropout                                                                                                          | `False`|
 
-## `FC-Siam-diff`
+
+## `FCSiamDiff`
 
 The FC-Siam-diff implementation based on PaddlePaddle.
 
@@ -146,6 +154,7 @@ The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional s
 | `in_channels (int)`     | Number of channels of the input image                                                          | `3` |
 | `use_dropout (bool)`    | Whether to use dropout                                         | `False` |
 
+
 ## `FCCDN`
 
 The FCCDN implementation based on PaddlePaddle.
@@ -159,7 +168,8 @@ The original article refers to Pan Chen, et al., "FCCDN: Feature Constraint Netw
 | `use_mixed_loss (bool)` | Whether to use mixed loss             | `False`|
 | `losses (list)` | List of loss functions                | `None` |
 
-## `P2V-CD`
+
+## `P2V`
 
 The P2V-CD implementation based on PaddlePaddle.
 
@@ -173,6 +183,7 @@ The original article refers to M. Lin, et al. "Transition Is a Process: Pair-to-
 | `in_channels (int)`     | Number of channels of the input image | `3` |
 | `video_len (int)`       | Number of input video frames          | `8` |
 
+
 ## `SNUNet`
 
 The SNUNet implementation based on PaddlePaddle.
@@ -183,7 +194,8 @@ The original article refers to S. Fang, et al., "SNUNet-CD: A Densely Connected
 |------------------------|-------------------------------------------------|------|
 | `in_channels (int)`    | Number of channels of the input image           | `3`  |
 | `num_classes (int)`      | Number of target classes                        | `2`  |
-| `width (int, optional)` | Output channels of the first convolutional layer | 32   |
+| `width (int)` | Output channels of the first convolutional layer | `32` |
+
 
 ## `STANet`
 
@@ -199,6 +211,7 @@ The original article refers to  H. Chen and Z. Shi, "A Spatial-Temporal Attentio
 | `in_channels (int)`     | Number of channels of the input image    | `3` |
 | `width (int)`           | Number of channels in the neural network | `32` |
 
+
 ## `CondenseNetV2`
 
 The CondenseNetV2 implementation based on PaddlePaddle.
@@ -209,9 +222,10 @@ The CondenseNetV2 implementation based on PaddlePaddle.
 | `use_mixed_loss (bool)` | Whether to use mixed loss function                      | `False` |
 | `losses (list)`         | List of loss functions                                  | `None` |
 | `in_channels (int)`     | Number of channels of the input image                   | `3` |
-| `arch (str)`            | Architecture of the model, can be `'A'`, `'B'` or `'C'` | `'A'` |
+| `arch (str)`            | Architecture of the model, which can be `'A'`, `'B'`, or `'C'` | `'A'` |
 
-##  `HRNet`
+
+## `HRNet`
 
 The HRNet implementation based on PaddlePaddle.
 
@@ -222,7 +236,7 @@ The HRNet implementation based on PaddlePaddle.
 | `losses (list)`         | List of loss functions             | `None` |
 
 
-##  `MobileNetV3`
+## `MobileNetV3`
 
 The MobileNetV3 implementation based on PaddlePaddle.
 
@@ -233,7 +247,7 @@ The MobileNetV3 implementation based on PaddlePaddle.
 | `losses (list)`         | List of loss functions | `None` |
 
 
-##  `ResNet50-vd`
+## `ResNet50_vd`
 
 The ResNet50-vd implementation based on PaddlePaddle.
 
@@ -243,6 +257,7 @@ The ResNet50-vd implementation based on PaddlePaddle.
 | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
 | `losses (list)`         | List of loss functions | `None` |
 
+
 ## `DRN`
 
 The DRN implementation based on PaddlePaddle.
@@ -250,16 +265,16 @@ The DRN implementation based on PaddlePaddle.
 | Parameter Name                                                    | Description                                                                                                                                                                                                         | Default Value |
 |-------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
 | `losses (list)`                                                   | List of loss functions                                                                                                                                                                                              | `None` |
-| `sr_factor (int)`                                                 | Scaling factor for super-resolution, the size of the original image will be multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W`. | `4` |
-| `min_max (None \| tuple[float, float])`                                                                                                                                                                                               | Minimum and maximum image pixel values                                                                                                                                                                              | `None` |
+| `sr_factor (int)`                                                 | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` |
+| `min_max (None \| tuple[float, float])`                                                                                                                                                                                               | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used                                                                                                                                                                              | `None` |
 | `scales (tuple[int])`                                        | Scaling factors for multiple scales                                                                                                                                                                                                      | `(2, 4)` |
 | `n_blocks (int)`                                                  | Number of residual blocks                                                                                                                                                                                           | `30` |
 | `n_feats (int)`                                                   | Number of features in the residual block                                                                                                                                                                            | `16` |
 | `n_colors (int)`                                                  | Number of image channels                                                                                                                                                                                            | `3` |
 | `rgb_range (float)`                                               | Range of image pixel values                                                                                                                                                                                         | `1.0` |
 | `negval (float)`                                                  | Negative value in nonlinear mapping                                                                                                                                                                                 | `0.2` |
-| `Supplementary Description of `lq_loss_weight` parameter (float)` | Weight of the low-quality image loss, which is used to control the impact of the reconstruction loss on the overall loss of restoring the low-resolution input image into a high-resolution output image.           | `0.1` |
-| `dual_loss_weight (float)`                                        | Weight of the bilateral loss                                                                                                                                                                                        | `0.1` |
+| `lq_loss_weight (float)`                                          | Weight of the primal regression loss           | `0.1` |
+| `dual_loss_weight (float)`                                        | Weight of the dual regression loss                                                                                                                                                                                        | `0.1` |
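
A minimal construction sketch (the `pdrs.tasks.res` trainer alias is an assumption based on the PaddleRS tutorials; unspecified parameters keep the defaults listed above):

```python
import paddlers as pdrs

# DRN trainer sketch; parameter names follow the table above.
model = pdrs.tasks.res.DRN(
    sr_factor=4,
    scales=(2, 4),         # intermediate scales used by the dual regression scheme
    lq_loss_weight=0.1,    # primal regression loss weight
    dual_loss_weight=0.1,  # dual regression loss weight
)
```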
 
 
 ## `ESRGAN`
@@ -269,14 +284,15 @@ The ESRGAN implementation based on PaddlePaddle.
 | Parameter Name       | Description                                                                                                                                                                                                        | Default Value |
 |----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- |
 | `losses (list)`      | List of loss functions                                                                                                                                                                                             | `None` |
-| `sr_factor (int)`    | Scaling factor for super-resolution, the size of the original image will be multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` |
+| `sr_factor (int)`    | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` |
 | `min_max (tuple)`    | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used                                                                                 | `None` |
-| `use_gan (bool)`     | Boolean indicating whether to use GAN (Generative Adversarial Network) during training. If yes, GAN will be used                                                                                                   | `True` |
+| `use_gan (bool)`     | Whether to use GAN (Generative Adversarial Network) during training                                                                                                   | `True` |
 | `in_channels (int)`  | Number of channels of the input image                                                                                                                                                                              | `3` |
 | `out_channels (int)` | Number of channels of the output image                                                                                                                                                                             | `3` |
 | `nf (int)`           | Number of filters in the first convolutional layer of the model                                                                                                                                                    | `64` |
 | `nb (int)`           | Number of residual blocks in the model                                                                                                                                                                             | `23` |
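
A hedged construction example (the `pdrs.tasks.res` alias is an assumption based on the PaddleRS tutorials):

```python
import paddlers as pdrs

# ESRGAN trainer sketch. With use_gan=True, the adversarial branch is only
# involved during training; parameter names follow the table above.
model = pdrs.tasks.res.ESRGAN(
    sr_factor=4,
    use_gan=True,
    nf=64,  # filters in the first convolutional layer
    nb=23,  # number of residual blocks
)
```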
 
+
 ## `LESRCNN`
 
 The LESRCNN implementation based on PaddlePaddle.
@@ -284,25 +300,26 @@ The LESRCNN implementation based on PaddlePaddle.
 | Parameter Name       | Description                                                                                                                                                                                                     | Default Value |
 |----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- |
 | `losses (list)`      | List of loss functions                                                                                                                                                                                                                | `None` |
-| `sr_factor (int)`    | Scaling factor for super-resolution, the size of the original image will be multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W`. | `4` |
-| `min_max (tuple)`    | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used.                                                                             | `None` |
-| `multi_scale (bool)` | Boolean indicating whether to train on multiple scales. If yes, multiple scales are used during training.                                                                                                       | `False` |
-| `group (int)`        | Controls the number of groups for convolution operations. Standard convolution if set to `1`, DWConv if set to the number of input channels.                                                                    | `1` |
+| `sr_factor (int)`    | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` |
+| `min_max (tuple)`    | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used                                                                             | `None` |
+| `multi_scale (bool)` | Whether to train on multiple scales                                                                                                       | `False` |
+| `group (int)`        | Number of groups used in convolution operations                                                                    | `1` |
+
 
-##  `Faster R-CNN`
+## `FasterRCNN`
 
 The Faster R-CNN implementation based on PaddlePaddle.
 
 | Parameter Name                | Description                                                                                                | Default Value |
 |-------------------------------|------------------------------------------------------------------------------------------------------------| --- |
 | `num_classes (int)`           | Number of target classes                                                                                   | `80` |
-| `backbone (str)`              | Backbone network model to use                                                                              | `'ResNet50'` |
-| `with_fpn (bool)`             | Boolean indicating whether to use Feature Pyramid Network (FPN)                                            | `True` |
-| `with_dcn (bool)`             | Boolean indicating whether to use Deformable Convolutional Networks (DCN)                                  | `False` |
+| `backbone (str)`              | Backbone network to use                                                                              | `'ResNet50'` |
+| `with_fpn (bool)`             | Whether to use Feature Pyramid Network (FPN)                                            | `True` |
+| `with_dcn (bool)`             | Whether to use Deformable Convolutional Networks (DCN)                                  | `False` |
 | `aspect_ratios (list)`        | List of aspect ratios of candidate boxes                                                                   | `[0.5, 1.0, 2.0]` |
 | `anchor_sizes (list)`         | List of sizes of candidate boxes expressed as base sizes on each feature map                               | `[[32], [64], [128], [256], [512]]` |
-| `keep_top_k (int)`            | Number of predicted boxes to keep before NMS operation                                                     | `100` |
-| `nms_threshold (float)`       | Non-maximum suppression (NMS) threshold to use                                                             | `0.5` |
+| `keep_top_k (int)`            | Number of predicted boxes to keep before the non-maximum suppression (NMS) operation                                                     | `100` |
+| `nms_threshold (float)`       | NMS threshold to use                                                             | `0.5` |
 | `score_threshold (float)`     | Score threshold for filtering predicted boxes                                                              | `0.05` |
 | `fpn_num_channels (int)`      | Number of channels for each pyramid layer in the FPN network                                               | `256` |
 | `rpn_batch_size_per_im (int)` | Number of samples per image used for training the RPN network                                              | `256` |
@@ -310,76 +327,80 @@ The Faster R-CNN implementation based on PaddlePaddle.
 | `test_pre_nms_top_n (int)`    | Number of predicted boxes to keep before NMS operation when testing. If not specified, `keep_top_k` is used. | `None` |
 | `test_post_nms_top_n (int)`   | Number of predicted boxes to keep after NMS operation at test time                                         | `1000` |
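
A hedged construction example (the `pdrs.tasks.det` alias is an assumption based on the PaddleRS tutorials; only commonly tuned parameters are shown):

```python
import paddlers as pdrs

# Faster R-CNN sketch; unspecified parameters keep the defaults above.
model = pdrs.tasks.det.FasterRCNN(
    num_classes=20,      # replace with the number of classes in your dataset
    backbone='ResNet50',
    with_fpn=True,
    nms_threshold=0.5,
    score_threshold=0.05,
)
```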
 
-## `PP-YOLO`
+
+## `PPYOLO`
 
 The PP-YOLO implementation based on PaddlePaddle.
 
 | Parameter Name                   | Description                                                        | Default Value |
 |----------------------------------|--------------------------------------------------------------------| --- |
 | `num_classes (int)`              | Number of target classes                                           | `80` |
-| `backbone (str)`                 | PPYOLO's backbone network                                          | `'ResNet50_vd_dcn'` |
-| `anchors (list[list[float]])`    | Size of predefined anchor boxes                                    | `None` |
+| `backbone (str)`                 | Backbone network to use                                            | `'ResNet50_vd_dcn'` |
+| `anchors (list[list[float]])`    | Sizes of predefined anchor boxes                                    | `None` |
 | `anchor_masks (list[list[int]])` | Masks for predefined anchor boxes                                  | `None` |
 | `use_coord_conv (bool)`          | Whether to use coordinate convolution                              | `True` |
 | `use_iou_aware (bool)`           | Whether to use IoU awareness                                       | `True` |
 | `use_spp (bool)`                 | Whether to use spatial pyramid pooling (SPP)                       | `True` |
-| `use_drop_block (bool)`          | Whether to use DropBlock regularization                            | `True` |
+| `use_drop_block (bool)`          | Whether to use DropBlock                            | `True` |
 | `scale_x_y (float)`              | Parameter to scale each predicted box                              | `1.05` |
 | `ignore_threshold (float)`       | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` |
 | `label_smooth (bool)`            | Whether to use label smoothing                                     | `False` |
-| `use_iou_loss (bool)`            | Whether to use IoU Loss                                            | `True` |
+| `use_iou_loss (bool)`            | Whether to use IoU loss                                            | `True` |
 | `use_matrix_nms (bool)`          | Whether to use Matrix NMS                                          | `True` |
 | `nms_score_threshold (float)`    | NMS score threshold                                                | `0.01` |
 | `nms_topk (int)`                 | Maximum number of detections to keep before performing NMS         | `-1` |
 | `nms_keep_topk (int)`            | Maximum number of prediction boxes to keep after NMS               | `100`|
 | `nms_iou_threshold (float)`      | NMS IoU threshold                                                  | `0.45` |
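
A construction sketch under the same assumptions as above (`pdrs.tasks.det` alias from the PaddleRS tutorials); `PPYOLOTiny` and `PPYOLOv2` below follow the same pattern, differing mainly in backbone, anchors, and NMS defaults:

```python
import paddlers as pdrs

# PP-YOLO trainer sketch; parameter names follow the table above.
model = pdrs.tasks.det.PPYOLO(
    num_classes=20,
    backbone='ResNet50_vd_dcn',
    use_matrix_nms=True,       # Matrix NMS instead of classical NMS
    nms_score_threshold=0.01,
    nms_iou_threshold=0.45,
)
```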
 
-##  `PP-YOLO Tiny`
+
+## `PPYOLOTiny`
 
 The PP-YOLO Tiny implementation based on PaddlePaddle.
 
 | Parameter Name                   | Description                                                 | Default Value |
 |----------------------------------|-------------------------------------------------------------| --- |
 | `num_classes (int)`              | Number of target classes                                    | `80` |
-| `backbone (str)`                 | Backbone network model name to use                          | `'MobileNetV3'` |
-| `anchors (list[list[float]])`    | List of anchor box sizes                                    | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96] , [60, 170], [220, 125], [128, 222], [264, 266]]` |
-| `anchor_masks (list[list[int]])` | Anchor box mask                                             | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
-| `use_iou_aware (bool)`           | Boolean value indicating whether to use IoU-aware loss      | `False` |
-| `use_spp (bool)`                 | Boolean indicating whether to use the SPP module            | `True` |
-| `use_drop_block (bool)`          | Boolean value indicating whether to use the DropBlock block | `True` |
-| `scale_x_y (float)`              | Scaling parameter                                           | `1.05` |
-| `ignore_threshold (float)`       | Ignore threshold                                            | `0.5` |
-| `label_smooth (bool)`            | Boolean indicating whether to use label smoothing           | `False` |
-| `use_iou_loss (bool)`            | Boolean value indicating whether to use IoU Loss            | `True` |
-| `use_matrix_nms (bool)`          | Boolean indicating whether to use Matrix NMS                | `False` |
+| `backbone (str)`                 | Backbone network to use                                     | `'MobileNetV3'` |
+| `anchors (list[list[float]])`    | Sizes of predefined anchor boxes                                   | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]]` |
+| `anchor_masks (list[list[int]])` | Masks for predefined anchor boxes                                             | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
+| `use_iou_aware (bool)`           | Whether to use IoU awareness      | `False` |
+| `use_spp (bool)`                 | Whether to use spatial pyramid pooling (SPP)            | `True` |
+| `use_drop_block (bool)`          | Whether to use DropBlock | `True` |
+| `scale_x_y (float)`              | Parameter to scale each predicted box                                           | `1.05` |
+| `ignore_threshold (float)`       | IoU threshold used to assign predicted boxes to ground truth boxes                                            | `0.5` |
+| `label_smooth (bool)`            | Whether to use label smoothing           | `False` |
+| `use_iou_loss (bool)`            | Whether to use IoU loss            | `True` |
+| `use_matrix_nms (bool)`          | Whether to use Matrix NMS                | `False` |
 | `nms_score_threshold (float)`    | NMS score threshold                                         | `0.005` |
-| `nms_topk (int)`                 | Number of bounding boxes to keep before NMS operation       | `1000` |
-| `nms_keep_topk (int)`            | Number of bounding boxes to keep after NMS operation        | `100` |
+| `nms_topk (int)`                 | Maximum number of detections to keep before performing NMS       | `1000` |
+| `nms_keep_topk (int)`            | Maximum number of prediction boxes to keep after NMS        | `100` |
 | `nms_iou_threshold (float)`      | NMS IoU threshold                                           | `0.45` |
 
-## `PP-YOLOv2`
+
+## `PPYOLOv2`
 
 The PP-YOLOv2 implementation based on PaddlePaddle.
 
 | Parameter Name                   | Description | Default Value |
 |----------------------------------| --- | --- |
 | `num_classes (int)`              | Number of target classes | `80` |
-| `backbone (str)`                 | PPYOLO's backbone network | `'ResNet50_vd_dcn'` |
+| `backbone (str)`                 | Backbone network to use  | `'ResNet50_vd_dcn'` |
 | `anchors (list[list[float]])`    | Sizes of predefined anchor boxes| `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
 | `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
 | `use_iou_aware (bool)`           | Whether to use IoU awareness | `True` |
 | `use_spp (bool)`                 | Whether to use spatial pyramid pooling (SPP) | `True` |
-| `use_drop_block (bool)`          | Whether to use DropBlock regularization | `True` |
+| `use_drop_block (bool)`          | Whether to use DropBlock | `True` |
 | `scale_x_y (float)`              | Parameter to scale each predicted box | `1.05` |
 | `ignore_threshold (float)`       | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` |
 | `label_smooth (bool)`            | Whether to use label smoothing | `False` |
-| `use_iou_loss (bool)`            | Whether to use IoU Loss | `True` |
+| `use_iou_loss (bool)`            | Whether to use IoU loss | `True` |
 | `use_matrix_nms (bool)`          | Whether to use Matrix NMS | `True` |
 | `nms_score_threshold (float)`    | NMS score threshold | `0.01` |
 | `nms_topk (int)`                 | Maximum number of detections to keep before performing NMS | `-1` |
 | `nms_keep_topk (int)`            | Maximum number of prediction boxes to keep after NMS | `100`|
 | `nms_iou_threshold (float)`      | NMS IoU threshold | `0.45` |
 
+
 ## `YOLOv3`
 
 The YOLOv3 implementation based on PaddlePaddle.
@@ -387,17 +408,18 @@ The YOLOv3 implementation based on PaddlePaddle.
 | Parameter Name | Description                                                                                                                 | Default Value |
 | --- |-----------------------------------------------------------------------------------------------------------------------------| --- |
 | `num_classes (int)` | Number of target classes                                                                                                    | `80` |
-| `backbone (str)` | Name of the feature extraction network                                                                                      | `'MobileNetV1'` |
-| `anchors (list[list[int]])` | Sizes of all anchor boxes                                                                                                   | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45 ], [59, 119], [116, 90], [156, 198], [373, 326]]` |
-| `anchor_masks (list[list[int]])` | Which anchor boxes to use to predict the target box                                                                         | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
-| `ignore_threshold (float)` | IoU threshold of the predicted box and the ground truth box, below which the threshold will be considered as the background | `0.7` |
-| `nms_score_threshold (float)` | In non-maximum suppression, score threshold below which boxes will be discarded                                             | `0.01` |
-| `nms_topk (int)` | In non-maximum value suppression, the maximum number of scoring boxes to keep, if it is -1, all boxes are kept              | `1000` |
-| `nms_keep_topk (int)` | In non-maximum value suppression, the maximum number of boxes to keep per image                                             | `100` |
-| `nms_iou_threshold (float)` | In non-maximum value suppression, IoU threshold, boxes larger than this threshold will be discarded                         | `0.45` |
-| `label_smooth (bool)` | Whether to use label smoothing when computing loss                                                                          | `False` |
-
-##  `BiSeNet V2`
+| `backbone (str)` | Backbone network to use                                                                                      | `'MobileNetV1'` |
+| `anchors (list[list[int]])` | Sizes of predefined anchor boxes                                                                                                   | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
+| `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes                                                                         | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
+| `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` |
+| `nms_score_threshold (float)` | NMS score threshold                                             | `0.01` |
+| `nms_topk (int)` | Maximum number of detections to keep before performing NMS             | `1000` |
+| `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS                                            | `100` |
+| `nms_iou_threshold (float)` | NMS IoU threshold                         | `0.45` |
+| `label_smooth (bool)` | Whether to use label smoothing when computing losses                                                                          | `False` |
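
A hedged example using the default anchor configuration from the table above (the `pdrs.tasks.det` alias is assumed from the PaddleRS tutorials):

```python
import paddlers as pdrs

# YOLOv3 sketch; three anchors per prediction head, selected via anchor_masks.
model = pdrs.tasks.det.YOLOv3(
    num_classes=20,
    backbone='MobileNetV1',
    anchors=[[10, 13], [16, 30], [33, 23],
             [30, 61], [62, 45], [59, 119],
             [116, 90], [156, 198], [373, 326]],
    anchor_masks=[[6, 7, 8], [3, 4, 5], [0, 1, 2]],
)
```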
+
+
+## `BiSeNetV2`
 
 The BiSeNet V2 implementation based on PaddlePaddle.
 
@@ -409,7 +431,8 @@ The BiSeNet V2 implementation based on PaddlePaddle.
 | `losses (list)`         | List of loss functions | `{}`          |
 | `align_corners (bool)`  | Whether to use the corner alignment method  | `False`       |
 
-##  `DeepLab V3+`
+
+## `DeepLabV3P`
 
 The DeepLab V3+ implementation based on PaddlePaddle.
 
@@ -421,7 +444,7 @@ The DeepLab V3+ implementation based on PaddlePaddle.
 | `use_mixed_loss (bool)`    | Whether to use mixed loss function                                             | `False` |
 | `losses (list)`            | List of loss functions                                                         | `None` |
 | `output_stride (int)`      | Downsampling ratio of the output feature map relative to the input feature map | `8` |
-| `backbone_indices (tuple)` | Output the location indices of different stages of the backbone network        | `(0, 3)` |
+| `backbone_indices (tuple)` | Indices of the backbone network stages whose outputs are used        | `(0, 3)` |
 | `aspp_ratios (tuple)`      | Dilation ratio of dilated convolution                                          | `(1, 12, 24, 36)` |
 | `aspp_out_channels (int)`  | Number of ASPP module output channels                                          | `256` |
 | `align_corners (bool)`     | Whether to use the corner alignment method                                     | `False` |
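
A construction sketch (the `pdrs.tasks.seg` alias is an assumption based on the PaddleRS tutorials; `num_classes` is task-specific, the rest keep the defaults above):

```python
import paddlers as pdrs

# DeepLab V3+ sketch; aspp_ratios are the dilation rates of the ASPP branches.
model = pdrs.tasks.seg.DeepLabV3P(
    num_classes=2,
    output_stride=8,
    aspp_ratios=(1, 12, 24, 36),
)
```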
@@ -440,6 +463,7 @@ The original article refers to  A. Ma, J. Wang, Y. Zhong and Z. Zheng, "FactSeg:
 | `use_mixed_loss (bool)` | Whether to use mixed loss function                                                                                      | `False` |
 | `losses (list)`         | List of loss functions                                                                                | `None` |
 
+
 ## `FarSeg`
 
 The FarSeg implementation based on PaddlePaddle.
@@ -453,7 +477,8 @@ The original article refers to  Zheng Z, Zhong Y, Wang J, et al. Foreground-awar
 | `use_mixed_loss (bool)` | Whether to use mixed loss function                                                                                      | `False` |
 | `losses (list)`         | List of loss functions                                                                               | `None` |
 
-##  `Fast-SCNN`
+
+## `FastSCNN`
 
 The Fast-SCNN implementation based on PaddlePaddle.
 
@@ -466,7 +491,7 @@ The Fast-SCNN implementation based on PaddlePaddle.
 | `align_corners (bool)`  | Whether to use the corner alignment method     | `False`              |
 
 
-##  `HRNet`
+## `HRNet`
 
 The HRNet implementation based on PaddlePaddle.
 
@@ -474,7 +499,21 @@ The HRNet implementation based on PaddlePaddle.
 |-------------------------|------------------------------------------------------------------------------------------------------------------| --- |
 | `in_channels (int)`     | Number of channels of the input image                                                                                 | `3` |
 | `num_classes (int)`     | Number of target classes                                                                  | `2` |
-| `width (int)`           | Initial number of channels for the network                                                                       | `48` |
+| `width (int)`           | Initial number of feature channels for the network                                                                       | `48` |
+| `use_mixed_loss (bool)` | Whether to use mixed loss function                                                                               | `False` |
+| `losses (list)`         | List of loss functions                                                                                     | `None` |
+| `align_corners (bool)`  | Whether to use the corner alignment method                                                                       | `False` |
+
+
+## `UNet`
+
+The UNet implementation based on PaddlePaddle.
+
+| Parameter Name          | Description                                                                                                      | Default Value |
+|-------------------------|------------------------------------------------------------------------------------------------------------------| --- |
+| `in_channels (int)`     | Number of channels of the input image                                                                                 | `3` |
+| `num_classes (int)`     | Number of target classes                                                                  | `2` |
+| `use_deconv (bool)`     | Whether to use deconvolution for upsampling                                                                       | `False` |
 | `use_mixed_loss (bool)` | Whether to use mixed loss function                                                                               | `False` |
 | `losses (list)`         | List of loss functions                                                                                     | `None` |
 | `align_corners (bool)`  | Whether to use the corner alignment method                                                                       | `False` |
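
A minimal sketch, under the same `pdrs.tasks.seg` assumption as above (whether `use_deconv=False` falls back to interpolation-based upsampling is an assumption based on the parameter description):

```python
import paddlers as pdrs

# UNet sketch; parameter names follow the table above.
model = pdrs.tasks.seg.UNet(
    in_channels=3,
    num_classes=2,
    use_deconv=False,
)
```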

+ 1 - 1
docs/intro/transforms_cn.md

@@ -34,7 +34,7 @@ PaddleRS对不同遥感任务需要的数据预处理/数据增强(合称为
 
 ## 组合算子
 
-在实际的模型训练过程中,常常需要组合多种数据预处理与数据增强策略。PaddleRS提供了`paddlers.transforms.Compose`以便捷地组合多个数据变换算子,使这些算子能够串行执行。关于`paddlers.transforms.Compose`的具体用法请参见[API说明](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/apis/data_cn.md)。
+在实际的模型训练过程中,常常需要组合多种数据预处理与数据增强策略。PaddleRS提供了`paddlers.transforms.Compose`以便捷地组合多个数据变换算子,使这些算子能够串行执行。关于`paddlers.transforms.Compose`的具体用法请参见[API说明](../apis/data_cn.md)。
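
以下给出一个组合算子的简要示例(仅为示意,具体算子及参数请以各算子的文档为准):

```python
import paddlers.transforms as T

# Chain several operators; they are applied to each sample in order.
train_transforms = T.Compose([
    T.RandomHorizontalFlip(prob=0.5),
    T.Resize(target_size=512),
    T.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```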
 
 ## 构造算子
 

+ 124 - 102
docs/intro/transforms_cons_params_cn.md

@@ -2,240 +2,262 @@
 
 # PaddleRS数据变换算子构造参数
 
-本文档详细介绍PaddleRS各数据变换算子的构造参数,包括算子名称、算子用途、各个算子的参数名称、参数类型、参数意义以及参数默认值。
+本文档介绍PaddleRS各数据变换算子的构造参数,包括算子名称、算子用途、各个算子的参数名称、参数类型、参数意义以及参数默认值。
 
-PaddleRS所支持的数据变换算子可见[此文档](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/intro/transforms_cn.md)。
+PaddleRS所支持的数据变换算子可见[此文档](../intro/transforms_cn.md)。
 
 ## `AppendIndex`
 
 计算遥感指数并添加到输入影像中。
 
-| 参数名             | 描述                                                                                                                      | 默认值  |
+| 参数名称 (参数类型)             | 描述                                                                                                                      | 默认值  |
 |-----------------|-------------------------------------------------------------------------------------------------------------------------|------|
-|`index_type (str)`| 遥感指数类型。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py)查看PaddleRS支持的全部遥感指数类型。                |      |
-|`band_indexes (dict,可选)`| 波段名称到波段索引的映射(从1开始)。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py)查看PaddleRS支持的全部波段名称。              | `None` |
-|`satellite (str,可选)`| 卫星类型。设置后,将自动确定相应的带指数。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/satellites.py)查看PaddleRS支持的全部卫星类型。 | `None` |
+|`index_type (str)`| 遥感指数类型。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py)查看PaddleRS支持的全部遥感指数类型                |      |
+|`band_indexes (dict)`| 波段名称到波段索引的映射(从1开始)。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py)查看PaddleRS支持的全部波段名称              | `None` |
+|`satellite (str)`| 卫星类型。设置后,将自动确定波段名称与索引间的映射关系。请在[此链接](https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/satellites.py)查看PaddleRS支持的全部卫星类型 | `None` |
+
 
 ## `CenterCrop`
 
-+ 对输入影像进行中心裁剪。
-    - 1. 定位图像的中心。
-    - 2. 裁剪图像。
+对输入影像进行中心裁剪。
+1. 定位图像的中心;
+2. 裁剪图像。
 
-| 参数名             | 描述                                                                                                       | 默认值  |
+| 参数名称 (参数类型)             | 描述                                                                                                       | 默认值  |
 |-----------------|----------------------------------------------------------------------------------------------------------|------|
-|`crop_size (int,可选)`| 裁剪图像的目标大小  | `224`  |
+|`crop_size (int)`| 裁剪图像的目标大小  | `224`  |
+
 
 ## `Dehaze`
 
 对输入图像进行去雾。
 
-| 参数名             | 描述            | 默认值   |
+| 参数名称 (参数类型)             | 描述            | 默认值   |
 |-----------------|---------------|-------|
-|`gamma (bool,可选)`| 是否使用 gamma 校正 | `False` |
+|`gamma (bool)`| 是否使用 gamma 校正 | `False` |
+
 
 ## `MatchRadiance`
 
-对两个时相的输入影像进行相对辐射校正
+对两个时相的输入影像进行相对辐射校正
 
-| 参数名             | 描述                                     | 默认值   |
+| 参数名称 (参数类型)             | 描述                                     | 默认值   |
 |-----------------|-----------------------------------------------------|-------|
-|`method (str,可选)`| 用于匹配双时间图像亮度的方法。选项有{`'hist'`, `'lsr'`, `'fft`}。`'hist'`代表直方图匹配,`'lsr'`代表最小二乘回归,`'fft'`替换图像的低频分量以匹配参考图像  | `'hist'` |
+|`method (str)`| 相对辐射校正方法。可选项有{`'hist'`, `'lsr'`, `'fft'`}。`'hist'`代表直方图匹配,`'lsr'`代表最小二乘回归,`'fft'`表示替换图像的低频分量以匹配参考图像  | `'hist'` |
+
 
 ## `MixupImage`
 
 将两幅影像(及对应的目标检测标注)混合在一起作为新的样本。
 
-| 参数名             | 描述                | 默认值 |
+| 参数名称 (参数类型)             | 描述                | 默认值 |
 |-----------------|-------------------|-----|
-|`alpha (float,可选)`| beta 分布的 alpha 参数 | `1.5` |
-|`beta (float,可选)` | beta 分布的 beta 参数  | `1.5` |
+|`alpha (float)`| beta 分布的 alpha 参数 | `1.5` |
+|`beta (float)` | beta 分布的 beta 参数  | `1.5` |
+
 
 ## `Normalize`
 
-对输入影像应用标准化
+对输入影像应用标准化
 
-+ 对输入图像应用归一化。归一化步骤如下:
-  - 1. Im = (Im - min_value) * 1 / (max_value - min_value)
-  - 2. Im = Im - mean
-  - 3. Im = Im / STD
+步骤如下:
+1. im = (im - min_value) * 1 / (max_value - min_value)
+2. im = im - mean
+3. im = im / std
 
-| 参数名                 | 描述                               | 默认值                          |
+| 参数名称 (参数类型)                 | 描述                               | 默认值                          |
 |---------------------|----------------------------------|------------------------------|
-| `mean (list[float] \| tuple[float],可选)`    | 输入图像的均值                          | `[0.485,0.456,0.406]` |
-| `std (list[float] \| tuple[float],可选)`     | 输入图像的标准差                         | `[0.229,0.224,0.225]` |
-| `min_val (list[float] \| tuple[float],可选)` | 输入图像的最小值。如果为`None`,则对所有通道使用`0`   |    `None`      |
-| `max_val (list[float] \| tuple[float],可选)` | 输入图像的最大值。如果为`None`,则所有通道均使用`255` |  `None`        |
-| `apply_to_tar (bool,可选)` | 是否对目标图像应用数据变换算子                  | `True`                         |
+| `mean (list[float] \| tuple[float])`    | 输入图像的均值                          | `[0.485,0.456,0.406]` |
+| `std (list[float] \| tuple[float])`     | 输入图像的标准差                         | `[0.229,0.224,0.225]` |
+| `min_val (list[float] \| tuple[float])` | 输入图像的最小值。如果为`None`,则对所有通道均使用`0`   |    `None`      |
+| `max_val (list[float] \| tuple[float])` | 输入图像的最大值。如果为`None`,则对所有通道均使用`255` |  `None`        |
+| `apply_to_tar (bool)` | 是否对图像复原任务的目标图像应用数据变换算子                  | `True`                         |
+
 
 ## `Pad`
 
 将输入影像填充到指定的大小
 
-| 参数名                      | 描述                                                         | 默认值                |
+| 参数名称 (参数类型)                      | 描述                                                         | 默认值                |
 |--------------------------| ------------------------------------------------------------ | --------------------- |
-| `target_size (list[int] \| tuple[int],可选)`    | 图像目标大小                                                 | `None`                |
-| `pad_mode (int,可选)` | 填充模式。目前只支持四种模式:[-1,0,1,2]。如果是`-1`,使用指定的偏移量。若为`0`,只向右和底部填充;若为`1`,按中心填充。如果`2`,只填充左侧和顶部 | `0`                   |
-| `offset (list[int] \| None,可选)`                | 填充偏移量                                                   | `None`                |
+| `target_size (list[int] \| tuple[int])`    | 填充后的图像尺寸                                                 | `None`                |
+| `pad_mode (int)` | 填充模式。目前只支持四种模式:[-1,0,1,2]。如果是`-1`,使用指定的偏移量;若为`0`,只向右和底部填充;若为`1`,按中心填充;若为`2`,只填充左侧和顶部 | `0`                   |
+| `offset (list[int] \| None)`                | 填充偏移量                                                   | `None`                |
 | `im_padding_value (list[float] \| tuple[float])` | 填充区域的 RGB 值                                            | `(127.5,127.5,127.5)` |
-| `label_padding_value (int,可选)` | 掩码的填充值                                                 | `255`                 |
+| `label_padding_value (int)` | 掩码的填充值                                                 | `255`                 |
 | `size_divisor (int)`     | 填充后的图像宽度和高度将是`'size_divisor'`的倍数             |                       |
 
+
 ## `RandomBlur`
 
 对输入施加随机模糊
 
-| 参数名             | 描述                                     | 默认值  |
+| 参数名称 (参数类型)             | 描述                                     | 默认值  |
 |-----------------|-----------------------------------------------------|------|
-|`prob (float)`| 模糊的概率 |      |
+|`prob (float)`| 施加模糊的概率 |      |
 
-## `RandomCrop`
 
-对输入影像进行随机中心裁剪
+## `RandomCrop`
 
-+ 随机裁剪输入
+对输入影像进行随机中心裁剪
 
-  1. 根据' aspect_ratio '和' scaling '计算裁剪区域的高度和宽度。
-  - 2. 随机定位裁剪区域的左上角。
-  - 3. 裁剪图像。
-  - 4. 调整裁剪区域的大小为' crop_size ' x ' crop_size '
+1. 根据`aspect_ratio`和`scaling`计算裁剪区域的高度和宽度;
+2. 随机选取裁剪区域的左上角;
+3. 裁剪图像;
+4. 调整裁剪区域的大小为`crop_size` x `crop_size`
 
-| 参数名              | 描述         | 默认值                     |
+| 参数名称 (参数类型)              | 描述         | 默认值                     |
 |------------------|------------|-------------------------|
-| `crop_size (int \| list[int] \| tuple[int])` | 裁剪区域的目标大小。如果为`None`,裁剪区域将不会被调整大小 | `None`                    |
-| `aspect_ratio (list[float],可选)` | 以[min, max]格式显示裁剪区域的纵横比 | `[.5, 2.]`                |
-| `thresholds (list[float],可选)` | IoU 阈值,用于决定有效的 bbox 裁剪 | `[.0,.1, .3, .5, .7, .9]` |
-| `scaling (list[float], 可选)` | 裁剪区域与原始图像之间的比例,格式为[min, max] | `[.3, 1.]`                |
-| `num_attempts (int,可选)` | 放弃前的最大尝试次数 | `50`                      |
-| `allow_no_crop (bool,可选)` | 是否允许不进行裁剪而返回 | `True`                    |
-| `cover_all_box (bool,可选)` | 是否强制覆盖整个目标框 | `False`                   |
+| `crop_size (int \| list[int] \| tuple[int])` | 裁剪区域大小。如果为`None`,裁剪区域将不会被调整大小 | `None`                    |
+| `aspect_ratio (list[float])` | 以[min, max]格式设置裁剪区域纵横比的取值范围 | `[.5, 2.]`                |
+| `thresholds (list[float])` | IoU 阈值,用于决定有效的 bbox 裁剪 | `[.0,.1, .3, .5, .7, .9]` |
+| `scaling (list[float])` | 裁剪区域与原始图像之间的尺寸比例,格式为[min, max] | `[.3, 1.]`                |
+| `num_attempts (int)` | 放弃前的最大尝试次数 | `50`                      |
+| `allow_no_crop (bool)` | 是否允许不进行裁剪而返回 | `True`                    |
+| `cover_all_box (bool)` | 是否强制覆盖整个目标框 | `False`                   |
+
 
 ## `RandomDistort`
 
-| 参数名                       | 描述                          | 默认值   |
+随机施加色彩失真。
+
+| 参数名称 (参数类型)                       | 描述                          | 默认值   |
 |---------------------------|-----------------------------|-------|
-| `brightness_range (float,可选)` | 亮度失真范围                      | `.5`    |
-| `brightness_prob (float,可选)` | 亮度失真的概率                     | `.5`    |
-| `contrast_range (float, 可选)` | 对比度失真范围                     | `.5`    |
-| `contrast_prob (float, 可选)` | 对比度失真的概率                    | `.5`    |
-| `saturation_range (float,可选)` | 饱和失真范围                      | `.5`    |
-| `saturation_prob (float,可选)` | 饱和失真的概率                    | `.5`    |
-| `hue_range (float,可选)` | 色调失真范围                      | `.5`    |
-| `hue_prob (float,可选)`| 色相失真的概率                     | `.5`    |
-| `random_apply (bool,可选)` | 以随机( Yolo )或固定( SSD )顺序应用转换 | `True`  |
-| `count (int,可选)`  | 用于控制扭曲次数          | `4`     |
-| `shuffle_channel (bool,可选)` | 是否随机交换通道                    | `False` |
+| `brightness_range (float)` | 亮度失真范围                      | `.5`    |
+| `brightness_prob (float)` | 施加亮度失真的概率                     | `.5`    |
+| `contrast_range (float)` | 对比度失真范围                     | `.5`    |
+| `contrast_prob (float)` | 施加对比度失真的概率                    | `.5`    |
+| `saturation_range (float)` | 饱和度失真范围                      | `.5`    |
+| `saturation_prob (float)` | 施加饱和度失真的概率                    | `.5`    |
+| `hue_range (float)` | 色相失真范围                      | `.5`    |
+| `hue_prob (float)`| 施加色相失真的概率                     | `.5`    |
+| `random_apply (bool)` | 以随机(YOLO)或固定(SSD)顺序应用数据变换算子 | `True`  |
+| `count (int)`  | 用于控制扭曲次数          | `4`     |
+| `shuffle_channel (bool)` | 是否随机交换通道                    | `False` |
+
 
 ## `RandomExpand`
 
 根据随机偏移扩展输入影像。
 
-| 参数名                             | 描述           | 默认值                 |
+| 参数名称 (参数类型)                             | 描述           | 默认值                 |
 |---------------------------------|--------------|---------------------|
-| `upper_ratio (float,可选)`        | 原始图像扩展到的最大比例 | `4`                   |
-| `prob (float,可选)`               | 应用扩展的概率      | `.5`                  |
-| `im_padding_value (list[float] \| tuple[float],可选)` | 图像的 RGB 填充值  | `(127.5,127.5,127.5)` |
-| `label_padding_value (int,可选)`  | 掩码的填充值       | `255`    |
+| `upper_ratio (float)`        | 原始图像扩展到的最大比例 | `4`                   |
+| `prob (float)`               | 施加扩展的概率      | `.5`                  |
+| `im_padding_value (list[float] \| tuple[float])` | 图像的 RGB 填充值  | `(127.5,127.5,127.5)` |
+| `label_padding_value (int)`  | 掩码的填充值       | `255`    |
+
 
 ## `RandomHorizontalFlip`
 
 随机水平翻转输入影像。
 
-| 参数名                                              | 描述        | 默认值                 |
+| 参数名称 (参数类型)                                              | 描述        | 默认值                 |
 |--------------------------------------------------|-----------|---------------------|
-| `prob (float,可选)`                           | 翻转输入的概率   | `.5`                  |
+| `prob (float)`                           | 翻转输入的概率   | `.5`                  |
+
 
 ## `RandomResize`
 
 随机调整输入影像大小。
 
-| 参数名                       | 描述                                                         | 默认值     |
+| 参数名称 (参数类型)                       | 描述                                                         | 默认值     |
 |---------------------------| ------------------------------------------------------------ | ---------- |
-| `Target_sizes (list[int] \| list[list\|tuple] \| tuple[list \| tuple])` | 多个目标大小,每个目标大小应该是`int`、`list`或`tuple`       |            |
-| `interp (str,可选)`         | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
+| `target_sizes (list[int] \| list[list\|tuple] \| tuple[list \| tuple])` | 一组缩放后图像尺寸候选值,每个值可指定为`int`、`list`或`tuple`       |            |
+| `interp (str)`         | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
+
 
 ## `RandomResizeByShort`
 
 随机调整输入影像大小,保持纵横比不变(根据短边计算缩放系数)。
 
-| 参数名                       | 描述        | 默认值   |
+| 参数名称 (参数类型)                       | 描述        | 默认值   |
 |---------------------------|-----------|-------|
-| `short_sizes (list[int])` | 图像较短一侧的目标大小|       |
-| `max_size (int,可选)`       |图像长边的上界。如果`'max_size'`为`-1`,则不应用上限   | `-1`  |
-| `interp (str,可选)`         | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一         | `'LINEAR'` |
+| `short_sizes (list[int])` | 缩放后图像短边长度。指定一组候选值 |       |
+| `max_size (int)`       | 图像长边的上界。如果`'max_size'`为`-1`,则不设置上限   | `-1`  |
+| `interp (str)`         | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一         | `'LINEAR'` |
+
 
 ## `RandomScaleAspect`
 
 裁剪输入影像并重新缩放到原始尺寸。
 
-| 参数名                                                               | 描述        | 默认值    |
+| 参数名称 (参数类型)                                                               | 描述        | 默认值    |
 |-------------------------------------------------------------------|-----------|--------|
 | `min_scale (float)`| 裁剪区域与原始图像之间的最小比例。如果为`0`,图像将不会被裁剪| `0`     |
 | `aspect_ratio (float)`    |裁剪区域的纵横比  | `.33`    |
 
+
 ## `RandomSwap`
 
 随机交换两个时相的输入影像。
 
-| 参数名                                                               | 描述        | 默认值 |
+| 参数名称 (参数类型)                                                               | 描述        | 默认值 |
 |-------------------------------------------------------------------|-----------|-----|
-|`prob (float,可选)`| 交换输入图像的概率 | `0.2` |
+|`prob (float)`| 交换输入图像的概率 | `0.2` |
+
 
 ## `RandomVerticalFlip`
 
 随机竖直翻转输入影像。
 
-| 参数名                                                              | 描述        | 默认值 |
+| 参数名称 (参数类型)                                                              | 描述        | 默认值 |
 |------------------------------------------------------------------|-----------|-----|
-|`prob (float,可选)`| 翻转输入的概率| `.5`  |
+|`prob (float)`| 翻转输入的概率| `.5`  |
+
 
 ## `ReduceDim`
 
 对输入图像进行波段降维。
 
-| 参数名                                                               | 描述             | 默认值  |
+| 参数名称 (参数类型)                                                               | 描述             | 默认值  |
 |-------------------------------------------------------------------|----------------|------|
 |`joblib_path (str)`| *.joblib 文件的路径 |      |
-|`apply_to_tar (bool,可选)` | 是否对目标图像应用数据变换算子 | `True` |
+|`apply_to_tar (bool)` | 是否对图像复原任务的目标图像应用数据变换算子 | `True` |
+
 
 ## `Resize`
 
 调整输入影像大小。
 
-    -如果' target_size '是int,将图像大小调整为(' target_size ', ' target_size ')`。
-    -如果' target_size '是一个列表或元组,将图像大小调整为' target_size '。
-    注意:如果' interp '为'RANDOM',则插值方法将随机选择。
+- 如果`target_size`是int,将图像大小调整为`target_size` x `target_size`。
+- 如果`target_size`是一个列表或元组,将图像大小调整为`target_size`。
+
+注意:如果`interp`为`'RANDOM'`,则插值方法将随机选择。
 
-| 参数名                | 描述         | 默认值      |
+| 参数名称 (参数类型)                | 描述         | 默认值      |
 |--------------------|------------|----------|
-| `target_size (int \| list[int] \| tuple[int])` |目标大小。如果它是一个整数,目标高度和宽度都将被设置为`'target_size'`。否则,`'target_size'`表示[目标高度,目标宽度]|          |
-| `interp (str,可选)`  | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
-| `keep_ratio (bool,可选)` | 如果为`True`,宽度和高度的比例因子将被设置为相同的值,调整图像的高度/宽度将不大于目标宽度/高度 | `False`    |
+| `target_size (int \| list[int] \| tuple[int])` |缩放后图像尺寸。如果为整数,则高度和宽度都将被设置为`'target_size'`。否则,`'target_size'`表示[高度,宽度]|          |
+| `interp (str)`  | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
+| `keep_ratio (bool)` | 如果为`True`,宽度和高度的缩放因子将被设置为相同的值,且缩放后图像的高宽比将不大于由`target_size`计算得到的高宽比 | `False`    |
+
 
 ## `ResizeByLong`
 
 调整输入影像大小,保持纵横比不变(根据长边计算缩放系数)。
 
-| 参数名                                        | 描述        | 默认值      |
+| 参数名称 (参数类型)                                        | 描述        | 默认值      |
 |--------------------------------------------|-----------|----------|
-| `long_size (int)`|图像较长一侧的目标大小|          |
-| `interp (str,可选)`                    | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一   | `'LINEAR'` |
+| `long_size (int)`|缩放后图像长边长度|          |
+| `interp (str)`                    | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一   | `'LINEAR'` |
+
 
 ## `ResizeByShort`
 
 调整输入影像大小,保持纵横比不变(根据短边计算缩放系数)。
 
-| 参数名                   | 描述        | 默认值      |
+| 参数名称 (参数类型)                   | 描述        | 默认值      |
 |-----------------------|-----------|----------|
-| `short_size (int)`    |图像较短一侧的目标大小|          |
-| `mamax_size (int,可选)` | 图像长边的上界。如果`'max_size'`为`-1`,则不应用上限  | `-1`       |
-| `interp (str,可选)`      | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
+| `short_size (int)`    |缩放后图像短边长度|          |
+| `max_size (int)` | 图像长边的上界。如果`'max_size'`为`-1`,则不设置上限  | `-1`       |
+| `interp (str)`      | 调整图像大小的插值方法。{`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}之一 | `'LINEAR'` |
+
 
 ## `SelectBand`
 
 对输入影像进行波段选择。
 
-| 参数名              | 描述               | 默认值      |
+| 参数名称 (参数类型)              | 描述               | 默认值      |
 |------------------|------------------|----------|
-| `band_list (list,可选)` | 要选择的波段(波段索引从1开始) | `[1,2,3]`  |
-| `apply_to_tar (bool,可选)`| 是否将转换应用到目标图像     | `True`     |
+| `band_list (list)` | 要选择的波段(波段索引从1开始) | `[1, 2, 3]`  |
+| `apply_to_tar (bool)`| 是否对图像复原任务的目标图像应用数据变换算子     | `True`     |

+ 129 - 116
docs/intro/transforms_cons_params_en.md

@@ -2,257 +2,270 @@
 
 # PaddleRS Data Transformation Operator Construction Parameters
 
-This document describes the parameters of each PaddleRS data transformation operator in detail, including the operator name, operator purpose, parameter name, parameter type, parameter meaning, and parameter default value of each operator.
+This document describes the construction parameters of each PaddleRS data transformation operator, including the operator name, the operator purpose, and the name, type, meaning, and default value of each parameter.
 
-You can check all data transformation operators supported by PaddleRS [here](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/intro/transforms_en.md).
+You can check all data transformation operators supported by PaddleRS [here](../intro/transforms_en.md).
 
 ## `AppendIndex`
 
 Append remote sensing index to input image(s).
 
-| Parameter Name             | Description                                                                                                                                        | Default Value       |
+| Parameter Name (Parameter Type)             | Description                                                                                                                                        | Default Value       |
 |-----------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-----------|
-|`index_type (str)`| Type of remote sensinng index. See supported index types in https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py . |           |
-|`band_indexes (dict, optional)`|Mapping of band names to band indices (starting from 1). See supported band names in  https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py                                         | `None`      |
-|`satellite (str, optional)`|Type of satellite. If set, band indices will be automatically determined accordingly. See supported satellites in https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/satellites.py                             | `None`      |
+|`index_type (str)`| Type of remote sensing index. See supported index types in https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py |           |
+|`band_indexes (dict)`|Mapping of band names to band indices (starting from 1). See supported band names in  https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/indices.py                                         | `None`      |
+|`satellite (str)`|Type of satellite. If set, band indices will be automatically determined accordingly. See supported satellites in https://github.com/PaddlePaddle/PaddleRS/tree/develop/paddlers/transforms/satellites.py                             | `None`      |
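
A hedged example (the index name `'NDVI'` and the band-name keys are illustrative; consult the `indices.py` file linked above for the names PaddleRS actually supports):

```python
import paddlers.transforms as T

# Append an NDVI band computed from the red and NIR bands.
append_index = T.AppendIndex(
    index_type='NDVI',
    band_indexes={'R': 3, 'N': 4},  # band indices start from 1
)
```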
 
 
 ## `CenterCrop`
 
-+ Crop the input image(s) at the center.
-  - 1. Locate the center of the image.
-  - 2. Crop the image.
+Crop the input image(s) at the center.
 
+1. Locate the center of the image.
+2. Crop the image.
 
-| Parameter Name             | Description                                                                                                       | Default Value  |
+| Parameter Name (Parameter Type)             | Description                                                                                                       | Default Value  |
 |-----------------|----------------------------------------------------------------------------------------------------------|------|
-|`crop_size (int, optional)`| Target size of the cropped image(s)  | `224`  |
+|`crop_size (int)`| Target size of the cropped image(s)  | `224`  |
 
-## `Dehaze`
 
- Dehaze input image(s)
+## `Dehaze`
 
+Dehaze input image(s).
 
-| Parameter Name             | Description                                   | Default Value   |
+| Parameter Name (Parameter Type)             | Description                                   | Default Value   |
 |-----------------|---------------------------------------------------|-------|
-|`gamma (bool, optional)`| Use gamma correction or not  | `False` |
+|`gamma (bool)`| Use gamma correction or not  | `False` |
+
 
 ## `MatchRadiance`
 
 Perform relative radiometric correction between bi-temporal images.
 
-| Parameter Name             | Description                                                                                                                                                                                                                                                                 | Default Value |
+| Parameter Name (Parameter Type)             | Description                                                                                                                                                                                                                                                                 | Default Value |
 |-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
-|`method (str, optional)`| Method used to match the radiance of the bi-temporal images. Choices are {`'hist'`, `'lsr'`, `'fft'`}. `'hist'` stands for histogram matching, `'lsr'` stands for least-squares regression, and `'fft'` replaces the low-frequency components of the image to match the reference image. | `'hist'` |
+|`method (str)`| Method used to match the radiance of the bi-temporal images. Choices are {`'hist'`, `'lsr'`, `'fft'`}. `'hist'` stands for histogram matching, `'lsr'` stands for least-squares regression, and `'fft'` replaces the low-frequency components of the image to match the reference image | `'hist'` |
 
 
 ## `MixupImage`
 
-Mixup two images and their gt_bbbox/gt_score.
+Mixup two images and their gt_bbox/gt_score.
 
-| Parameter Name             | Description                                     | Default Value |
+| Parameter Name (Parameter Type)             | Description                                     | Default Value |
 |-----------------|-----------------------------------------------------|-----|
-|`alpha (float, optional)`| Alpha parameter of beta distribution. | `1.5` |
-|`beta (float, optional)` |Beta parameter of beta distribution. | `1.5` |
+|`alpha (float)`|Alpha parameter of beta distribution | `1.5` |
+|`beta (float)` |Beta parameter of beta distribution | `1.5` |
 
-## `Normalize`
 
-+ Apply normalization to the input image(s). The normalization steps are:
+## `Normalize`
 
-  - 1. im = (im - min_value) * 1 / (max_value - min_value)
-  - 2. im = im - mean
-  - 3. im = im / std
+Apply normalization to the input image(s). The normalization steps are:
 
+1. im = (im - min_value) * 1 / (max_value - min_value)
+2. im = im - mean
+3. im = im / std
 
-| Parameter Name      | Description                                                              | Default Value                          |
+| Parameter Name (Parameter Type)      | Description                                                              | Default Value                          |
 |---------------------|--------------------------------------------------------------------------|------------------------------|
-| `mean (list[float] \| tuple[float], optional)`  | Mean of input image(s)                                                   | `[0.485,0.456,0.406]` |
-| `std (list[float] \| tuple[float], optional)`   | Standard deviation of input image(s)                                     | `[0.229,0.224,0.225]` |
-| `min_val (list[float] \| tuple[float], optional)` | Inimum value of input image(s). If `None`, use `0` for all channels.     |    `None`      |
-| `max_val (list[float] \| tuple[float], optional)` | Maximum value of input image(s). If `None`, use `255`. for all channels. |  `None`        |
-| `apply_to_tar (bool, optional)` \| Whether to apply transformation to the target image                      | `True`                         |
+| `mean (list[float] \| tuple[float])`  | Mean of input image(s)                                                   | `[0.485,0.456,0.406]` |
+| `std (list[float] \| tuple[float])`   | Standard deviation of input image(s)                                     | `[0.229,0.224,0.225]` |
+| `min_val (list[float] \| tuple[float])` | Minimum value of input image(s). If `None`, use `0` for all channels     |    `None`      |
+| `max_val (list[float] \| tuple[float])` | Maximum value of input image(s). If `None`, use `255` for all channels |  `None`        |
+| `apply_to_tar (bool)` | Whether to apply the transformation to the target image                      | `True`                         |
+
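
With 8-bit imagery, the defaults reduce the three steps above to the usual ImageNet-style normalization:

```python
import paddlers.transforms as T

# min_val/max_val default to 0/255 per channel for uint8 inputs, so this
# reproduces the three normalization steps above with ImageNet statistics.
normalize = T.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
)
```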
 
 ## `Pad`
 
 Pad image to a specified size or multiple of `size_divisor`.
 
-| Parameter Name           | Description                                                                                                                                                                                                          | Default Value              |
+| Parameter Name (Parameter Type)           | Description                                                                                                                                                                                                          | Default Value              |
 |--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|
-| `target_size (list[int] \| tuple[int], optional)`     | Image target size, if `None`, pad to multiple of size_divisor.                                                                                                                                                         | `None`               |
-| `pad_mode (int, optional)` | Currently only four modes are supported:[-1, 0, 1, 2]. if `-1`, use specified offsets. If `0`, only pad to right and bottom If `1`, pad according to center. If `2`, only pad left and top.   | `0`                  |
-| `offset (list[int] \| None, optional)`                |  Padding offsets.                                                                                                                                                                                                              | `None`               |
-| `im_padding_value (list[float] \| tuple[float])` | RGB value of padded area.                                                                                                                                                                                                        | `(127.5,127.5,127.5)` |
-| `label_padding_value (int, optional)` |Filling value for the mask.                                                                                                                                                                                                              | `255`                  |
-| `size_divisor (int)`     | Image width and height after padding will be a multiple of `size_divisor`.                                                                                                                                                                       |                      |
+| `target_size (list[int] \| tuple[int])`     | Image target size. If `None`, pad to a multiple of `size_divisor`                                                                                                                                                        | `None`               |
+| `pad_mode (int)` | Currently only four modes are supported: [-1, 0, 1, 2]. If `-1`, use specified offsets. If `0`, only pad to right and bottom. If `1`, pad according to center. If `2`, only pad left and top   | `0`                  |
+| `offsets (list[int] \| None)`                |  Padding offsets                                                                                                                                                                                                              | `None`               |
+| `im_padding_value (list[float] \| tuple[float])` | RGB value of padded area                                                                                                                                                                                                        | `(127.5,127.5,127.5)` |
+| `label_padding_value (int)` |Filling value for the mask                                                                                                                                                                                                              | `255`                  |
+| `size_divisor (int)`     | Image width and height after padding will be a multiple of `size_divisor`                                                                                                                                                                       |                      |
+
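+The four pad modes determine where the original image sits inside the padded canvas. A rough sketch of the offset logic (the helper name and `(x, y)` convention are illustrative):
+
+```python
+def compute_offsets(pad_mode, h, w, target_h, target_w, offsets=None):
+    # Returns the (x, y) position of the original image in the padded canvas.
+    if pad_mode == -1:                   # use the user-specified offsets
+        return offsets
+    if pad_mode == 0:                    # pad right and bottom only
+        return (0, 0)
+    if pad_mode == 1:                    # center the image
+        return ((target_w - w) // 2, (target_h - h) // 2)
+    return (target_w - w, target_h - h)  # pad_mode == 2: pad left and top
+```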
 
 ## `RandomBlur`
 
 Randomly blur input image(s).
 
-| Parameter Name             | Description                                     | Default Value  |
+| Parameter Name (Parameter Type)             | Description                                     | Default Value  |
 |-----------------|-----------------------------------------------------|------|
-|`probb (float)`|Probability of blurring. |      |
+|`prob (float)`|Probability of blurring |      |
+
 
 ## `RandomCrop`
 
-+ Randomly crop the input.
+Randomly crop the input.
 
-  - 1. Compute the height and width of cropped area according to `aspect_ratio` and
-          `scaling`.
-  - 2. Locate the upper left corner of cropped area randomly.
-  - 3. Crop the image(s).
-  - 4. Resize the cropped area to `crop_size` x `crop_size`.
+1. Compute the height and width of the cropped area according to `aspect_ratio` and `scaling`.
+2. Locate the upper left corner of cropped area randomly.
+3. Crop the image(s).
+4. Resize the cropped area to `crop_size` x `crop_size`.
 
-| Parameter Name   | Description                                                                   | Default Value                     |
+| Parameter Name (Parameter Type)   | Description                                                                   | Default Value                     |
 |------------------|-------------------------------------------------------------------------------|-------------------------|
-| `crop_size (int \| list[int] \| tuple[int])` | Target size of the cropped area. If `None`, the cropped area will not be resized. | `None`                    |
-| `aspect_ratio (list[float], optional)` | Aspect ratio of cropped region in [min, max] format.                          | `[.5, 2.]`                |
-| `thresholds (list[float], optional)` | IoU thresholds to decide a valid bbox crop.                                   | `[.0,.1,  .3,  .5,  .7,  .9]` |
-| `scaling (list[float], optional)` | Ratio between the cropped region and the original image in [min, max] format. | `[.3, 1.]`                |
-| `num_attempts (int, optional)` | Max number of tries before giving up.                                         | `50`                      |
-| `allow_no_crop (bool, optional)` | Whether returning without doing crop is allowed.                              | `True`                    |
-| `cover_all_box (bool, optional)` | Whether to force to cover the entire target box.                              | `False`                   |
+| `crop_size (int \| list[int] \| tuple[int])` | Target size of the cropped area. If `None`, the cropped area will not be resized | `None`                    |
+| `aspect_ratio (list[float])` | Aspect ratio of cropped region in [min, max] format                          | `[.5, 2.]`                |
+| `thresholds (list[float])` | IoU thresholds to decide a valid bbox crop                                   | `[.0, .1, .3, .5, .7, .9]` |
+| `scaling (list[float])` | Ratio between the cropped region and the original image in [min, max] format | `[.3, 1.]`                |
+| `num_attempts (int)` | Max number of tries before giving up                                         | `50`                      |
+| `allow_no_crop (bool)` | Whether returning without doing crop is allowed                              | `True`                    |
+| `cover_all_box (bool)` | Whether to force to cover the entire target box                              | `False`                   |
+
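+A sketch of steps 1 and 2 in one common formulation (illustrative; the actual operator additionally validates candidate crops against `thresholds`, `num_attempts`, and `cover_all_box`):
+
+```python
+import numpy as np
+
+def sample_crop(h, w, scaling=(.3, 1.), aspect_ratio=(.5, 2.)):
+    # Step 1: sample an area scale and an aspect ratio, derive the crop size.
+    scale = np.random.uniform(*scaling)
+    ar = np.random.uniform(*aspect_ratio)
+    crop_h = min(h, int(h * scale / np.sqrt(ar)))
+    crop_w = min(w, int(w * scale * np.sqrt(ar)))
+    # Step 2: place the upper left corner randomly.
+    y0 = np.random.randint(0, h - crop_h + 1)
+    x0 = np.random.randint(0, w - crop_w + 1)
+    return x0, y0, crop_w, crop_h
+```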
 
 ## `RandomDistort`
 
 Random color distortion.
 
-| Parameter Name                       | Description                                                     | Default Value   |
+| Parameter Name (Parameter Type)                       | Description                                                     | Default Value   |
 |----------------------------|-----------------------------------------------------------------|-------|
-| `brightness_range (float, optional)` | Range of brightness distortion.                                 | `.5`    |
-| `brightness_prob (float, optional)` | Probability of brightness distortion.                           | `.5`    |
-| `contrast_range (float, optional)` | Range of contrast distortion.                                   | `.5`    |
-| `contrast_prob (float, optional)` | Probability of contrast distortion.                             | `.5`    |
-| `saturation_range (float,optional)` | Range of saturation distortion.                                 | `.5`    |
-| `saturation_prob (float, optional)` | Probability of saturation distortion.                           | `.5`    |
-| `hue_range (float, optional)` | Range of hue distortion.                                        | `.5`    |
-| `hue_probb (float, optional)`| Probability of hue distortion.                                  | `.5`    |
-| `random_apply (bool, optional)` | Apply the transformation in random (yolo) or fixed (SSD) order. | `True`  |
-| `count (int, optional)`  | Count used to control the distortion                | `4`     |
-| `shuffle_channel (bool, optional)` | Whether to swap channels randomly.                                           | `False` |
+| `brightness_range (float)` | Range of brightness distortion                                 | `.5`    |
+| `brightness_prob (float)` | Probability of brightness distortion                           | `.5`    |
+| `contrast_range (float)` | Range of contrast distortion                                   | `.5`    |
+| `contrast_prob (float)` | Probability of contrast distortion                             | `.5`    |
+| `saturation_range (float)` | Range of saturation distortion                                 | `.5`    |
+| `saturation_prob (float)` | Probability of saturation distortion                           | `.5`    |
+| `hue_range (float)` | Range of hue distortion                                        | `.5`    |
+| `hue_prob (float)`| Probability of hue distortion                                  | `.5`    |
+| `random_apply (bool)` | Apply the transformation in random (YOLO) or fixed (SSD) order | `True`  |
+| `count (int)`  | Number of distortions to apply                | `4`     |
+| `shuffle_channel (bool)` | Whether to permute channels randomly                                           | `False` |
 
 
 ## `RandomExpand`
 
 Randomly expand the input by padding according to random offsets.
 
-| Parameter Name                  | Description                                    | Default Value                 |
+| Parameter Name (Parameter Type)                  | Description                                    | Default Value                 |
 |---------------------------------|----------------------------------------------------|---------------------|
-| `upper_ratio (float, optional)`  | Maximum ratio to which the original image is expanded. | `4`                   |
-| `probb (float, optional)`        |Probability of apply expanding. | `.5`                  |
-| `im_padding_value (list[float] \| tuple[float], optional)` |  RGB filling value for the image  | `(127.5,127.5,127.5)` |
-| `label_padding_value (int, optional)` | Filling value for the mask.  | `255`    |
+| `upper_ratio (float)`  | Maximum ratio to which the original image is expanded | `4`                   |
+| `prob (float)`        |Probability of expanding | `.5`                  |
+| `im_padding_value (list[float] \| tuple[float])` |  RGB filling value for the image  | `(127.5,127.5,127.5)` |
+| `label_padding_value (int)` | Filling value for the mask  | `255`    |
+
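+A minimal sketch of the expansion, assuming a 3-band `float32` image (names are illustrative):
+
+```python
+import numpy as np
+
+def expand(im, upper_ratio=4., im_padding_value=(127.5, 127.5, 127.5)):
+    h, w = im.shape[:2]
+    ratio = np.random.uniform(1., upper_ratio)      # expansion ratio
+    H, W = int(h * ratio), int(w * ratio)
+    canvas = np.full((H, W, im.shape[-1]), im_padding_value, dtype='float32')
+    y0 = np.random.randint(0, H - h + 1)            # random offsets
+    x0 = np.random.randint(0, W - w + 1)
+    canvas[y0:y0 + h, x0:x0 + w] = im               # paste the original image
+    return canvas
+```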
 
 ## `RandomHorizontalFlip`
 
 Randomly flip the input horizontally.
 
-| Parameter Name                                              | Description        | Default Value                |
+| Parameter Name (Parameter Type)                                              | Description        | Default Value                |
 |--------------------------------------------------|-----------|---------------------|
-| `probb (float, optional)`                           | Probability of flipping the input   | `.5`                  |
+| `prob (float)`                           | Probability of flipping the input   | `.5`                  |
+
 
 ## `RandomResize`
 
 Resize input to random sizes.
 
-+ Attention: If `interp` is 'RANDOM', the interpolation method will be chosen randomly.
+Attention: If `interp` is `'RANDOM'`, the interpolation method will be chosen randomly.
 
-| Parameter Name            | Description                                                          | Default Value                 |
+| Parameter Name (Parameter Type)            | Description                                                          | Default Value                 |
 |---------------------------|----------------------------------------------------------------------|---------------------|
-| `Target_sizes (list[int] \| list[list \| tuple] \| tuple [list \| tuple])` | Multiple target sizes, each of which should be int, list, or tuple.  | `.5`                  |
-| `interp (str, optional)`   | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}. |   `'LINEAR'`                  ||
+| `target_sizes (list[int] \| list[list \| tuple] \| tuple[list \| tuple])` | Multiple target sizes, each of which should be int, list, or tuple  |                   |
+| `interp (str)`   | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`} |   `'LINEAR'`                  |
 
 
 ## `RandomResizeByShort`
 
 Resize input to random sizes while keeping the aspect ratio.
 
-+ Attention: If `interp` is 'RANDOM', the interpolation method will be chosen randomly.
+Attention: If `interp` is `'RANDOM'`, the interpolation method will be chosen randomly.
 
-| Parameter Name     | Description        | Default Value |
+| Parameter Name (Parameter Type)     | Description        | Default Value |
 |--------------------|-----------|-----|
-| `short_sizes (int \| list[int])` | Target size of the shorter side of the image(s).| `.5`  |
-| `max_size (int, optional)` |Upper bound of longer side of the image(s). If `max_size` is -1, no upper bound will be applied.    | `-1`  |
-| `interp (str, optional)` |  Interpolation method for resizing image(s). One of {'`NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}.  | `'LINEAR'`    |
+| `short_sizes (int \| list[int])` | Target size of the shorter side of the image(s)|   |
+| `max_size (int)` |Upper bound of the longer side of the image(s). If `max_size` is -1, no upper bound will be applied    | `-1`  |
+| `interp (str)` |  Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}  | `'LINEAR'`    |
+
 
 ## `RandomScaleAspect`
 
 Crop input image(s) and resize back to original sizes.
 
-
-| Parameter Name                                                               | Description                                                                                          | Default Value    |
+| Parameter Name (Parameter Type)                                                               | Description                                                                                          | Default Value    |
 |-------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|--------|
-| `min_scale (float)`| Minimum ratio between the cropped region and the original image. If `0`, image(s) will not be cropped. | `0`      |
-| `aspect_ratio (float)`    | Aspect ratio of cropped region.                                                                                 | `.33`    |
+| `min_scale (float)`| Minimum ratio between the cropped region and the original image. If `0`, image(s) will not be cropped | `0`      |
+| `aspect_ratio (float)`    | Aspect ratio of cropped region                                                                                 | `.33`    |
+
 
 ## `RandomSwap`
 
 Randomly swap multi-temporal images.
 
-
-| Parameter Name                                                               | Description        | Default Value |
+| Parameter Name (Parameter Type)                                                               | Description        | Default Value |
 |-------------------------------------------------------------------|-----------|-----|
-|`probb (float, optional)`| Probability of swapping the input images.| `0.2` |
+|`prob (float)`| Probability of swapping the input images| `0.2` |
+
 
 ## `RandomVerticalFlip`
-Randomly flip the input vertically.
 
+Randomly flip the input vertically.
 
-| Parameter Name                                                              | Description        | Default Value |
+| Parameter Name (Parameter Type)                                                              | Description        | Default Value |
 |------------------------------------------------------------------|-----------|-----|
-|`prob (float, optional)`| Probability of flipping the input| `.5`  |
+|`prob (float)`| Probability of flipping the input| `.5`  |
 
 
 ## `ReduceDim`
+
 Use PCA to reduce the dimension of input image(s).
 
-| Parameter Name                                                               | Description                                          | Default Value  |
+| Parameter Name (Parameter Type)                                                               | Description                                          | Default Value  |
 |-------------------------------------------------------------------|------------------------------------------------------|------|
 |`joblib_path (str)`| Path of *.joblib file of PCA                         |      |
-|`apply_to_tar (bool, optional)` | Whether to apply transformation to the target image. | `True` |
+|`apply_to_tar (bool)` | Whether to apply transformation to the target image | `True` |
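+
+A rough sketch of the per-pixel projection, assuming the `*.joblib` file stores a fitted scikit-learn PCA model (the path and image below are illustrative):
+
+```python
+import joblib
+import numpy as np
+
+im = np.random.rand(64, 64, 10).astype('float32')        # dummy 10-band image
+pca = joblib.load('pca.joblib')                          # illustrative path
+h, w, c = im.shape
+im = pca.transform(im.reshape(-1, c)).reshape(h, w, -1)  # project each pixel
+```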
 
 
 ## `Resize`
+
 Resize input.
 
-    - If `target_size` is an int, resize the image(s) to (`target_size`, `target_size`).
-    - If `target_size` is a list or tuple, resize the image(s) to `target_size`.
-    Attention: If `interp` is 'RANDOM', the interpolation method will be chosen randomly.
+- If `target_size` is an int, resize the image(s) to `target_size` x `target_size`.
+- If `target_size` is a list or tuple, resize the image(s) to `target_size`.
+
+Attention: If `interp` is `'RANDOM'`, the interpolation method will be chosen randomly.
 
-| Parameter Name     | Description                                                                                                                                                          | Default Value      |
+| Parameter Name (Parameter Type)     | Description                                                                                                                                                          | Default Value      |
 |--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
-| `target_size (int \| list[int] \| tuple[int])` | Target size. If it is an integer, the target height and width will be both set to `target_size`. Otherwise,  `target_size` represents [target height, target width]. |          |
-| `interp (str, optional)` | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}.                                         | `'LINEAR'` |
-| `keep_ratio (bool, optional)` | If `True`, the scaling factor of width and height will be set to same value, and height/width of the resized image will be not  greater than the target width/height. | `False`    |
+| `target_size (int \| list[int] \| tuple[int])` | Target size. If it is an integer, the target height and width will be both set to `target_size`. Otherwise,  `target_size` represents [target height, target width] |          |
+| `interp (str)` | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}                                         | `'LINEAR'` |
+| `keep_ratio (bool)` | If `True`, the scaling factors of width and height will be set to the same value, and the height/width of the resized image will be no greater than the target width/height | `False`    |
+
 
 ## `ResizeByLong`
-Resize the input image, keeping the aspect ratio unchanged (calculate the scaling factor based on the long side).
 
-    Attention: If `interp` is 'RANDOM', the interpolation method will be chosen randomly.
+Resize the input image, keeping the aspect ratio unchanged (calculate the scaling factor based on the long side).
 
+Attention: If `interp` is `'RANDOM'`, the interpolation method will be chosen randomly.
 
-| Parameter Name                                        | Description        | Default Value      |
+| Parameter Name (Parameter Type)                                        | Description        | Default Value      |
 |--------------------------------------------|-----------|----------|
-| `long_size (int)`|The size of the target on the longer side of the image.|          |
-| `interp (str, optional)`                    | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}.  | `'LINEAR'` |
+| `long_size (int)`|Target size of the longer side of the image|          |
+| `interp (str)`                    | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}  | `'LINEAR'` |
+
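+Equivalently, the scale factor is taken from the longer side. A minimal sketch (illustrative helper):
+
+```python
+def scale_by_long(h, w, long_size):
+    scale = long_size / max(h, w)   # scaling factor from the longer side
+    return int(round(h * scale)), int(round(w * scale))
+```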
 
 ## `ResizeByShort`
-Resize input while keeping the aspect ratio.
 
-    Attention: If `interp` is 'RANDOM', the interpolation method will be chosen randomly.
+Resize input while keeping the aspect ratio.
 
+Attention: If `interp` is `'RANDOM'`, the interpolation method will be chosen randomly.
 
-| Parameter Name              | Description                                                                                      | Default Value      |
+| Parameter Name (Parameter Type)              | Description                                                                                      | Default Value      |
 |------------------|--------------------------------------------------------------------------------------------------|----------|
-| `short_size (int)` | Target size of the shorter side of the image(s).                                                 |          |
-| `mamax_size (int, optional)` | Upper bound of longer side of the image(s). If `max_size` is -1, no upper bound will be applied. | `-1`       |
-| `interp (str, optional)`  | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}.          | `'LINEAR'` |
+| `short_size (int)` | Target size of the shorter side of the image(s)                                                 |          |
+| `max_size (int)` | Upper bound of the longer side of the image(s). If `max_size` is -1, no upper bound will be applied | `-1`       |
+| `interp (str)`  | Interpolation method for resizing image(s). One of {`'NEAREST'`, `'LINEAR'`, `'CUBIC'`, `'AREA'`, `'LANCZOS4'`, `'RANDOM'`}          | `'LINEAR'` |
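+
+Here the scale factor comes from the shorter side and is optionally capped so that the longer side does not exceed `max_size`. A minimal sketch (illustrative helper):
+
+```python
+def scale_by_short(h, w, short_size, max_size=-1):
+    scale = short_size / min(h, w)      # scaling factor from the shorter side
+    if max_size > 0 and round(scale * max(h, w)) > max_size:
+        scale = max_size / max(h, w)    # cap the longer side at max_size
+    return int(round(h * scale)), int(round(w * scale))
+```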
 
 
 ## `SelectBand`
+
 Select a set of bands of input image(s).
 
-| Parameter Name              | Description                                          | Default Value      |
+| Parameter Name (Parameter Type)              | Description                                          | Default Value      |
 |------------------|------------------------------------------------------|----------|
-| `band_list (list, optional)` | Bands to select (band index starts from 1).          | `[1,2,3]`  |
-| `apply_to_tar (bool, optional)`| Whether to apply transformation to the target image. | `True`     |
+| `band_list (list)` | Bands to select (band index starts from 1)          | `[1, 2, 3]`  |
+| `apply_to_tar (bool)`| Whether to apply transformation to the target image | `True`     |
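+
+Band selection simply indexes the channel axis; note that band indices start from 1. An illustrative NumPy sketch:
+
+```python
+import numpy as np
+
+im = np.random.rand(64, 64, 10)            # dummy 10-band image
+band_list = [1, 2, 3]
+im = im[:, :, [b - 1 for b in band_list]]  # convert 1-based to 0-based indices
+```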

+ 1 - 1
docs/intro/transforms_en.md

@@ -34,7 +34,7 @@ PaddleRS has organically integrated the data preprocessing/data augmentation (co
 
 ## Combinatorial Operator
 
-During the model training process, it is often necessary to combine a variety of data preprocessing and data augmentation strategies. PaddleRS provides `paddlers.transforms.Compose` to easily combine multiple data transformation operators so that they can be executed serially. For the specific usage of the `paddlers.transforms.Compose` please see [API Description](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/apis/data_en.md).
+During the model training process, it is often necessary to combine a variety of data preprocessing and data augmentation strategies. PaddleRS provides `paddlers.transforms.Compose` to easily combine multiple data transformation operators so that they can be executed serially. For the specific usage of `paddlers.transforms.Compose`, please see the [API Description](../apis/data_en.md).
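+
+For example, a typical training pipeline chains several of the operators from the API reference (a minimal sketch; the operator arguments shown are illustrative, assuming the operators are exposed under `paddlers.transforms`):
+
+```python
+import paddlers.transforms as T
+
+train_transforms = T.Compose([
+    T.RandomHorizontalFlip(prob=.5),
+    T.Resize(target_size=512, interp='LINEAR'),
+    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
+])
+```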
 
 ## Operator Construction
 

+ 4 - 4
paddlers/transforms/operators.py

@@ -1286,7 +1286,7 @@ class RandomExpand(Transform):
     Args:
         upper_ratio (float, optional): Maximum ratio to which the original image
             is expanded. Defaults to 4..
-        prob (float, optional): Probability of apply expanding. Defaults to .5.
+        prob (float, optional): Probability of expanding. Defaults to .5.
         im_padding_value (list[float] | tuple[float], optional): RGB filling value
             for the image. Defaults to (127.5, 127.5, 127.5).
         label_padding_value (int, optional): Filling value for the mask.
@@ -1345,7 +1345,7 @@ class Pad(Transform):
             target_size (list[int] | tuple[int], optional): Image target size, if None, pad to
                 multiple of size_divisor. Defaults to None.
             pad_mode (int, optional): Pad mode. Currently only four modes are supported:
-                [-1, 0, 1, 2]. if -1, use specified offsets. If 0, only pad to right and bottom
+                [-1, 0, 1, 2]. If -1, use specified offsets. If 0, only pad to right and bottom.
                 If 1, pad according to center. If 2, only pad left and top. Defaults to 0.
             offsets (list[int]|None, optional): Padding offsets. Defaults to None.
             im_padding_value (list[float] | tuple[float]): RGB value of padded area.
@@ -1578,10 +1578,10 @@ class RandomDistort(Transform):
             Defaults to .5.
         hue_range (float, optional): Range of hue distortion. Defaults to .5.
         hue_prob (float, optional): Probability of hue distortion. Defaults to .5.
-        random_apply (bool, optional): Apply the transformation in random (yolo) or
+        random_apply (bool, optional): Apply the transformation in random (YOLO) or
             fixed (SSD) order. Defaults to True.
         count (int, optional): Number of distortions to apply. Defaults to 4.
-        shuffle_channel (bool, optional): Whether to swap channels randomly.
+        shuffle_channel (bool, optional): Whether to permute channels randomly.
             Defaults to False.
     """