
【Hackathon No.153】Add English Documentation (#113)

lizechng 2 years ago
parent
commit
d558df7e72

+ 129 - 0
docs/CONTRIBUTING_en.md

@@ -0,0 +1,129 @@
+# PaddleRS Contribution Guide
+
+## Contribute Code
+
+This guide first describes the necessary steps for contributing code to PaddleRS, and then goes into detail on self-checks for newly added files, the code style specification, and the testing steps.
+
+### 1 Code Contribution Steps
+
+PaddleRS uses [git](https://git-scm.com/doc) as its version control tool and is hosted on GitHub. This means that you need to be familiar with git before contributing code, as well as with the GitHub workflow based on [pull requests (PRs)](https://docs.github.com/cn/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).
+
+The steps to contribute code to PaddleRS are as follows:
+
+1. Fork the official PaddleRS repository on GitHub, clone the code locally, and pull the latest version of the develop branch.
+2. Write code according to [Dev Guide](dev/dev_guide.md) (it is recommended to develop on a new feature branch).
+3. Install pre-commit hooks to perform code style checks before each commit. Refer to [code style specification](#3-code-style-specification).
+4. Write unit tests for the new code and make sure all tests pass. Refer to [test related steps](#4-test-related-steps).
+5. Create a new PR for your branch and make sure the CLA is signed and the CI/CE checks pass. After that, a PaddleRS team member will review the code you contributed.
+6. Modify the code according to the review comments and resubmit it until the PR is merged or closed.
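+
+For reference, the first two steps typically translate into a command sequence like the following. This is only an illustrative sketch; `<your-username>` and the branch name are placeholders:
+
+```bash
+# Fork the official repository on GitHub first, then clone your fork locally
+git clone https://github.com/<your-username>/PaddleRS.git
+cd PaddleRS
+# Track the official repository and pull the latest develop branch
+git remote add upstream https://github.com/PaddlePaddle/PaddleRS.git
+git fetch upstream
+# Develop on a new feature branch based on develop
+git checkout -b my-feature upstream/develop
+```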
+
+If your contribution uses a third-party library that PaddleRS does not currently depend on, please point this out when you submit your PR and explain why the third-party library is needed.
+
+### 2 Self-Check on Added Files
+
+Unlike the code style specification, the rules described in this section are not enforced by pre-commit hooks, so developers have to check them on their own.
+
+#### 2.1 Copyright Information
+
+Copyright information must be added to each new file in PaddleRS, as shown below:
+
+```python
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+```
+
+*Note: The year in the copyright information should be updated to the current calendar year.*
+
+#### 2.2 The Order of Module Import
+
+All global import statements must be at the beginning of the module, right after the copyright information. Import packages or modules in the following order:
+
+1. Python standard libraries;
+2. Third-party libraries installed through package managers such as `pip` (note that `paddle` is a third-party library, but `paddlers` itself is not);
+3. `paddlers` and its subpackages and modules.
+
+There should be a blank line between import statements of different types. The file should not contain import statements for unused packages or modules. In addition, if the lengths of the import statements vary greatly, it is recommended to arrange them in ascending order of length. An example is shown below:
+
+```python
+import os
+
+import numpy as np
+import paddle.nn as nn
+import paddle.nn.functional as F
+
+import paddlers.transforms as T
+from paddlers.transforms import DecodeImg
+```
+
+### 3 Code Style Specification
+
+The code style specification of PaddleRS is basically the same as the [Google Python Style Guide](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules/), except that PaddleRS does not enforce type annotations (i.e. type hints; refer to [PEP 483](https://peps.python.org/pep-0483/) and [PEP 484](https://peps.python.org/pep-0484/)). Some of the important code style rules are:
+
+- Blank lines: Leave two blank lines between top-level definitions (such as top-level function or class definitions). Leave one blank line between the definitions of different methods within a class, and between the class name and the first method definition. Inside a function, add a blank line wherever there is a logical break.
+
+- Line length: No more than 80 characters per line (whether code or comment), especially for lines in a docstring.
+
+- Parentheses: Parentheses can be used for line continuation, but do not use unnecessary parentheses in `if` conditions.
+
+- Exceptions: Throw and catch exceptions with as specific an exception type as possible, and almost never use the base class `Exception` (unless the purpose is to catch any exception of any type).
+
+- Comments: All comments are written in English. All APIs provided to users must have docstrings with at least two sections, "API Function Description" and "API Parameters". Surround a docstring with three double quotes (`"""`). See the [Code Comment Specification](dev/docstring.md) for details on writing docstrings.
+
+- Naming: Identifiers of different kinds follow these case rules: module name: `module_name`; package name: `package_name`; class name: `ClassName`; method name: `method_name`; function name: `function_name`; name of a global constant (a variable whose value does not change while the program runs): `GLOBAL_CONSTANT_NAME`; global variable name: `global_var_name`; instance name: `instance_var_name`; function parameter name: `function_param_name`; local variable name: `local_var_name`.
+
+### 4 Test Related Steps
+
+To ensure code quality, you need to write unit test scripts for the new functional components. Please read the instructions below for writing unit tests according to the type of your contribution.
+
+#### 4.1 Model Unit Tests
+
+1. Find the test case definition file corresponding to the task of the model in `tests/rs_models/`; for example, the change detection task corresponds to `tests/rs_models/test_cd_models.py`.
+2. Following the examples already in the file, define a test class for the new model that inherits from `Test{task name}Model` and set its `MODEL_CLASS` attribute to the new model.
+3. Override the new test class's `test_specs()` method, as shown in the sketch below. This method sets `self.specs` to a list, where each item is a dictionary whose key-value pairs are used as configuration items for the model constructor. That is, each item in `self.specs` corresponds to one group of test cases, each of which tests the model constructed with a particular set of parameters.
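+
+A minimal sketch of what such a test class might look like is shown below. The model name `MyNet` and its constructor parameters are hypothetical; the base class name follows the `Test{task name}Model` convention described above, with `TestCDModel` assumed to be the base class in `tests/rs_models/test_cd_models.py`:
+
+```python
+import paddlers
+
+
+class TestMyNetModel(TestCDModel):
+    # Point the test harness at the new model class (hypothetical name)
+    MODEL_CLASS = paddlers.rs_models.cd.MyNet
+
+    def test_specs(self):
+        # Each dictionary is one group of constructor arguments to test
+        self.specs = [
+            dict(in_channels=3, num_classes=2),
+            dict(in_channels=6, num_classes=2),
+        ]
+```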
+
+#### 4.2 Data Preprocessing/Data Augmentation Unit Tests
+
+- If the data preprocessing/augmentation operator you wrote (inheriting from `paddlers.transforms.operators.Transform`) has default values for all of its constructor parameters and can handle any task and data with an arbitrary number of bands, add a new method to the `TestTransform` class in `tests/transforms/test_operators.py`, modeled on the `test_Resize()` or `test_RandomFlipOrRotate()` methods.
+- If you wrote an operator that only supports a specific task or places requirements on the number of bands of the input data, bind an `_InputFilter` to the operator in the `OP2FILTER` global variable after writing the test logic.
+- If you wrote a data preprocessing/data augmentation function (i.e. in `paddlers/transforms/functions.py`), add a test class in `tests/transforms/test_functions.py` mimicking the existing examples.
+
+#### 4.3 Tool Unit Tests
+
+1. Create a new file in the `tests/tools/` directory and name it `test_{tool name}.py`.
+2. Write the test case in the newly created script.
+
+#### 4.4 Execute the Test
+
+After adding the test cases, you need to run all of the tests in their entirety (because the new code may affect existing code of the project and break some functionality). Enter the following commands:
+
+```bash
+cd tests
+bash run_tests.sh
+```
+
+This process can be time-consuming and requires patience. If some of the test cases do not pass, fix the code based on the error messages until all of them pass.
+
+Run the following script to obtain coverage information:
+
+```bash
+bash check_coverage.sh
+```
+
+#### 4.5 TIPC
+
+If your contribution includes TIPC, please submit your PR with a log indicating successful execution of the TIPC basic training chain.
+
+## Contribute an Example
+
+tbd

+ 1 - 0
docs/README_en.md

@@ -0,0 +1 @@
+# PaddleRS Document

+ 205 - 0
docs/apis/data_en.md

@@ -0,0 +1,205 @@
+# Data Related API Description
+
+## Dataset
+
+In PaddleRS, all datasets inherit from the parent class [`BaseDataset`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/base.py).
+
+### Change Detection Dataset `CDDataset`
+
+`CDDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/cd_dataset.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`file_list`|`str`|File list path. The file list is a text file, in which each line contains the path information of one sample. The specific requirements of `CDDataset` on the file list are listed below.||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`label_list`|`str` \| `None`|Label list file. The label list is a text file, in which each line contains the name of one class.|`None`|
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+|`with_seg_labels`|`bool`|Specify this option as `True` when the dataset contains segmentation labels for each phase.|`False`|
+|`binarize_labels`|`bool`|If `True`, the change labels (and the segmentation labels, if any) are binarized after all data transformation operators except `Arrange` are applied. For example, labels valued in {0, 255} are binarized to {0, 1}.|`False`|
+
+The requirements of `CDDataset` for the file list are as follows:
+
+- If `with_seg_labels` is `False`, each line in the file list should contain three space-separated items representing, in turn, the path of the first-phase image, the path of the second-phase image, and the path of the change label. Each path should be relative to `data_dir`.
+- If `with_seg_labels` is `True`, each line in the file list should contain five space-separated items, the first three of which have the same meaning as in the case where `with_seg_labels` is `False`, while the last two represent the paths of the segmentation labels of the first- and second-phase images (also relative to `data_dir`).
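+
+A hedged construction example is given below. The directory layout and file names are hypothetical, and `T.ArrangeChangeDetector` is assumed to be the `Arrange` operator for change detection:
+
+```python
+import paddlers.transforms as T
+from paddlers.datasets import CDDataset
+
+# Each line of train.txt is expected to look like (paths relative to data_dir):
+# A/0001.png B/0001.png label/0001.png
+train_transforms = T.Compose([
+    T.DecodeImg(),
+    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+    T.ArrangeChangeDetector('train'),
+])
+
+train_dataset = CDDataset(
+    data_dir='data/my_cd_dataset',
+    file_list='data/my_cd_dataset/train.txt',
+    transforms=train_transforms,
+    shuffle=True,
+    binarize_labels=True)
+```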
+
+### Scene Classification Dataset `ClasDataset`
+
+`ClasDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/clas_dataset.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`file_list`|`str`|File list path. The file list is a text file, in which each line contains the path information of one sample. The specific requirements of `ClasDataset` on the file list are listed below.||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`label_list`|`str` \| `None`|Label list file. The label list is a text file, in which each line contains the name of one class.|`None`|
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+
+The requirements of `ClasDataset` for the file list are as follows:
+
+- Each line in the file list should contain two space-separated items representing, in turn, the path of the input image relative to `data_dir` and the category ID of the image (which can be parsed as an integer value), as in the example below.
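+
+For example, a `ClasDataset` file list with hypothetical paths might look like this:
+
+```
+images/beach_001.png 0
+images/forest_001.png 1
+images/city_001.png 2
+```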
+
+### COCO Format Object Detection Dataset `COCODetDataset`
+
+`COCODetDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/coco.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`image_dir`|`str`|Directory of input images.||
+|`ann_path`|`str`|Path of the annotation file in [COCO format](https://cocodataset.org/#home).||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`label_list`|`str` \| `None`|Label list file. The label list is a text file, in which each line contains the name of one class.|`None`|
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
+|`empty_ratio`|`float`|Negative sample ratio. Takes effect only if `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to 1, all generated negative samples are retained.|`1.0`|
+
+### VOC Format Object Detection Dataset `VOCDetDataset`
+
+`VOCDetDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/voc.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`file_list`|`str`|File list path. The file list is a text file, in which each line contains the path information of one sample. The specific requirements of `VOCDetDataset` on the file list are listed below.||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`label_list`|`str` \| `None`|Label list file. The label list is a text file, in which each line contains the name of one class.|`None`|
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
+|`empty_ratio`|`float`|Negative sample ratio. Takes effect only if `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to 1, all generated negative samples are retained.|`1.0`|
+
+The requirements of `VOCDetDataset` for the file list are as follows:
+
+- Each line in the file list should contain two space-separated items representing, in turn, the path of the input image relative to `data_dir` and the path of the [Pascal VOC format](http://host.robots.ox.ac.uk/pascal/VOC/) annotation file relative to `data_dir`, as in the example below.
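+
+For example, a `VOCDetDataset` file list with hypothetical paths might look like this:
+
+```
+JPEGImages/0001.jpg Annotations/0001.xml
+JPEGImages/0002.jpg Annotations/0002.xml
+```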
+
+### Image Restoration Dataset `ResDataset`
+
+`ResDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/res_dataset.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`file_list`|`str`|File list path. The file list is a text file, in which each line contains the path information of one sample. The specific requirements of `ResDataset` on the file list are listed below.||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+|`sr_factor`|`int` \| `None`|Scaling factor for super-resolution reconstruction tasks. For other tasks, set `sr_factor` to `None`.|`None`|
+
+The requirements of `ResDataset` for the file list are as follows:
+
+- Each line in the file list should contain two space-separated items representing, in turn, the path of the input image (e.g., a low-resolution image in a super-resolution reconstruction task) relative to `data_dir` and the path of the target image (e.g., the corresponding high-resolution image) relative to `data_dir`.
+
+### Image Segmentation Dataset `SegDataset`
+
+`SegDataset` is defined in: https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/datasets/seg_dataset.py
+
+The initialization parameter list is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`data_dir`|`str`|Directory that stores the dataset.||
+|`file_list`|`str`|File list path. The file list is a text file, in which each line contains the path information of one sample. The specific requirements of `SegDataset` on the file list are listed below.||
+|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
+|`label_list`|`str` \| `None`|Label list file. The label list is a text file, in which each line contains the name of one class.|`None`|
+|`num_workers`|`int` \| `str`|Number of worker processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 worker processes are used; otherwise, the number of worker processes is set to half the number of CPU cores.|`'auto'`|
+|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
+
+The requirements of `SegDataset` for the file list are as follows:
+
+- Each line in the file list should contain two space-separated items representing, in turn, the path of the input image relative to `data_dir` and the path of the segmentation label relative to `data_dir`.
+
+## API of Data Reading
+
+Remote sensing images come from various sources and have complicated data formats. PaddleRS provides a unified interface for reading remote sensing images of different types and formats. Currently, PaddleRS can read common file formats such as .png, .jpg, .bmp, and .npy, as well as GeoTiff, img, and other image formats commonly used in remote sensing.
+
+Depending on practical demands, users can choose `paddlers.transforms.decode_image()` or `paddlers.transforms.DecodeImg` to read data. `DecodeImg` is one of the [data transformation operators](#data-transformation-operator) and can be combined with other operators. `decode_image()` is a function wrapper around the `DecodeImg` operator, which makes it convenient to use in the form of a function call.
+
+The parameter list of the `decode_image()` function is as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`im_path`|`str`|Path of input image.||
+|`to_rgb`|`bool`|If `True`, convert BGR format to RGB format. This parameter is deprecated and may be removed in the future; avoid using it if possible.|`True`|
+|`to_uint8`|`bool`|If `True`, the image data read is quantized and converted to uint8.|`True`|
+|`decode_bgr`|`bool`|If `True`, automatically parses non-geo format images (such as jpeg images) into BGR format.|`True`|
+|`decode_sar`|`bool`|If `True`, single-channel geo-format images (such as GeoTiff images) are automatically parsed as SAR images.|`True`|
+|`read_geo_info`|`bool`|If `True`, the geographic information is read from the image.|`False`|
+|`use_stretch`|`bool`|Whether to apply a 2% linear stretch to the image brightness (clipping the top and bottom 2% of pixel values). Takes effect only if `to_uint8` is `True`.|`False`|
+|`read_raw`|`bool`|If `True`, this is equivalent to specifying `to_rgb` as `True` and `to_uint8` as `False`, and this parameter takes precedence over the above ones.|`False`|
+
+The return format is as follows:
+
+- If `read_geo_info` is `False`, the image data (arranged in [h, w, c] format) is returned as an np.ndarray.
+- If `read_geo_info` is `True`, a tuple of two elements is returned. The first element is the image data, and the second element is a dictionary containing the geographic information of the image, such as the geotransform and the geographic projection.
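+
+A small usage sketch is shown below; `demo.tif` is a hypothetical GeoTiff file:
+
+```python
+from paddlers.transforms import decode_image
+
+# Read the image together with its geographic information
+im, geo_info = decode_image('demo.tif', read_geo_info=True)
+print(im.shape)         # (h, w, c) np.ndarray
+print(geo_info.keys())  # e.g. geotransform and projection entries
+```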
+
+## Data Transformation Operator
+
+PaddleRS defines a series of classes that, when instantiated, perform certain data preprocessing or data augmentation operations by calling the `__call__` method. PaddleRS calls these classes data preprocessing/data augmentation operators, or collectively **data transformation operators**. All data transformation operators inherit from the parent class [`Transform`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/transforms/operators.py).
+
+### `Transform`
+
+The `__call__` method of a `Transform` object takes a single argument, `sample`, which must be a dictionary or a sequence of dictionaries. When `sample` is a sequence, the data transformation is performed for each dictionary in `sample` and the results are returned sequentially in a Python built-in list; when `sample` is a dictionary, the `Transform` object extracts input from some of its key-value pairs (these keys are called "input keys"), performs the transformation, and writes the results into `sample` as key-value pairs (these keys are called "output keys"). Note that many `Transform` objects in PaddleRS overwrite key-value pairs, that is, the input keys and the output keys intersect. The common keys in `sample` and their meanings are as follows:
+
+|Key Name|Description|
+|----|----|
+|`'image'`|Image path or data. For change detection task, it refers to the first phase image data.|
+|`'image2'`|Second phase image data in change detection task.|
+|`'image_t1'`|First phase image path in change detection task.|
+|`'image_t2'`|Second phase image path in change detection task.|
+|`'mask'`|Ground-truth label path or data in image segmentation/change detection task.|
+|`'aux_masks'`|Auxiliary label path or data in image segmentation/change detection tasks.|
+|`'gt_bbox'`|Detection box labeling data in object detection task.|
+|`'gt_poly'`|Polygon labeling data in object detection task.|
+|`'target'`|Target image path or data in image restoration task.|
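+
+The following sketch applies operators to a single sample dictionary; the file names are hypothetical:
+
+```python
+import paddlers.transforms as T
+
+sample = {'image': 'road.png', 'mask': 'road_mask.png'}
+# DecodeImg replaces the paths with decoded image data (overwriting the input keys)
+sample = T.DecodeImg()(sample)
+sample = T.RandomHorizontalFlip(prob=0.5)(sample)
+print(sample['image'].shape)
+```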
+
+### Combined Data Transformation Operator
+
+Use `paddlers.transforms.Compose` to combine a set of data transformation operators. `Compose` receives a list as input when constructed. When a `Compose` object is called, it serially executes each data transformation operator in the list. The following is an example:
+
+```python
+import paddlers.transforms as T
+
+# Compose a variety of transformations using Compose.
+# The transformations contained in Compose will be executed sequentially.
+train_transforms = T.Compose([
+    # Read Image
+    T.DecodeImg(),
+    # Scale the image to 512x512
+    T.Resize(target_size=512),
+    # Perform a random horizontal flip with a 50% probability
+    T.RandomHorizontalFlip(prob=0.5),
+    # Normalize data to [-1,1]
+    T.Normalize(
+        mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+    # Select and organize the information that needs to be used later
+    T.ArrangeSegmenter('train')
+])
+```
+
+Generally, in the list of data transformation operators accepted by a `Compose` object, the first element is a `paddlers.transforms.DecodeImg` object, used to read image data, and the last element is an [`Arrange` operator](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/transforms/operators.py), used to extract and arrange information from the `sample` dictionary.
+
+For the validation dataset of an image segmentation or change detection task, a `ReloadMask` operator can be inserted before the `Arrange` operator to reload the ground-truth labels. The following is an example:
+
+```python
+eval_transforms = T.Compose([
+    T.DecodeImg(),
+    T.Resize(target_size=512),
+    T.Normalize(
+        mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+    # Reload label
+    T.ReloadMask(),
+    T.ArrangeSegmenter('eval')
+])
+```

+ 235 - 0
docs/apis/infer_en.md

@@ -0,0 +1,235 @@
+# PaddleRS Inference API Description
+
+The dynamic graph inference and static graph inference of PaddleRS are provided by the **trainer** ([`BaseModel`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/base.py) and its subclasses) and the **predictor** (`paddlers.deploy.Predictor`), respectively.
+
+## Dynamic Graph Inference API
+
+### Whole Image Inference
+
+#### `BaseChangeDetector.predict()`
+
+Interface:
+
+```python
+def predict(self, img_file, transforms=None):
+```
+
+Input parameters:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[tuple]` \| `tuple[str \| np.ndarray]`|Input image pair data (in the form of NumPy arrays) or input image pair paths. If only one image pair is to be predicted, a tuple is used to sequentially contain the first-phase image data/path and the second-phase image data/path. If a group of image pairs is to be predicted at once, use a list containing the data or paths of those image pairs (one tuple in the list for each image pair).||
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+
+Return format:
+
+If `img_file` is a tuple, return a dictionary containing the following key-value pairs:
+
+```
+{"label_map": category labels of model output (arranged in [h, w] format), "score_map": class probabilities of model output (arranged in format [h, w, c])}
+```
+
+If `img_file` is a list, return a list as long as `img_file`, where each item is a dictionary (with the key-value pairs shown above), corresponding in order to each element in `img_file`.
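+
+A hedged usage sketch: `model` is assumed to be a trained `BaseChangeDetector` subclass object, and the image paths are hypothetical:
+
+```python
+# Predict a single image pair
+pred = model.predict(('t1.png', 't2.png'))
+label_map = pred['label_map']  # [h, w] category labels
+score_map = pred['score_map']  # [h, w, c] class probabilities
+
+# Predict a group of image pairs
+preds = model.predict([('a1.png', 'a2.png'), ('b1.png', 'b2.png')])
+```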
+
+#### `BaseClassifier.predict()`
+
+Interface:
+
+```python
+def predict(self, img_file, transforms=None):
+```
+
+Input parameters:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[str\|np.ndarray]` \| `str` \| `np.ndarray`|Input image data (in the form of a NumPy array) or input image path. If a group of images is to be predicted at once, use a list containing the data or paths of those images (one element in the list for each image).||
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+
+Return format:
+
+If `img_file` is a string or NumPy array, return a dictionary containing the following key-value pairs:
+
+```
+{"class_ids_map": output category label,
+ "scores_map": output category probability,
+ "label_names_map": output category name}
+```
+
+If `img_file` is a list, return a list as long as `img_file`, where each item is a dictionary (key-value pairs shown above), corresponding in order to each element in `img_file`.
+
+#### `BaseDetector.predict()`
+
+Interface:
+
+```python
+def predict(self, img_file, transforms=None):
+```
+
+Input parameters:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[str\|np.ndarray]` \| `str` \| `np.ndarray`|Input image data (in the form of a NumPy array) or input image path. If a group of images is to be predicted at once, use a list containing the data or paths of those images (one element in the list for each image).||
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+
+Return format:
+
+If `img_file` is a string or a NumPy array, return a list with one element for each predicted target box. The elements of the list are dictionaries containing the following key-value pairs:
+
+```
+{"category_id": Category ID,
+ "category": Category name,
+ "bbox": Target box position information, including the horizontal and vertical coordinates of the upper left corner of the target box and the width and length of the target box,  
+ "score": Category confidence,
+ "mask": [RLE Format](https://baike.baidu.com/item/rle/366352) mask, only instance segmentation model prediction results contain this key-value pair}
+```
+
+If `img_file` is a list, return a list as long as `img_file`, where each item is a list of dictionaries (key-value pairs shown above), corresponding in order to each element in `img_file`.
+
+#### `BaseRestorer.predict()`
+
+Interface:
+
+```python
+def predict(self, img_file, transforms=None):
+```
+
+Input parameters:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[str\|np.ndarray]` \| `str` \| `np.ndarray`|Input image data (in the form of a NumPy array) or input image path. If a group of images is to be predicted at once, use a list containing the data or paths of those images (one element in the list for each image).||
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+
+Return format:
+
+If `img_file` is a string or NumPy array, return a dictionary containing the following key-value pairs:
+
+```
+{"res_map": restored or reconstructed images of model output (arranged in format [h, w, c])}
+```
+
+If `img_file` is a list, return a list as long as `img_file`, where each item is a dictionary (key-value pairs shown above), corresponding in order to each element in `img_file`.
+
+#### `BaseSegmenter.predict()`
+
+Interface:
+
+```python
+def predict(self, img_file, transforms=None):
+```
+
+Input parameters:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[str\|np.ndarray]` \| `str` \| `np.ndarray`|Input image data (in the form of a NumPy array) or input image path. If a group of images is to be predicted at once, use a list containing the data or paths of those images (one element in the list for each image).||
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+
+Return format:
+
+If `img_file` is a string or NumPy array, return a dictionary containing the following key-value pairs:
+
+```
+{"label_map": output category labels (arranged in [h, w] format), "score_map": category probabilities of model output (arranged in format [h, w, c])}
+```
+
+If `img_file` is a list, return a list as long as `img_file`, where each item is a dictionary (key-value pairs shown above), corresponding in order to each element in `img_file`.
+
+### Sliding Window Inference
+
+Considering the large size of remote sensing images, PaddleRS provides sliding window inference support for some tasks. Sliding window inference in PaddleRS has the following characteristics:
+
+1. To avoid running out of memory by reading a whole large image at once, PaddleRS adopts lazy memory loading and reads and processes only the image block in one window at a time.
+2. Users can customize the size and stride of the sliding window. Overlapping sliding windows are supported; for the overlapping parts between windows, PaddleRS automatically fuses the model predictions.
+3. The inference results can be saved in GeoTiff format, and reading and writing geographic transformation and projection information is supported.
+
+Currently, the image segmentation trainer ([`BaseSegmenter`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/segmenter.py) and its subclasses) and the change detection trainer ([`BaseChangeDetector`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/change_detector.py) and its subclasses) provide dynamic graph sliding window inference APIs. Taking the API of the image segmentation task as an example, the explanation is as follows:
+
+Interface:
+
+```python
+def slider_predict(self,
+                   img_file,
+                   save_dir,
+                   block_size,
+                   overlap=36,
+                   transforms=None,
+                   invalid_value=255,
+                   merge_strategy='keep_last',
+                   batch_size=1,
+                   eager_load=False,
+                   quiet=False):
+```
+
+Input parameter list:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`str`|Input image path.||
+|`save_dir`|`str`|Predicted results output path.||
+|`block_size`|`list[int]` \| `tuple[int]` \| `int`|Size of the sliding window (specify the width and height in a list or tuple, or the same width and height as a single integer).||
+|`overlap`|`list[int]` \| `tuple[int]` \| `int`|Sliding step size of the sliding window (specify the width and height in a list or tuple, or the same width and height as a single integer).|`36`|
+|`transforms`|`paddlers.transforms.Compose` \| `None`|Data transformation operators applied to input data. If `None`, the data transformation operators used by the trainer during validation are applied.|`None`|
+|`invalid_value`|`int`|Value used to mark invalid pixels in the output image.|`255`|
+|`merge_strategy`|`str`|Strategy used to merge the overlapping areas of sliding windows. `'keep_first'` means keeping the prediction of the first window in the traversal order (left to right, top to bottom, column first); `'keep_last'` means keeping the prediction of the last window in the traversal order; `'accum'` means computing the final prediction by summing the prediction probabilities given by each window in the overlapping area. Note that when performing dense inference with a large `overlap` on large images, the `'accum'` strategy may lead to longer inference time, but it generally achieves better performance at window boundaries.|`'keep_last'`|
+|`batch_size`|`int`|Mini-batch size used for prediction.|`1`|
+|`eager_load`|`bool`|If `True`, instead of using lazy memory loading, the whole image is loaded into memory at once at the beginning of prediction.|`False`|
+|`quiet`|`bool`|If `True`, the prediction progress is not displayed.|`False`|
+
+The sliding window inference API of the change detection task is similar to that of the image segmentation task, but note that the information stored in the output results, such as the geographic transformation and projection, is based on the information read from the first-phase image, and the file name of the sliding window inference result is the same as that of the first-phase image file.
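+
+A hedged usage sketch for a segmentation trainer; `model` is assumed to be a trained `BaseSegmenter` subclass object, and the paths are hypothetical:
+
+```python
+model.slider_predict(
+    img_file='large_scene.tif',
+    save_dir='output/slider',
+    block_size=512,
+    overlap=36,
+    merge_strategy='accum')
+```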
+
+## Static Graph Inference API
+
+### Python API
+
+For a model [exported to the deployment format](https://github.com/PaddlePaddle/PaddleRS/blob/develop/deploy/export/README.md) or a quantized model, PaddleRS provides `paddlers.deploy.Predictor` to load the model and perform inference based on [Paddle Inference](https://www.paddlepaddle.org.cn/tutorials/projectdetail/3952715).
+
+#### Initialize `Predictor`
+
+`Predictor.__init__()` takes the following arguments:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`model_dir`|`str`|Model path (must be an exported deployed or quantized model).||
+|`use_gpu`|`bool`|Whether to use GPU.|`False`|
+|`gpu_id`|`int`|ID of the GPU used.|`0`|
+|`cpu_thread_num`|`int`|Number of threads when inference is performed using CPUs.|`1`|
+|`use_mkl`|`bool`|Whether to use the MKL-DNN compute library (this option takes effect only when inference is performed using CPUs).|`False`|
+|`mkl_thread_num`|`int`|Number of threads used by MKL-DNN.|`4`|
+|`use_trt`|`bool`|Whether to use TensorRT.|`False`|
+|`use_glog`|`bool`|Whether to enable glog logs.|`False`|
+|`memory_optimize`|`bool`|Whether to enable memory optimization.|`True`|
+|`max_trt_batch_size`|`int`|Maximum batch size configured when TensorRT is used.|`1`|
+|`trt_precision_mode`|`str`|Precision to be used when using TensorRT, with the optional values of `'float32'` or `'float16'`.|`'float32'`|
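+
+A hedged construction sketch; the model directory is hypothetical and is assumed to contain an exported deployment model:
+
+```python
+from paddlers.deploy import Predictor
+
+predictor = Predictor('./inference_model', use_gpu=True, gpu_id=0)
+```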
+
+#### `Predictor.predict()`
+
+Interface:
+
+```python
+def predict(self,
+            img_file,
+            topk=1,
+            transforms=None,
+            warmup_iters=0,
+            repeats=1):
+```
+
+Input parameter list:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`img_file`|`list[str\|tuple\|np.ndarray]` \| `str` \| `tuple` \| `np.ndarray`|For scene classification, object detection, image restoration, and image segmentation tasks, this parameter can be a single image path, a decoded image (an np.ndarray of float32 type arranged in [h, w, c] format), or a list of image paths or np.ndarray objects. For the change detection task, this parameter can be a two-tuple of image paths (the two phase image paths, respectively), a two-tuple of two decoded images, or a list of either of the above two kinds of tuples.||
+|`topk`|`int`|Used in scene classification model prediction; the categories with the top-`topk` output probabilities of the model are selected as the final results.|`1`|
+|`transforms`|`paddlers.transforms.Compose`\|`None`|Data transformation operators applied to input data. If `None`, the operators read from `model.yml` are used.|`None`|
+|`warmup_iters`|`int`|Number of warm-up rounds used to evaluate the speed of model inference and pre/post-processing. If it is greater than 1, inference is repeated `warmup_iters` times in advance before the formal prediction and timing.|`0`|
+|`repeats`|`int`|Number of repetitions used to evaluate the speed of model inference and pre/post-processing. If it is greater than 1, the prediction is repeated and the average time is taken.|`1`|
+|`quiet`|`bool`|If `True`, no timing information is printed.|`False`|
+
+`Predictor.predict()` returns exactly the same format as the dynamic graph inference API. For details, refer to [Dynamic Graph Inference API](#dynamic-graph-inference-api).
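+
+A hedged usage sketch following the construction example above; `demo.png` is a hypothetical image:
+
+```python
+# Warm up for 5 rounds, then repeat the prediction 10 times to average the timing
+res = predictor.predict('demo.png', warmup_iters=5, repeats=10)
+```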
+
+### `Predictor.slider_predict()`
+
+Implements sliding window inference. It is used in the same way as the `slider_predict()` method of `BaseSegmenter` and `BaseChangeDetector`.

+ 1 - 1
docs/apis/train.md

@@ -174,7 +174,7 @@ def train(self,
 |`warmup_start_lr`|`int`|默认优化器warm-up阶段使用的初始学习率。|`0`|
 |`lr_decay_epochs`|`list` \| `tuple`|默认优化器学习率衰减的milestones,以epoch计。即,在第几个epoch执行学习率的衰减。|`(216, 243)`|
 |`lr_decay_gamma`|`float`|学习率衰减系数,适用于默认优化器。|`0.1`|
-|`metric`|`str` \| `None`|评价指标,可以为`'VOC'`、`COCO`或`None`。若为`Nnoe`,则根据数据集格式自动确定使用的评价指标。|`None`|
+|`metric`|`str` \| `None`|评价指标,可以为`'VOC'`、`COCO`或`None`。若为`None`,则根据数据集格式自动确定使用的评价指标。|`None`|
 |`use_ema`|`bool`|是否启用[指数滑动平均策略](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/models/ppdet/optimizer.py)更新模型权重参数。|`False`|
 |`early_stop`|`bool`|训练过程是否启用早停策略。|`False`|
 |`early_stop_patience`|`int`|启用早停策略时的`patience`参数(参见[`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py))。|`5`|

+ 420 - 0
docs/apis/train_en.md

@@ -0,0 +1,420 @@
+# PaddleRS Training API Description
+
+**Trainers** encapsulate model training, validation, quantization, and dynamic graph inference and are defined in files in the `paddlers/tasks/` directory. For user convenience, PaddleRS provides trainers that inherit from the parent class [`BaseModel`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/base.py) for all supported models and exposes several APIs. The trainer types corresponding to the change detection, scene classification, object detection, image restoration, and image segmentation tasks are `BaseChangeDetector`, `BaseClassifier`, `BaseDetector`, `BaseRestorer`, and `BaseSegmenter`, respectively. This document describes the initialization functions of the trainers as well as the `train()` and `evaluate()` APIs.
+
+## Initialize the Trainer
+
+All trainers support construction with default parameters (that is, no arguments are passed when the object is constructed), in which case the constructed trainer object is suitable for three-channel RGB data.
+
+### Initialize `BaseChangeDetector` Subclass Object
+
+- The `num_classes`, `use_mixed_loss`, and `in_channels` parameters are generally supported, indicating the number of model output categories, whether to use a preset mixed loss, and the number of input channels, respectively. Some subclasses, such as `DSIFN`, do not yet support `in_channels`.
+- `use_mixed_loss` will be deprecated in the future, so it is not recommended.
+- Specify the loss function used during model training through the `losses` parameter. `losses` needs to be a dictionary, where the values of the keys `types` and `coef` are two equal-length lists representing the loss function objects (callable objects) and the weights of the loss functions, respectively. For example, `losses={'types': [LossType1(), LossType2()], 'coef': [1.0, 0.5]}` is equivalent to computing the following loss during training: `1.0*LossType1()(logits, labels)+0.5*LossType2()(logits, labels)`, where `logits` and `labels` are the model output and the ground-truth labels, respectively. A construction sketch is given after this list.
+- Different subclasses support model-related input parameters. For details, you can refer to [model definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/rs_models/cd) and [trainer definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/change_detector.py).
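+
+The following sketch constructs a change detection trainer with a custom loss. The `BIT` trainer name is taken from PaddleRS; the loss classes and their import path are assumptions and may differ in your version:
+
+```python
+import paddlers
+# Assumed import path for the segmentation-style losses vendored by PaddleRS
+from paddlers.models.ppseg.models import losses
+
+model = paddlers.tasks.cd.BIT(
+    num_classes=2,
+    in_channels=3,
+    losses={
+        'types': [losses.CrossEntropyLoss(), losses.DiceLoss()],
+        'coef': [1.0, 0.5],
+    })
+```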
+
+### Initialize `BaseClassifier` Subclass Object
+
+- The `num_classes` and `use_mixed_loss` parameters are generally supported, indicating the number of model output categories and whether to use a preset mixed loss, respectively.
+- `use_mixed_loss` will be deprecated in the future, so it is not recommended.
+- Specify the loss function used during model training through the `losses` parameter. The passed argument needs to be an object of type `paddlers.models.clas_losses.CombinedLoss`.
+- Different subclasses support model-related input parameters. For details, you can refer to [model definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/rs_models/clas) and [trainer definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/classifier.py).
+
+### Initialize `BaseDetector` Subclass Object
+
+- Generally, the `num_classes` and `backbone` parameters can be set to indicate the number of output categories of the model and the type of backbone network used, respectively. Compared with other tasks, the trainer of object detection task supports more initialization parameters, including network structure, loss function, post-processing strategy and so on.
+- Unlike tasks such as segmentation, classification, and change detection, detection tasks do not support specifying the loss function through a `losses` parameter. However, for some trainers such as `PPYOLO`, the loss function can be customized via `use_iou_loss` and other parameters.
+- Different subclasses support model-related input parameters. For details, you can refer to [model definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/rs_models/det) and [trainer definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/object_detector.py).
+
+### Initialize `BaseRestorer` Subclass Object
+
+- The `sr_factor` parameter is generally supported, representing the scaling factor in image super-resolution. For models that do not support super-resolution reconstruction tasks, set `sr_factor` to `None`.
+- Specify the loss function used during model training through the `losses` parameter. `losses` needs to be a callable object or a dictionary. A manually specified `losses` must have the same format as the return value of the subclass's `default_loss()` method.
+- The `min_max` parameter can specify the numerical range of the model input and output. If `None`, the default value range of the class is used.
+- Different subclasses support model-related input parameters. For details, you can refer to [model definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/rs_models/res) and [trainer definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/restorer.py).
+
+### Initialize `BaseSegmenter` Subclass Object
+
+- The `in_channels`, `num_classes`, and `use_mixed_loss` parameters are generally supported, indicating the number of input channels, the number of output categories, and whether to use a preset mixed loss, respectively.
+- `use_mixed_loss` will be deprecated in the future, so it is not recommended.
+- Specify the loss function used during model training through the `losses` parameter. `losses` needs to be a dictionary, where the values of the keys `types` and `coef` are two equal-length lists representing the loss function objects (callable objects) and the weights of the loss functions, respectively. For example, `losses={'types': [LossType1(), LossType2()], 'coef': [1.0, 0.5]}` is equivalent to computing the following loss during training: `1.0*LossType1()(logits, labels)+0.5*LossType2()(logits, labels)`, where `logits` and `labels` are the model output and the ground-truth labels, respectively.
+- Different subclasses support model-related input parameters. For details, you can refer to [model definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/rs_models/seg) and [trainer definitions](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/tasks/segmenter.py).
+
+## `train()`
+
+### `BaseChangeDetector.train()`
+
+Interface format:
+
+```python
+def train(self,
+          num_epochs,
+          train_dataset,
+          train_batch_size=2,
+          eval_dataset=None,
+          optimizer=None,
+          save_interval_epochs=1,
+          log_interval_steps=2,
+          save_dir='output',
+          pretrain_weights=None,
+          learning_rate=0.01,
+          lr_decay_power=0.9,
+          early_stop=False,
+          early_stop_patience=5,
+          use_vdl=True,
+          resume_checkpoint=None):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`num_epochs`|`int`|Number of epochs to train.||
+|`train_dataset`|`paddlers.datasets.CDDataset`|Training dataset.||
+|`train_batch_size`|`int`|Batch size used during training.|`2`|
+|`eval_dataset`|`paddlers.datasets.CDDataset` \| `None`|Validation dataset.|`None`|
+|`optimizer`|`paddle.optimizer.Optimizer` \| `None`|Optimizer used during training. If `None`, the optimizer defined by default is used.|`None`|
+|`save_interval_epochs`|`int`|Interval (in epochs) at which the model is saved during training.|`1`|
+|`log_interval_steps`|`int`|Interval (in steps, i.e., iterations) at which logs are printed during training.|`2`|
+|`save_dir`|`str`|Path to save the model.|`'output'`|
+|`pretrain_weights`|`str` \| `None`|Name/path of the pretrained weights. If `None`, pretrained weights are not used.|`None`|
+|`learning_rate`|`float`|Learning rate used during training, for the default optimizer.|`0.01`|
+|`lr_decay_power`|`float`|Learning rate decay coefficient, for the default optimizer.|`0.9`|
+|`early_stop`|`bool`|Whether to enable early stopping during training.|`False`|
+|`early_stop_patience`|`int`|`patience` parameter used when early stopping is enabled (see [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|
+|`use_vdl`|`bool`|Whether to enable VisualDL logging.|`True`|
+|`resume_checkpoint`|`str` \| `None`|Checkpoint path. PaddleRS supports resuming training from checkpoints (including model weights and optimizer weights stored during previous training), but note that `resume_checkpoint` and `pretrain_weights` may not both be set to values other than `None`.|`None`|
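+
+A hedged sketch of a typical training call; `model`, `train_dataset`, and `eval_dataset` are assumed to have been constructed as described in the previous sections:
+
+```python
+model.train(
+    num_epochs=10,
+    train_dataset=train_dataset,
+    train_batch_size=4,
+    eval_dataset=eval_dataset,
+    save_dir='output/my_cd_model',
+    use_vdl=True)
+```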
+
+### `BaseClassifier.train()`
+
+Interface format:
+
+```python
+def train(self,
+          num_epochs,
+          train_dataset,
+          train_batch_size=2,
+          eval_dataset=None,
+          optimizer=None,
+          save_interval_epochs=1,
+          log_interval_steps=2,
+          save_dir='output',
+          pretrain_weights='IMAGENET',
+          learning_rate=0.1,
+          lr_decay_power=0.9,
+          early_stop=False,
+          early_stop_patience=5,
+          use_vdl=True,
+          resume_checkpoint=None):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`num_epochs`|`int`|Number of epochs to train.||
+|`train_dataset`|`paddlers.datasets.ClasDataset`|Training dataset.||
+|`train_batch_size`|`int`|Batch size used during training.|`2`|
+|`eval_dataset`|`paddlers.datasets.ClasDataset` \| `None`|Validation dataset.|`None`|
+|`optimizer`|`paddle.optimizer.Optimizer` \| `None`|Optimizer used during training. If `None`, the optimizer defined by default is used.|`None`|
+|`save_interval_epochs`|`int`|Interval (in epochs) at which the model is saved during training.|`1`|
+|`log_interval_steps`|`int`|Interval (in steps, i.e., iterations) at which logs are printed during training.|`2`|
+|`save_dir`|`str`|Path to save the model.|`'output'`|
+|`pretrain_weights`|`str` \| `None`|Name/path of the pretrained weights. If `None`, pretrained weights are not used.|`'IMAGENET'`|
+|`learning_rate`|`float`|Learning rate used during training, for the default optimizer.|`0.1`|
+|`lr_decay_power`|`float`|Learning rate decay coefficient, for the default optimizer.|`0.9`|
+|`early_stop`|`bool`|Whether to enable early stopping during training.|`False`|
+|`early_stop_patience`|`int`|`patience` parameter used when early stopping is enabled (see [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|
+|`use_vdl`|`bool`|Whether to enable VisualDL logging.|`True`|
+|`resume_checkpoint`|`str` \| `None`|Checkpoint path. PaddleRS supports resuming training from checkpoints (including model weights and optimizer weights stored during previous training), but note that `resume_checkpoint` and `pretrain_weights` may not both be set to values other than `None`.|`None`|
+
+### `BaseDetector.train()`
+
+Interface format:
+
+```python
+def train(self,
+          num_epochs,
+          train_dataset,
+          train_batch_size=64,
+          eval_dataset=None,
+          optimizer=None,
+          save_interval_epochs=1,
+          log_interval_steps=10,
+          save_dir='output',
+          pretrain_weights='IMAGENET',
+          learning_rate=.001,
+          warmup_steps=0,
+          warmup_start_lr=0.0,
+          lr_decay_epochs=(216, 243),
+          lr_decay_gamma=0.1,
+          metric=None,
+          use_ema=False,
+          early_stop=False,
+          early_stop_patience=5,
+          use_vdl=True,
+          resume_checkpoint=None):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`num_epochs`|`int`|Number of epochs to train.||
+|`train_dataset`|`paddlers.datasets.COCODetDataset` \| `paddlers.datasets.VOCDetDataset` |Training dataset.||
+|`train_batch_size`|`int`|Batch size used during training (for multi-card training, this is the total batch size across all devices).|`64`|
+|`eval_dataset`|`paddlers.datasets.COCODetDataset` \| `paddlers.datasets.VOCDetDataset` \| `None`|Validation dataset.|`None`|
+|`optimizer`|`paddle.optimizer.Optimizer` \| `None`|Optimizer used during training. If `None`, the optimizer defined by default is used.|`None`|
+|`save_interval_epochs`|`int`|Interval (in epochs) at which the model is saved during training.|`1`|
+|`log_interval_steps`|`int`|Interval (in steps, i.e., iterations) at which logs are printed during training.|`10`|
+|`save_dir`|`str`|Path to save the model.|`'output'`|
+|`pretrain_weights`|`str` \| `None`|Name/path of the pretrained weights. If `None`, pretrained weights are not used.|`'IMAGENET'`|
+|`learning_rate`|`float`|Learning rate used during training, for the default optimizer.|`0.001`|
+|`warmup_steps`|`int`|Number of [warm-up](https://www.mdpi.com/2079-9292/10/16/2029/htm) steps used by the default optimizer.|`0`|
+|`warmup_start_lr`|`int`|Initial learning rate used in the warm-up phase of the default optimizer.|`0`|
+|`lr_decay_epochs`|`list` \| `tuple`|Milestones of the learning rate decay of the default optimizer, in epochs. That is, the epochs at which the learning rate decays.|`(216, 243)`|
+|`lr_decay_gamma`|`float`|Learning rate decay coefficient, for the default optimizer.|`0.1`|
+|`metric`|`str` \| `None`|Evaluation metric, which can be `'VOC'`, `'COCO'`, or `None`. If `None`, the evaluation metric to use is automatically determined according to the format of the dataset.|`None`|
+|`use_ema`|`bool`|Whether to enable the [exponential moving average strategy](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/models/ppdet/optimizer.py) to update the model weight parameters.|`False`|
+|`early_stop`|`bool`|Whether to enable early stopping during training.|`False`|
+|`early_stop_patience`|`int`|`patience` parameter used when early stopping is enabled (see [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|
+|`use_vdl`|`bool`|Whether to enable VisualDL logging.|`True`|
+|`resume_checkpoint`|`str` \| `None`|Checkpoint path. PaddleRS supports resuming training from checkpoints (including model weights and optimizer weights stored during previous training), but note that `resume_checkpoint` and `pretrain_weights` may not both be set to values other than `None`.|`None`|
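+
+A hedged sketch showing the detector-specific parameters; `model` and the datasets are assumed to have been constructed beforehand:
+
+```python
+model.train(
+    num_epochs=270,
+    train_dataset=train_dataset,
+    train_batch_size=8,
+    eval_dataset=eval_dataset,
+    warmup_steps=500,
+    lr_decay_epochs=(216, 243),
+    metric='VOC',
+    save_dir='output/my_detector')
+```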
+
+### `BaseRestorer.train()`
+
+Interface format:
+
+```python
+def train(self,
+          num_epochs,
+          train_dataset,
+          train_batch_size=2,
+          eval_dataset=None,
+          optimizer=None,
+          save_interval_epochs=1,
+          log_interval_steps=2,
+          save_dir='output',
+          pretrain_weights='CITYSCAPES',
+          learning_rate=0.01,
+          lr_decay_power=0.9,
+          early_stop=False,
+          early_stop_patience=5,
+          use_vdl=True,
+          resume_checkpoint=None):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`num_epochs`|`int`|Number of epochs to train.||
+|`train_dataset`|`paddlers.datasets.ResDataset`|Training dataset.||
+|`train_batch_size`|`int`|Batch size used during training.|`2`|
+|`eval_dataset`|`paddlers.datasets.ResDataset` \| `None`|Validation dataset.|`None`|
+|`optimizer`|`paddle.optimizer.Optimizer` \| `None`|Optimizer used during training. If `None`, the optimizer defined by default is used.|`None`|
+|`save_interval_epochs`|`int`|Interval (in epochs) at which the model is saved during training.|`1`|
+|`log_interval_steps`|`int`|Interval (in steps, i.e., iterations) at which logs are printed during training.|`2`|
+|`save_dir`|`str`|Path to save the model.|`'output'`|
+|`pretrain_weights`|`str` \| `None`|Name/path of the pretrained weights. If `None`, pretrained weights are not used.|`'CITYSCAPES'`|
+|`learning_rate`|`float`|Learning rate used during training, for the default optimizer.|`0.01`|
+|`lr_decay_power`|`float`|Decay power of the learning rate schedule used by the default optimizer.|`0.9`|
+|`early_stop`|`bool`|Whether to enable the early stopping policy during training.|`False`|
+|`early_stop_patience`|`int`|`patience` parameter for early stopping (refer to [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|
+|`use_vdl`|`bool`|Whether to enable VisualDL logging.|`True`|
+|`resume_checkpoint`|`str` \| `None`|Checkpoint path. PaddleRS supports resuming training from checkpoints (including model weights and optimizer weights saved during a previous run), but note that `resume_checkpoint` and `pretrain_weights` must not both be set to values other than `None` at the same time.|`None`|
+
+### `BaseSegmenter.train()`
+
+Interface format:
+
+```python
+def train(self,
+          num_epochs,
+          train_dataset,
+          train_batch_size=2,
+          eval_dataset=None,
+          optimizer=None,
+          save_interval_epochs=1,
+          log_interval_steps=2,
+          save_dir='output',
+          pretrain_weights='CITYSCAPES',
+          learning_rate=0.01,
+          lr_decay_power=0.9,
+          early_stop=False,
+          early_stop_patience=5,
+          use_vdl=True,
+          resume_checkpoint=None):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`num_epochs`|`int`|Number of epochs to train.||
+|`train_dataset`|`paddlers.datasets.SegDataset`|Training dataset.||
+|`train_batch_size`|`int`|Batch size used during training.|`2`|
+|`eval_dataset`|`paddlers.datasets.SegDataset` \| `None`|Validation dataset.|`None`|
+|`optimizer`|`paddle.optimizer.Optimizer` \| `None`|Optimizer used during training. If `None`, the optimizer defined by default is used.|`None`|
+|`save_interval_epochs`|`int`|Interval (in epochs) at which the model is saved during training.|`1`|
+|`log_interval_steps`|`int`|Interval (in steps, i.e., iterations) at which logs are printed during training.|`2`|
+|`save_dir`|`str`|Path to save the model.|`'output'`|
+|`pretrain_weights`|`str` \| `None`|Name/path of the pretrained weights. If `None`, pretrained weights are not used.|`'CITYSCAPES'`|
+|`learning_rate`|`float`|Learning rate used during training, for the default optimizer.|`0.01`|
+|`lr_decay_power`|`float`|Decay power of the learning rate schedule used by the default optimizer.|`0.9`|
+|`early_stop`|`bool`|Whether to enable the early stopping policy during training.|`False`|
+|`early_stop_patience`|`int`|`patience` parameter for early stopping (refer to [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|
+|`use_vdl`|`bool`|Whether to enable VisualDL logging.|`True`|
+|`resume_checkpoint`|`str` \| `None`|Checkpoint path. PaddleRS supports resuming training from checkpoints (including model weights and optimizer weights saved during a previous run), but note that `resume_checkpoint` and `pretrain_weights` must not both be set to values other than `None` at the same time.|`None`|
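+
+The sketch below shows how a custom optimizer interacts with the parameters above: when `optimizer` is not `None`, it replaces the default optimizer, so `learning_rate` and `lr_decay_power` are ignored. The segmenter choice and the dataset objects are hypothetical placeholders.
+
+```python
+import paddle
+import paddlers as pdrs
+
+# Hypothetical segmenter; any BaseSegmenter subclass behaves the same way
+model = pdrs.tasks.UNet(num_classes=2)
+
+# The paddle.nn.Layer networking object is accessed through the `net` property
+optimizer = paddle.optimizer.Adam(
+    learning_rate=1e-4,
+    parameters=model.net.parameters()
+)
+
+model.train(
+    num_epochs=100,
+    train_dataset=train_dataset,  # a pre-built paddlers.datasets.SegDataset
+    train_batch_size=2,
+    eval_dataset=eval_dataset,    # an optional validation SegDataset
+    optimizer=optimizer,
+    save_dir='output'
+)
+```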
+
+## `evaluate()`
+
+### `BaseChangeDetector.evaluate()`
+
+Interface format:
+
+```python
+def evaluate(self, eval_dataset, batch_size=1, return_details=False):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`eval_dataset`|`paddlers.datasets.CDDataset`|Validation dataset.||
+|`batch_size`|`int`|Batch size used during evaluation (for multi-card evaluation, this is the total batch size across all devices).|`1`|
+|`return_details`|`bool`|Whether to return detailed information.|`False`|
+
+If `return_details` is `False` (default), the output is a `collections.OrderedDict` object. For the binary (two-class) change detection task, the output contains the following key-value pairs:
+
+```
+{"iou": the IoU metric of the change class,
+ "f1": the F1 score of the change class,
+ "oacc": overall precision (accuracy),
+ "kappa": kappa coefficient}
+```
+
+For the multi-category change detection task, the output contains the following key-value pairs:
+
+```
+{"miou": mIoU metric,
+ "category_iou": IoU metric of each category,
+ "oacc": overall precision (accuracy),
+ "category_acc": precision of each category,
+ "kappa": kappa coefficient,
+ "category_F1score": F1 score of each category}
+```
+
+If `return_details` is `True`, returns a two-tuple in which the first element is the metrics described above and the second element is a dictionary with a single key, `'confusion_matrix'`, whose value is the confusion matrix stored as a Python built-in list.
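+
+For example (a minimal sketch, assuming a trained change detection model `model` and a pre-built `CDDataset` named `eval_dataset`):
+
+```python
+# `metrics` is the OrderedDict described above; `details` holds the extras
+metrics, details = model.evaluate(eval_dataset, batch_size=1, return_details=True)
+confusion_matrix = details['confusion_matrix']  # nested Python built-in lists
+print(metrics)
+```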
+
+
+
+### `BaseClassifier.evaluate()`
+
+Interface format:
+
+```python
+def evaluate(self, eval_dataset, batch_size=1, return_details=False):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`eval_dataset`|`paddlers.datasets.ClasDataset`|Validation dataset.||
+|`batch_size`|`int`|Batch size used during evaluation (for multi-card evaluation, this is the total batch size across all devices).|`1`|
+|`return_details`|`bool`|*Do not manually set this parameter in the current version.*|`False`|
+
+Returns a `collections.OrderedDict` object, including the following key-value pairs:
+
+```
+{"top1": top1 accuracy,
+ "top5": top5 accuracy}
+```
+
+### `BaseDetector.evaluate()`
+
+Interface format:
+
+```python
+def evaluate(self,
+             eval_dataset,
+             batch_size=1,
+             metric=None,
+             return_details=False):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`eval_dataset`|`paddlers.datasets.COCODetDataset` \| `paddlers.datasets.VOCDetDataset`|Validation dataset.||
+|`batch_size`|`int`|Batch size used during evaluation (for multi-card evaluation, this is the total batch size across all devices).|`1`|
+|`metric`|`str` \| `None`|Evaluation metric, which can be `'VOC'`, `'COCO'`, or `None`. If `None`, the metric to use is automatically determined according to the format of the dataset.|`None`|
+|`return_details`|`bool`|Whether to return detailed information.|`False`|
+
+If `return_details` is `False` (default), returns a `collections.OrderedDict` object, including the following key-value pairs:
+
+```
+{"bbox_mmap": mAP of predicted result}
+```
+
+If `return_details` is `True`, returns a two-tuple of dictionaries, where the first dictionary contains the evaluation metrics described above and the second dictionary contains the following three key-value pairs:
+
+```
+{"gt": dataset annotation information,
+ "bbox": predicted object box information,
+ "mask": predicted mask information}
+```
+
+### `BaseRestorer.evaluate()`
+
+Interface format:
+
+```python
+def evaluate(self, eval_dataset, batch_size=1, return_details=False):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`eval_dataset`|`paddlers.datasets.ResDataset`|Validation dataset.||
+|`batch_size`|`int`|Batch size used during evaluation (for multi-card evaluation, this is the total batch size across all devices).|`1`|
+|`return_details`|`bool`|*Do not manually set this parameter in the current version.*|`False`|
+
+Returns a `collections.OrderedDict` object, including the following key-value pairs:
+
+```
+{"psnr": PSNR metric,
+ "ssim": SSIM metric}
+```
+
+### `BaseSegmenter.evaluate()`
+
+Interface format:
+
+```python
+def evaluate(self, eval_dataset, batch_size=1, return_details=False):
+```
+
+The meanings of each parameter are as follows:
+
+|Parameter Name|Type|Parameter Description|Default Value|
+|-------|----|--------|-----|
+|`eval_dataset`|`paddlers.datasets.SegDataset`|Validation dataset.||
+|`batch_size`|`int`|Batch size used during evaluation (for multi-card evaluation, this is the total batch size across all devices).|`1`|
+|`return_details`|`bool`|Whether to return detailed information.|`False`|
+
+If `return_details` is `False` (default), returns a `collections.OrderedDict` object, including the following key-value pairs:
+
+```
+{"miou": mIoU metric,
+ "category_iou": IoU metric of each category,
+ "oacc": overall precision (accuracy),
+ "category_acc": precision of each category,
+ "kappa": kappa coefficient,
+ "category_F1score": F1 score of each category}
+```
+
+If `return_details` is `True`, returns a two-tuple in which the first element is the metrics described above and the second element is a dictionary with a single key, `'confusion_matrix'`, whose value is the confusion matrix stored as a Python built-in list.

+ 591 - 0
docs/cases/csc_cd_en.md

@@ -0,0 +1,591 @@
+# [The 11th "China Software Cup" Baidu Remote Sensing Competition: Change Detection Function](https://aistudio.baidu.com/aistudio/projectdetail/3684588)
+
+## Competition Introduction
+
+"China Software Cup" College Students Software Design Competition is a public welfare competition for chinese students. It is a competition in the 2021 National College Students Competition list. The competition is co-sponsored by the Ministry of Industry and Information Technology, the Ministry of Education and the People's Government of Jiangsu Province. It is committed to guiding Chinese school students to actively participate in software scientific research activities, enhancing self-innovation and practical ability, and cultivating more high-end and outstanding talents for Chinese software and information technology service industry. In 2022, Baidu Paddle will host Group A and Group B. This race is called Group A.
+
+[Competition official website link](https://aistudio.baidu.com/aistudio/competition/detail/151/0/introduction)
+
+### Competition Background
+
+Mastering the utilization of land resources and the types of land cover is an important part of the census and monitoring of geographical national conditions. Efficiently acquiring accurate, objective land use information and monitoring land change can provide support for national and local geographic information decision-making. With the development of remote sensing and sensor technology, especially the popularization of multi-temporal high-resolution remote sensing image data, we can observe subtle changes on any part of the Earth's surface without leaving home.
+
+At present, the field of remote sensing has stepped into the fast lane of high-resolution imagery, and the demand for remote sensing data analysis and application services is increasing day by day. Traditional methods characterize the features of high-resolution satellite remote sensing images poorly and rely heavily on manual experience. With the rise of artificial intelligence, image recognition methods based on deep learning in particular have developed rapidly, and related technologies have driven changes in the remote sensing field. Compared with traditional labor-intensive visual interpretation, remote sensing image recognition technology based on deep learning can automatically analyze the types of ground objects in images, showing great potential in both accuracy and efficiency.
+
+This problem, set jointly by Baidu and the [BUAA LEVIR team](http://levir.buaa.edu.cn/), requires participants to develop on the Baidu AI Studio platform with the domestically developed PaddlePaddle deep learning framework, and to design and build a web system that can automatically interpret remote sensing images using deep learning technology.
+
+### Task Description
+
+In the change detection part, participants are required to use the provided training data to detect building changes in multi-temporal images. Specifically, the task of building change detection in multi-temporal remote sensing images is: given two geographically registered remote sensing images of the same location taken at different times, locate the areas where buildings have changed.
+
+Reference link: [What is remote sensing image change detection?](https://baike.baidu.com/item/%E5%8F%98%E5%8C%96%E6%A3%80%E6%B5%8B/8636264)
+
+### Dataset Introduction
+
+See [Dataset Link](https://aistudio.baidu.com/aistudio/datasetdetail/134796) and [Competition introduction](https://aistudio.baidu.com/aistudio/competition/detail/151/0/task-definition).
+
+## Data Preprocessing
+
+```python
+# Divide the training/validation sets and generate file name lists
+
+import random
+import os.path as osp
+from glob import glob
+
+
+# Random number generator seed
+RNG_SEED = 114514
+# Adjust this parameter to control the proportion of training set data
+TRAIN_RATIO = 0.95
+# Dataset path
+DATA_DIR = '/home/aistudio/data/data134796/dataset/'
+
+
+def write_rel_paths(phase, names, out_dir, prefix=''):
+    """The relative path of a file is stored in a txt file"""
+    with open(osp.join(out_dir, phase+'.txt'), 'w') as f:
+        for name in names:
+            f.write(
+                ' '.join([
+                    osp.join(prefix, 'A', name),
+                    osp.join(prefix, 'B', name),
+                    osp.join(prefix, 'label', name)
+                ])
+            )
+            f.write('\n')
+
+
+random.seed(RNG_SEED)
+
+# Randomly divide the training/validation sets
+names = list(map(osp.basename, glob(osp.join(DATA_DIR, 'train', 'label', '*.png'))))
+# Sort file names to ensure consistent results over multiple runs
+names.sort()
+random.shuffle(names)
+len_train = int(len(names)*TRAIN_RATIO)
+write_rel_paths('train', names[:len_train], DATA_DIR, prefix='train')
+write_rel_paths('val', names[len_train:], DATA_DIR, prefix='train')
+write_rel_paths(
+    'test',
+    map(osp.basename, glob(osp.join(DATA_DIR, 'test', 'A', '*.png'))),
+    DATA_DIR,
+    prefix='test'
+)
+
+print("Dataset partitioning completed.")
+
+```
+
+## Model Training and Inference
+
+This project uses the [PaddleRS](https://github.com/PaddlePaddle/PaddleRS) suite to build the model training and inference framework. PaddleRS is a remote sensing processing platform developed on top of PaddlePaddle. It supports common remote sensing tasks such as remote sensing image classification, object detection, image segmentation, and change detection, and can help developers more easily complete the whole process of remote sensing deep learning applications, from training to deployment. For change detection, PaddleRS currently supports nine state-of-the-art (SOTA) models, and complex training and inference processes are encapsulated in a few APIs to provide an out-of-the-box user experience.
+
+```python
+# Installing third-party libraries
+!pip install scikit-image > /dev/null
+!pip install matplotlib==3.4 > /dev/null
+
+# Install PaddleRS (cached version on aistudio)
+!unzip -o -d /home/aistudio/ /home/aistudio/data/data135375/PaddleRS-develop.zip > /dev/null
+!mv /home/aistudio/PaddleRS-develop /home/aistudio/PaddleRS
+!pip install -e /home/aistudio/PaddleRS > /dev/null
+# Because 'sys.path' may not be updated in time, we update it manually here
+import sys
+sys.path.append('/home/aistudio/PaddleRS')
+```
+
+```python
+# Import the required libraries
+
+import random
+import os
+import os.path as osp
+from copy import deepcopy
+from functools import partial
+
+import numpy as np
+import paddle
+import paddlers as pdrs
+from paddlers import transforms as T
+from PIL import Image
+from skimage.io import imread, imsave
+from tqdm import tqdm
+from matplotlib import pyplot as plt
+```
+
+```python
+# Define global variables
+# You can adjust the experimental hyperparameters here
+
+# Random seed
+SEED = 1919810
+
+# Dataset path
+DATA_DIR = '/home/aistudio/data/data134796/dataset/'
+# Experimental path. The output model weights and results are saved in the experimental directory
+EXP_DIR = '/home/aistudio/exp/'
+# Path to save the best model
+BEST_CKP_PATH = osp.join(EXP_DIR, 'best_model', 'model.pdparams')
+
+# The number of epochs trained
+NUM_EPOCHS = 100
+# Save the model weight parameter every N epochs
+SAVE_INTERVAL_EPOCHS = 10
+# Initial learning rate
+LR = 0.001
+# Learning rate decay step (in iterations, not epochs); the learning rate is halved every DECAY_STEP iterations
+DECAY_STEP = 1000
+# batch size
+BATCH_SIZE = 16
+# The number of processes used to load data
+NUM_WORKERS = 4
+# Cropped patch size
+CROP_SIZE = 256
+# Sliding-window stride used during model inference
+STRIDE = 64
+# Original image size
+ORIGINAL_SIZE = (1024, 1024)
+```
+
+```python
+# Fix random seeds to make the experimental results as reproducible as possible
+
+random.seed(SEED)
+np.random.seed(SEED)
+paddle.seed(SEED)
+```
+
+```python
+# Define some auxiliary functions
+
+def info(msg, **kwargs):
+    print(msg, **kwargs)
+
+
+def warn(msg, **kwargs):
+    print('\033[0;31m'+msg, **kwargs)
+
+
+def quantize(arr):
+    return (arr*255).astype('uint8')
+```
+
+### Model Construction
+
+As a demonstration, this project uses BIT-CD [1], a Transformer-based change detection model proposed by the LEVIR group in 2021. For the paper, please refer to [this link](https://ieeexplore.ieee.org/document/9491802); for the original authors' official implementation, please refer to [this link](https://github.com/justchenhao/BIT_CD).
+
+> [1] Hao Chen, Zipeng Qi, and Zhenwei Shi. **Remote Sensing Image Change Detection with Transformers.** *IEEE Transactions on Geoscience and Remote Sensing.*
+
+```python
+# Call the PaddleRS API to build the model
+model = pdrs.tasks.BIT(
+    # Number of model output categories
+    num_classes=2,
+    # Whether to use mixed loss function, the default is to use cross entropy loss function training
+    use_mixed_loss=False,
+    # Number of model input channels
+    in_channels=3,
+    # Backbone network used by the model, supporting 'resnet18' or 'resnet34'
+    backbone='resnet18',
+    # The number of resnet stages in the backbone network
+    n_stages=4,
+    # Whether to use tokenizer to get semantic tokens
+    use_tokenizer=True,
+    # token length
+    token_len=4,
+    # If the tokenizer is not used, the token is obtained using pooling. This parameter sets the pooling mode, with 'max' and 'avg' options corresponding to maximum pooling and average pooling, respectively
+    pool_mode='max',
+    # Width and height of the pooled output feature map (the pooled token length is the square of pool_size)
+    pool_size=2,
+    # Whether to include positional embedding in Transformer encoder
+    enc_with_pos=True,
+    # Number of attention blocks used by the Transformer encoder
+    enc_depth=1,
+    # Embedding dimension of each attention head in the Transformer encoder
+    enc_head_dim=64,
+    # Number of attention modules used by the Transformer decoder
+    dec_depth=8,
+    # Embedding dimension of each attention head in the Transformer decoder
+    dec_head_dim=8
+)
+```
+
+### Dataset Construction
+
+```python
+# Build the data transform needed (data augmentation, preprocessing)
+# Compose multiple transformations with Compose; the contained transformations will be executed in order
+train_transforms = T.Compose([
+    # Random cropping
+    T.RandomCrop(
+        # The clipping area will be scaled to this size
+        crop_size=CROP_SIZE,
+        # Fix the horizontal to vertical ratio of the clipped area to 1
+        aspect_ratio=[1.0, 1.0],
+        # The scale of the cropped area relative to the original image varies within a range whose minimum is 1/5 of the original side length
+        scaling=[0.2, 1.0]
+    ),
+    # Perform a random horizontal flip with a 50% probability
+    T.RandomHorizontalFlip(prob=0.5),
+    # Perform a random vertical flip with a 50% probability
+    T.RandomVerticalFlip(prob=0.5),
+    # Data normalization to [-1,1]
+    T.Normalize(
+        mean=[0.5, 0.5, 0.5],
+        std=[0.5, 0.5, 0.5]
+    )
+])
+eval_transforms = T.Compose([
+    # In the validation phase, full-size images are input and only normalized
+    # Normalization must be performed the same way in the validation and training phases
+    T.Normalize(
+        mean=[0.5, 0.5, 0.5],
+        std=[0.5, 0.5, 0.5]
+    )
+])
+
+# Instantiate the datasets
+train_dataset = pdrs.datasets.CDDataset(
+    data_dir=DATA_DIR,
+    file_list=osp.join(DATA_DIR, 'train.txt'),
+    label_list=None,
+    transforms=train_transforms,
+    num_workers=NUM_WORKERS,
+    shuffle=True,
+    binarize_labels=True
+)
+eval_dataset = pdrs.datasets.CDDataset(
+    data_dir=DATA_DIR,
+    file_list=osp.join(DATA_DIR, 'val.txt'),
+    label_list=None,
+    transforms=eval_transforms,
+    num_workers=0,
+    shuffle=False,
+    binarize_labels=True
+)
+```
+
+### Model Training
+
+With AI Studio Premium hardware configuration (16G V100) and default hyperparameters, the total training time is about 50 minutes.
+
+If the VisualDL logging function is enabled during training (enabled by default), you can view the visualized results on the data model visualization tab. Set logdir to the vdl_log subdirectory in the `EXP_DIR` directory. A tutorial on using VisualDL in a notebook is available [here](https://ai.baidu.com/ai-doc/AISTUDIO/Dk3e2vxg9#visualdl%E5%B7%A5%E5%85%B7).
+
+It should be noted that PaddleRS selects the best model on the validation set using mIoU by default, whereas the competition officially uses the F1 score as the evaluation metric.
+
+In addition, PaddleRS reports the metrics of each category on the validation set. Therefore, for binary change detection, metrics such as category_acc and category_F1-score contain two entries each, presented as lists. Since the change detection task focuses on the changed class, it makes more sense to observe and compare the second entry of each metric (the second element of the list).
+
+```python
+# If the experiment directory does not exist, create it (recursively)
+if not osp.exists(EXP_DIR):
+    os.makedirs(EXP_DIR)
+```
+
+```python
+# Build the learning rate scheduler and optimizer
+
+# Use a learning rate decay schedule with a fixed step size
+lr_scheduler = paddle.optimizer.lr.StepDecay(
+    LR,
+    step_size=DECAY_STEP,
+    # Learning rate decay factor; here the learning rate is halved at each decay
+    gamma=0.5
+)
+# Construct the Adam optimizer
+optimizer = paddle.optimizer.Adam(
+    learning_rate=lr_scheduler,
+    # In PaddleRS, the paddle.nn.Layer network object can be obtained via the net property of the ChangeDetector object
+    parameters=model.net.parameters()
+)
+```
+
+```python
+# Call the PaddleRS API to implement one-click training
+model.train(
+    num_epochs=NUM_EPOCHS,
+    train_dataset=train_dataset,
+    train_batch_size=BATCH_SIZE,
+    eval_dataset=eval_dataset,
+    optimizer=optimizer,
+    save_interval_epochs=SAVE_INTERVAL_EPOCHS,
+    # Interval (in iterations) at which logs are printed
+    log_interval_steps=10,
+    save_dir=EXP_DIR,
+    # Whether to use an early stopping strategy, stopping training early when accuracy does not improve
+    early_stop=False,
+    # Whether to enable VisualDL logging
+    use_vdl=True,
+    # Specify a checkpoint from which to continue training
+    resume_checkpoint=None
+)
+```
+
+### Model Inference
+
+Using the AI Studio Premium hardware configuration (16G V100) and default hyperparameters, the total inference time is about 3 minutes.
+
+The inference script uses a fixed threshold to obtain the binary change map from the change probability map. The default threshold is 0.5, which can be adjusted according to the actual performance of the model. Of course, you can also switch to more advanced threshold segmentation algorithms such as [Otsu's method](https://baike.baidu.com/item/otsu/16252828?fr=aladdin) or [k-means clustering](https://baike.baidu.com/item/K%E5%9D%87%E5%80%BC%E8%81%9A%E7%B1%BB%E7%AE%97%E6%B3%95/15779627).
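+
+As a minimal sketch (assuming `prob` is the reconstructed change probability map with values in [0, 1], and `quantize` is the helper defined earlier), swapping in Otsu's method could look like this:
+
+```python
+from skimage.filters import threshold_otsu
+
+# Compute a data-driven threshold instead of the fixed 0.5
+thresh = threshold_otsu(prob)
+out = quantize(prob > thresh)
+```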
+
+The result of model forward inference is stored in the out subdirectory under the `EXP_DIR` directory, and the files in this subdirectory can be packaged, renamed and submitted to the competition system. Please read the [Submission Specification](https://aistudio.baidu.com/aistudio/competition/detail/151/0/submit-result) carefully before submitting your results.
+
+```python
+# Define the data set to be used in the inference
+
+class InferDataset(paddle.io.Dataset):
+    """
+    Dataset for change detection inference.
+
+    Args:
+        data_dir (str): Root directory of the dataset.
+        transforms (paddlers.transforms.Compose): Data transformations to apply.
+    """
+
+    def __init__(
+        self,
+        data_dir,
+        transforms
+    ):
+        super().__init__()
+
+        self.data_dir = data_dir
+        self.transforms = deepcopy(transforms)
+
+        pdrs.transforms.arrange_transforms(
+            model_type='changedetector',
+            transforms=self.transforms,
+            mode='test'
+        )
+
+        with open(osp.join(data_dir, 'test.txt'), 'r') as f:
+            lines = f.read()
+            lines = lines.strip().split('\n')
+
+        samples = []
+        names = []
+        for line in lines:
+            items = line.strip().split(' ')
+            items = list(map(pdrs.utils.norm_path, items))
+            item_dict = {
+                'image_t1': osp.join(data_dir, items[0]),
+                'image_t2': osp.join(data_dir, items[1])
+            }
+            samples.append(item_dict)
+            names.append(osp.basename(items[0]))
+
+        self.samples = samples
+        self.names = names
+
+    def __getitem__(self, idx):
+        sample = deepcopy(self.samples[idx])
+        output = self.transforms(sample)
+        return paddle.to_tensor(output[0]), \
+               paddle.to_tensor(output[1])
+
+    def __len__(self):
+        return len(self.samples)
+```
+
+```python
+# Since the original images are large, the following classes and functions crop them into patches and stitch the results back together.
+
+class WindowGenerator:
+    def __init__(self, h, w, ch, cw, si=1, sj=1):
+        self.h = h
+        self.w = w
+        self.ch = ch
+        self.cw = cw
+        if self.h < self.ch or self.w < self.cw:
+            raise NotImplementedError
+        self.si = si
+        self.sj = sj
+        self._i, self._j = 0, 0
+
+    def __next__(self):
+        # Traverse in row-major (C) order
+        if self._i > self.h:
+            raise StopIteration
+
+        bottom = min(self._i+self.ch, self.h)
+        right = min(self._j+self.cw, self.w)
+        top = max(0, bottom-self.ch)
+        left = max(0, right-self.cw)
+
+        if self._j >= self.w-self.cw:
+            if self._i >= self.h-self.ch:
+                # Set an invalid value so that the iteration can early stop
+                self._i = self.h+1
+            self._goto_next_row()
+        else:
+            self._j += self.sj
+            if self._j > self.w:
+                self._goto_next_row()
+
+        return slice(top, bottom, 1), slice(left, right, 1)
+
+    def __iter__(self):
+        return self
+
+    def _goto_next_row(self):
+        self._i += self.si
+        self._j = 0
+
+
+def crop_patches(dataloader, ori_size, window_size, stride):
+    """
+    Crop the data in 'dataloader' into patches.
+
+    Args:
+        dataloader (paddle.io.DataLoader): Iterable object that produces raw samples (each containing any number of images).
+        ori_size (tuple): Height and width of the original images, as a tuple (h, w).
+        window_size (int): Size of the cropped patches.
+        stride (int): Number of pixels the sliding window moves horizontally or vertically at each step.
+
+    Returns:
+        A generator that yields the cropping result for each item in iter('dataloader'). The patches of an image are concatenated along the batch dimension. For example, when 'ori_size' is 1024 and both 'window_size' and 'stride' are 512, the batch size of each item yielded by 'crop_patches' will be four times that of the corresponding item in iter('dataloader').
+    """
+
+    for ims in dataloader:
+        ims = list(ims)
+        h, w = ori_size
+        win_gen = WindowGenerator(h, w, window_size, window_size, stride, stride)
+        all_patches = []
+        for rows, cols in win_gen:
+            # NOTE: You cannot use a generator here, or the result will not be as expected because of lazy evaluation
+            patches = [im[...,rows,cols] for im in ims]
+            all_patches.append(patches)
+        yield tuple(map(partial(paddle.concat, axis=0), zip(*all_patches)))
+
+
+def recons_prob_map(patches, ori_size, window_size, stride):
+    """The original dimension image is reconstructed from the cut patches corresponding to 'crop_patches'"""
+    # NOTE: Currently, only batch size 1 can be processed
+    h, w = ori_size
+    win_gen = WindowGenerator(h, w, window_size, window_size, stride, stride)
+    prob_map = np.zeros((h,w), dtype=np.float32)
+    cnt = np.zeros((h,w), dtype=np.float32)
+    # XXX: Ensure that the win_gen and patches are of the same length. Not checked here
+    for (rows, cols), patch in zip(win_gen, patches):
+        prob_map[rows, cols] += patch
+        cnt[rows, cols] += 1
+    prob_map /= cnt
+    return prob_map
+```
+
+```python
+# If the output directory does not exist, create it (recursively)
+out_dir = osp.join(EXP_DIR, 'out')
+if not osp.exists(out_dir):
+    os.makedirs(out_dir)
+
+# Load the best model weights saved during training
+state_dict = paddle.load(BEST_CKP_PATH)
+# The networking object is also accessed through the net property
+model.net.set_state_dict(state_dict)
+
+# Instantiate the test set
+test_dataset = InferDataset(
+    DATA_DIR,
+    # Note that the normalization used during the test phase needs to be the same as during the training
+    T.Compose([
+        T.Normalize(
+            mean=[0.5, 0.5, 0.5],
+            std=[0.5, 0.5, 0.5]
+        )
+    ])
+)
+
+# Create DataLoader
+test_dataloader = paddle.io.DataLoader(
+    test_dataset,
+    batch_size=1,
+    shuffle=False,
+    num_workers=0,
+    drop_last=False,
+    return_list=True
+)
+test_dataloader = crop_patches(
+    test_dataloader,
+    ORIGINAL_SIZE,
+    CROP_SIZE,
+    STRIDE
+)
+```
+
+```python
+# Main loop of the inference process
+info("Model inference begins")
+
+model.net.eval()
+len_test = len(test_dataset.names)
+with paddle.no_grad():
+    for name, (t1, t2) in tqdm(zip(test_dataset.names, test_dataloader), total=len_test):
+        pred = model.net(t1, t2)[0]
+        # Take the output of the first (counting from 0) channel of the softmax result as the probability of change
+        prob = paddle.nn.functional.softmax(pred, axis=1)[:,1]
+        # Reconstruct the probability map from the patches
+        prob = recons_prob_map(prob.numpy(), ORIGINAL_SIZE, CROP_SIZE, STRIDE)
+        # By default, the threshold is set to 0.5, that is, pixels with a change probability greater than 0.5 are classified into change categories
+        out = quantize(prob>0.5)
+
+        imsave(osp.join(out_dir, name), out, check_contrast=False)
+
+info("Completion of model inference")
+
+```
+
+```python
+# Inference result presentation
+# Run this cell repeatedly to view different results
+
+def show_images_in_row(im_paths, fig, title=''):
+    n = len(im_paths)
+    fig.suptitle(title)
+    axs = fig.subplots(nrows=1, ncols=n)
+    for path, ax in zip(im_paths, axs):
+        # Remove the tick marks and borders
+        ax.spines['top'].set_visible(False)
+        ax.spines['right'].set_visible(False)
+        ax.spines['bottom'].set_visible(False)
+        ax.spines['left'].set_visible(False)
+        ax.get_xaxis().set_ticks([])
+        ax.get_yaxis().set_ticks([])
+
+        im = imread(path)
+        ax.imshow(im)
+
+
+# The number of samples to be displayed
+num_imgs_to_show = 4
+# Random sampling
+chosen_indices = random.choices(range(len_test), k=num_imgs_to_show)
+
+# Refer https://stackoverflow.com/a/68209152
+fig = plt.figure(constrained_layout=True)
+fig.suptitle("Inference Results")
+
+subfigs = fig.subfigures(nrows=3, ncols=1)
+
+# Read the first phase image
+im_paths = [osp.join(DATA_DIR, test_dataset.samples[idx]['image_t1']) for idx in chosen_indices]
+show_images_in_row(im_paths, subfigs[0], title='Image 1')
+
+# Read the second phase image
+im_paths = [osp.join(DATA_DIR, test_dataset.samples[idx]['image_t2']) for idx in chosen_indices]
+show_images_in_row(im_paths, subfigs[1], title='Image 2')
+
+# Read the change image
+im_paths = [osp.join(out_dir, test_dataset.names[idx]) for idx in chosen_indices]
+show_images_in_row(im_paths, subfigs[2], title='Change Map')
+
+# Render result
+fig.canvas.draw()
+Image.frombytes('RGB', fig.canvas.get_width_height(), fig.canvas.tostring_rgb())
+```
+
+![output_23_0](https://user-images.githubusercontent.com/71769312/161358173-552a7cca-b5b5-4e5e-8d10-426f40df530b.png)
+
+## Reference Material
+
+- [Introduction to remote sensing data](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/data/rs_data.md)
+- [PaddleRS Document](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tutorials/train/README.md)

+ 224 - 0
docs/cases/sr_seg_en.md

@@ -0,0 +1,224 @@
+# [Use Image Super-Resolution to Improve the Segmentation Accuracy of Low Resolution UAV Images](https://aistudio.baidu.com/aistudio/projectdetail/3696814)
+
+## 1 Project Background
+
+- I recently wrote a project, [PaddleSeg: Segmentation of aerial remote sensing images using the Transformer model](https://aistudio.baidu.com/aistudio/projectdetail/3565870), which used the PaddleSeg module to train Transformer semantic segmentation models. The **mIoU reached 74.50%** on the UDD6 dataset, **1.32%** higher than the 73.18% reported in the original paper. The training results are as follows: car: red; road: light blue; vegetation: dark blue; building facade: bright green; building roof: purple; others: burnt green.
+
+```python
+%cd /home/aistudio/
+import matplotlib.pyplot as plt
+from PIL import Image
+
+output = Image.open(r"work/example/Seg/UDD6_result/added_prediction/000161.JPG")
+
+plt.figure(figsize=(18, 12))  # Set window size
+plt.imshow(output), plt.axis('off')
+```
+
+![output_1_2](https://user-images.githubusercontent.com/71769312/161358238-5dc85c26-de33-4552-83ea-ad9936a5c85a.png)
+
+- The training results were very good. The UDD6 data was collected in four cities (Beijing, Huludao, Cangzhou, and Zhengzhou) with a DJI Phantom 4 UAV at flight altitudes of 60m-100m. However, **in actual production, the city, the flight altitude, and the image quality will all vary.**
+- As the flight altitude increases, a larger area can be covered in the same amount of time, but the resolution decreases. **For low-quality data, directly applying the previously trained model gives unsatisfactory predictions, while annotating new data and retraining the model is a heavy workload.** One solution is to improve the generalization ability of the model; another is to use image super-resolution to reconstruct the low-quality UAV images before making predictions.
+- In this project, the UAV remote sensing image super-resolution module provided by PaddleRS is used to perform **super-resolution** on real low-quality UAV images, and then the SegFormer model trained on UDD6 is used for prediction, compared against prediction directly on the low-resolution images. Since the low-quality data is not annotated, quantitative metrics cannot be computed; however, visual inspection shows that the predictions after super-resolution are better. **Below, the left is the manually annotated label, the middle is the prediction on the low-resolution image, and the right is the prediction after super-resolution reconstruction.**
+
+```python
+img = Image.open(r"work/example/Seg/gt_result/data_05_2_14.png")
+lq = Image.open(r"work/example/Seg/lq_result/added_prediction/data_05_2_14.png")
+sr = Image.open(r"work/example/Seg/sr_result/added_prediction/data_05_2_14.png")
+
+plt.figure(figsize=(18, 12))
+plt.subplot(1,3,1), plt.title('GT')
+plt.imshow(img), plt.axis('off')
+plt.subplot(1,3,2), plt.title('predict_LR')
+plt.imshow(lq), plt.axis('off')
+plt.subplot(1,3,3), plt.title('predict_SR')
+plt.imshow(sr), plt.axis('off')
+plt.show()
+```
+
+![output_3_0](https://user-images.githubusercontent.com/71769312/161358300-b85cdda4-7d1f-40e7-a39b-74b2cd5347b6.png)
+
+## 2 Data Introduction and Presentation
+- The data was collected with a DJI Phantom 4 UAV in **Shanghai at a flight altitude of 300m**. The weather at the time of collection was normal and the image quality is not high, as the examples below show. Since the goal is only to demonstrate the prediction effect after super-resolution reconstruction, we annotated just 5 images roughly. **After all, annotating data is really laborious!** It would be nice to be able to predict on your own data using models trained on open datasets.
+- Part of the annotated data is shown below
+
+```python
+add_lb = Image.open(r"work/example/Seg/gt_result/data_05_2_19.png")
+lb = Image.open(r"work/example/Seg/gt_label/data_05_2_19.png")
+img = Image.open(r"work/ValData/DJI300/data_05_2_19.png")
+
+plt.figure(figsize=(18, 12))
+plt.subplot(1,3,1), plt.title('image')
+plt.imshow(img), plt.axis('off')
+plt.subplot(1,3,2), plt.title('label')
+plt.imshow(lb), plt.axis('off')
+plt.subplot(1,3,3), plt.title('add_label')
+plt.imshow(add_lb), plt.axis('off')
+plt.show()
+```
+
+![output_5_0](https://user-images.githubusercontent.com/71769312/161358312-3c16cbb0-1162-4fbe-b3d6-9403502aefef.png)
+
+## 3 Unmanned Aerial Vehicle Remote Sensing Image Super-Resolution
+- Since PaddleRS provides pre-trained super-resolution models, this step mainly consists of the following two parts:
+    - Prepare PaddleRS and set up the environment
+    - Call the super-resolution prediction interface in PaddleRS to perform **super-resolution reconstruction** of the low-resolution UAV images
+
+```python
+# Clone the repository from github
+!git clone https://github.com/PaddlePaddle/PaddleRS.git
+```
+
+```python
+# Install dependencies, which takes about a minute
+%cd PaddleRS/
+!pip install -r requirements.txt
+```
+
+```python
+# For image super-resolution processing, the model used is DRN
+import os
+import paddle
+import numpy as np
+from PIL import Image
+from paddlers.models.ppgan.apps.drn_predictor import DRNPredictor
+
+# The folder where the prediction results are output
+output = r'../work/example'
+# Directory of the low-resolution input images
+input_dir = r"../work/ValData/DJI300"
+
+paddle.device.set_device("gpu:0")  # to use the CPU, call paddle.device.set_device("cpu")
+predictor = DRNPredictor(output)  # Instantiate the predictor
+
+filenames = [f for f in os.listdir(input_dir) if f.endswith('.png')]
+for filename in filenames:
+    imgPath = os.path.join(input_dir, filename)  
+    predictor.run(imgPath)  # prediction
+```
+
+- Comparison of the images before and after super-resolution reconstruction
+
+```python
+# visualization
+import os
+import matplotlib.pyplot as plt
+%matplotlib inline
+
+lq_dir = r"../work/ValData/DJI300"  # Low resolution image folder
+sr_dir = r"../work/example/DRN"  # super-resolution image folder
+img_list = [f for f in os.listdir(lq_dir) if f.endswith('.png')]
+show_num = 3  # How many pairs of images are shown
+for i in range(show_num):
+    lq_box = (100, 100, 175, 175)
+    sr_box = (400, 400, 700, 700)
+    filename = img_list[i]
+    image = Image.open(os.path.join(lq_dir, filename)).crop(lq_box)  # Read low resolution images
+    sr_img = Image.open(os.path.join(sr_dir, filename)).crop(sr_box)  # Read super-resolution images
+
+    plt.figure(figsize=(12, 8))
+    plt.subplot(1,2,1), plt.title('Input')
+    plt.imshow(image), plt.axis('off')
+    plt.subplot(1,2,2), plt.title('Output')
+    plt.imshow(sr_img), plt.axis('off')
+    plt.show()
+```
+
+![output_11_0](https://user-images.githubusercontent.com/71769312/161358324-c45d750d-b47e-4201-b70c-3c374498fd86.png)
+
+![output_11_1](https://user-images.githubusercontent.com/71769312/161358335-0b85035e-0a9d-4b5a-8d0c-14ecaeffd947.png)
+
+![output_11_2](https://user-images.githubusercontent.com/71769312/161358342-d2875098-cb9b-4bc2-99b0-bcab4c1bc5e1.png)
+
+## 4 Comparison of Image Segmentation Effect Before and After Super-Resolution
+
+- The model used is segformer_b3, trained for 40,000 iterations on the UDD6 dataset
+- The best-performing model and the .yml file have been placed in the work folder
+- Run the following commands to make predictions on the images in the specified folders
+- First, the model predicts on the low-quality UAV data; then it predicts on the images reconstructed by super-resolution; finally, the prediction results are compared
+
+```python
+%cd ..
+# clone PaddleSeg
+!git clone https://gitee.com/paddlepaddle/PaddleSeg
+```
+
+```python
+# install packages
+%cd /home/aistudio/PaddleSeg
+!pip install  -r requirements.txt
+```
+
+```python
+# Low resolution drone images are predicted
+!python predict.py \
+       --config ../work/segformer_b3_UDD.yml \
+       --model_path ../work/best_model/model.pdparams \
+       --image_path ../work/ValData/DJI300 \
+       --save_dir ../work/example/Seg/lq_result
+```
+
+```python
+# The image reconstructed by DRN was predicted
+!python predict.py \
+       --config ../work/segformer_b3_UDD.yml \
+       --model_path ../work/best_model/model.pdparams \
+       --image_path ../work/example/DRN \
+       --save_dir ../work/example/Seg/sr_result
+```
+
+**Prediction Result**
+- The colors are as follows:
+
+| Class | Color |
+|----------|---------|
+|  **Others**  |  Burnt green  |
+| Building facade |  Bright green  |
+|  **Road**  |  Light blue  |
+| Vegetation |  Dark blue  |
+|  **Car** |  Red  |
+| Roof |  Purple  |
+
+- Since only five images are annotated, only the results for these five images are shown. The remaining prediction results are all in the folder `work/example/Seg/`, where the left is the ground truth, the middle is the prediction on the low-resolution image, and the right is the prediction after super-resolution reconstruction
+
+```python
+# Show part of prediction result
+%cd /home/aistudio/
+import matplotlib.pyplot as plt
+from PIL import Image
+import os
+
+img_dir = r"work/example/Seg/gt_result"  # Ground truth folder
+lq_dir = r"work/example/Seg/lq_result/added_prediction"  # Low-resolution prediction results folder
+sr_dir = r"work/example/Seg/sr_result/added_prediction"  # Super-resolution prediction results folder
+img_list = [f for f in os.listdir(img_dir) if f.endswith('.png') ]
+for filename in img_list:
+    img = Image.open(os.path.join(img_dir, filename))
+    lq_pred = Image.open(os.path.join(lq_dir, filename))
+    sr_pred = Image.open(os.path.join(sr_dir, filename))
+
+    plt.figure(figsize=(12, 8))
+    plt.subplot(1,3,1), plt.title('GT')
+    plt.imshow(img), plt.axis('off')
+    plt.subplot(1,3,2), plt.title('LR_pred')
+    plt.imshow(lq_pred), plt.axis('off')
+    plt.subplot(1,3,3), plt.title('SR_pred')
+    plt.imshow(sr_pred), plt.axis('off')
+    plt.show()
+
+```
+
+![output_18_1](https://user-images.githubusercontent.com/71769312/161358523-42063419-b490-4fca-b0d4-cb2b05f7f74a.png)
+
+![output_18_2](https://user-images.githubusercontent.com/71769312/161358556-e2f66be4-4758-4c7a-9b3b-636aa2b53215.png)
+
+![output_18_3](https://user-images.githubusercontent.com/71769312/161358599-e74696f3-b374-4d5c-a9f4-7ffaef8938a0.png)
+
+![output_18_4](https://user-images.githubusercontent.com/71769312/161358621-c3c0d225-b67f-4bff-91ba-4be714162584.png)
+
+![output_18_5](https://user-images.githubusercontent.com/71769312/161358643-9aba7db1-6c68-48f2-be53-8eec30f27d60.png)
+
+## 5 Summary
+- This project called the super-resolution reconstruction interface provided by PaddleRS, used the DRN model to reconstruct low-resolution images acquired in the real world, and then segmented the reconstructed images. From the results, **the segmentation of images after super-resolution reconstruction is better**
+- **Deficiency**: compared with the low-resolution images, the prediction accuracy after super-resolution reconstruction is visibly improved, but it does not reach the level achieved on the UDD6 test set. Therefore, **the generalization ability of the model also needs to be improved; super-resolution reconstruction alone is not enough**
+- **Future work**: super-resolution reconstruction will be integrated into the PaddleRS transform module, so that it can be called before high-level task prediction to improve image quality. Please follow [PaddleRS](https://github.com/PaddlePaddle/PaddleRS) for updates

+ 543 - 0
docs/data/coco_tools_en.md

@@ -0,0 +1,543 @@
+# coco_tools Instructions
+
+## 1 Tool Description
+
+coco_tools is a set of tools provided by PaddleRS for handling COCO annotation files, located in the `tools/coco_tools/` directory. Because the [pycocotools library](https://pypi.org/project/pycocotools/) cannot be installed in some environments, PaddleRS provides coco_tools as an alternative for some simple file processing tasks.
+
+*Please note that coco_tools is currently an experimental feature. If you encounter any problems while using it, please give us feedback in a timely manner.*
+
+## 2 Document Description
+
+At present, coco_tools consists of 6 files, each with the following function:
+
+- `json_InfoShow.py`:    Print basic information about each dictionary in the json file.
+- `json_ImgSta.py`:      Collect image information in json files and generate statistical tables and charts.
+- `json_AnnoSta.py`:     Collect annotation information in json files to generate statistical tables and charts.
+- `json_Img2Json.py`:    Collect test set images and generate a json file.
+- `json_Split.py`:       Divide the json file into train sets and val sets.
+- `json_Merge.py`:       Merge multiple json files into one.
+
+## 3 Usage Example
+
+### 3.1 Sample Dataset
+
+This document uses the COCO 2017 dataset as sample data for demonstration. You can download the dataset from the following links:
+
+- [Official download link](https://cocodataset.org/#download)
+- [aistudio backup link](https://aistudio.baidu.com/aistudio/datasetdetail/7122)
+
+After the download is complete, you can copy or link the `coco_tools` directory from the PaddleRS project to the dataset directory for later use. The complete dataset directory structure is as follows:
+
+```
+./COCO2017/      # dataset root directory
+|--train2017     # training dataset original image directory
+|  |--...
+|  |--...
+|--val2017       # validation dataset original image directory
+|  |--...
+|  |--...
+|--test2017      # test dataset original image directory
+|  |--...
+|  |--...
+|
+|--annotations   # annotation files directory
+|  |--...
+|  |--...
+|
+|--coco_tools    # coco_tools code directory
+|  |--...
+|  |--...
+```
+
+### 3.2 Printing json Information
+
+Using `json_InfoShow.py`, you can print the key of each key-value pair in a json file and output the first few elements of each value, which helps you quickly understand the annotation information. For COCO-format annotation data, you should pay special attention to the contents of the `images` and `annotations` fields.
+
+#### 3.2.1 Command Demonstration
+
+Run the following command to print the information in `instances_val2017.json`:
+
+```
+python ./coco_tools/json_InfoShow.py \
+       --json_path=./annotations/instances_val2017.json \
+       --show_num 5
+```
+
+#### 3.2.2 Parameter Description
+
+
+| Parameter Name| Description                                                | Default value |
+| ------------- | -----------------------------------------------------------| ------------- |
+| `--json_path` | Path of the json file whose statistics are to be collected.|               |
+| `--show_num`  | (Optional) Number of leading elements of each value to output. | `5`           |
+| `--Args_show` | (Optional) Whether to print input parameter information.   | `True`        |
+
+#### 3.2.3 Result Presentation
+
+After the preceding command is executed, the following information will be displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+json_path = ./annotations/instances_val2017.json
+show_num = 5
+Args_show = True
+
+------------------------------------------------Info------------------------------------------------
+json read...
+json keys: dict_keys(['info', 'licenses', 'images', 'annotations', 'categories'])
+
+***********************info***********************
+ Content Type: dict
+ Total Length: 6
+ First 5 record:
+
+description : COCO 2017 Dataset
+url : http://cocodataset.org
+version : 1.0
+year : 2017
+contributor : COCO Consortium
+...
+...
+
+*********************licenses*********************
+ Content Type: list
+ Total Length: 8
+ First 5 record:
+
+{'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/', 'id': 1, 'name': 'Attribution-NonCommercial-ShareAlike License'}
+{'url': 'http://creativecommons.org/licenses/by-nc/2.0/', 'id': 2, 'name': 'Attribution-NonCommercial License'}
+{'url': 'http://creativecommons.org/licenses/by-nc-nd/2.0/', 'id': 3, 'name': 'Attribution-NonCommercial-NoDerivs License'}
+{'url': 'http://creativecommons.org/licenses/by/2.0/', 'id': 4, 'name': 'Attribution License'}
+{'url': 'http://creativecommons.org/licenses/by-sa/2.0/', 'id': 5, 'name': 'Attribution-ShareAlike License'}
+...
+...
+
+**********************images**********************
+ Content Type: list
+ Total Length: 5000
+ First 5 record:
+
+{'license': 4, 'file_name': '000000397133.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000397133.jpg', 'height': 427, 'width': 640, 'date_captured': '2013-11-14 17:02:52', 'flickr_url': 'http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg', 'id': 397133}
+{'license': 1, 'file_name': '000000037777.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000037777.jpg', 'height': 230, 'width': 352, 'date_captured': '2013-11-14 20:55:31', 'flickr_url': 'http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg', 'id': 37777}
+{'license': 4, 'file_name': '000000252219.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000252219.jpg', 'height': 428, 'width': 640, 'date_captured': '2013-11-14 22:32:02', 'flickr_url': 'http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg', 'id': 252219}
+{'license': 1, 'file_name': '000000087038.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000087038.jpg', 'height': 480, 'width': 640, 'date_captured': '2013-11-14 23:11:37', 'flickr_url': 'http://farm8.staticflickr.com/7355/8825114508_b0fa4d7168_z.jpg', 'id': 87038}
+{'license': 6, 'file_name': '000000174482.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000174482.jpg', 'height': 388, 'width': 640, 'date_captured': '2013-11-14 23:16:55', 'flickr_url': 'http://farm8.staticflickr.com/7020/6478877255_242f741dd1_z.jpg', 'id': 174482}
+...
+...
+
+*******************annotations********************
+ Content Type: list
+ Total Length: 36781
+ First 5 record:
+
+{'segmentation': [[510.66, 423.01, 511.72, 420.03, 510.45, 416.0, 510.34, 413.02, 510.77, 410.26, 510.77, 407.5, 510.34, 405.16, 511.51, 402.83, 511.41, 400.49, 510.24, 398.16, 509.39, 397.31, 504.61, 399.22, 502.17, 399.64, 500.89, 401.66, 500.47, 402.08, 499.09, 401.87, 495.79, 401.98, 490.59, 401.77, 488.79, 401.77, 485.39, 398.58, 483.9, 397.31, 481.56, 396.35, 478.48, 395.93, 476.68, 396.03, 475.4, 396.77, 473.92, 398.79, 473.28, 399.96, 473.49, 401.87, 474.56, 403.47, 473.07, 405.59, 473.39, 407.71, 476.68, 409.41, 479.23, 409.73, 481.56, 410.69, 480.4, 411.85, 481.35, 414.93, 479.86, 418.65, 477.32, 420.03, 476.04, 422.58, 479.02, 422.58, 480.29, 423.01, 483.79, 419.93, 486.66, 416.21, 490.06, 415.57, 492.18, 416.85, 491.65, 420.24, 492.82, 422.9, 493.56, 424.39, 496.43, 424.6, 498.02, 423.01, 498.13, 421.31, 497.07, 420.03, 497.07, 415.15, 496.33, 414.51, 501.1, 411.96, 502.06, 411.32, 503.02, 415.04, 503.33, 418.12, 501.1, 420.24, 498.98, 421.63, 500.47, 424.39, 505.03, 423.32, 506.2, 421.31, 507.69, 419.5, 506.31, 423.32, 510.03, 423.01, 510.45, 423.01]], 'area': 702.1057499999998, 'iscrowd': 0, 'image_id': 289343, 'bbox': [473.07, 395.93, 38.65, 28.67], 'category_id': 18, 'id': 1768}
+{'segmentation': [[289.74, 443.39, 302.29, 445.32, 308.09, 427.94, 310.02, 416.35, 304.23, 405.73, 300.14, 385.01, 298.23, 359.52, 295.04, 365.89, 282.3, 362.71, 275.29, 358.25, 277.2, 346.14, 280.39, 339.13, 284.85, 339.13, 291.22, 338.49, 293.77, 335.95, 295.04, 326.39, 297.59, 317.47, 289.94, 309.82, 287.4, 288.79, 286.12, 275.41, 284.21, 271.59, 279.11, 276.69, 275.93, 275.41, 272.1, 271.59, 274.01, 267.77, 275.93, 265.22, 277.84, 264.58, 282.3, 251.2, 293.77, 238.46, 307.79, 221.25, 314.79, 211.69, 325.63, 205.96, 338.37, 205.32, 347.29, 205.32, 353.03, 205.32, 361.31, 200.23, 367.95, 202.02, 372.27, 205.8, 382.52, 215.51, 388.46, 225.22, 399.25, 235.47, 399.25, 252.74, 390.08, 247.34, 386.84, 247.34, 388.46, 256.52, 397.09, 268.93, 413.28, 298.6, 421.91, 356.87, 424.07, 391.4, 422.99, 409.74, 420.29, 428.63, 415.43, 433.48, 407.88, 414.6, 405.72, 391.94, 401.41, 404.89, 394.39, 420.54, 391.69, 435.64, 391.15, 447.51, 387.38, 461.0, 384.68, 480.0, 354.47, 477.73, 363.1, 433.48, 370.65, 405.43, 369.03, 394.64, 361.48, 398.95, 355.54, 403.81, 351.77, 403.81, 343.68, 403.27, 339.36, 402.19, 335.58, 404.89, 333.42, 411.9, 332.34, 416.76, 333.42, 425.93, 334.5, 430.79, 336.12, 435.64, 321.01, 464.78, 316.16, 468.01, 307.53, 472.33, 297.28, 472.33, 290.26, 471.25, 285.94, 472.33, 283.79, 464.78, 280.01, 462.62, 284.33, 454.53, 285.94, 453.45, 282.71, 448.59, 288.64, 444.27, 291.88, 443.74]], 'area': 27718.476299999995, 'iscrowd': 0, 'image_id': 61471, 'bbox': [272.1, 200.23, 151.97, 279.77], 'category_id': 18, 'id': 1773}
+{'segmentation': [[147.76, 396.11, 158.48, 355.91, 153.12, 347.87, 137.04, 346.26, 125.25, 339.29, 124.71, 301.77, 139.18, 262.64, 159.55, 232.63, 185.82, 209.04, 226.01, 196.72, 244.77, 196.18, 251.74, 202.08, 275.33, 224.59, 283.9, 232.63, 295.16, 240.67, 315.53, 247.1, 327.85, 249.78, 338.57, 253.0, 354.12, 263.72, 379.31, 276.04, 395.39, 286.23, 424.33, 304.99, 454.95, 336.93, 479.62, 387.02, 491.58, 436.36, 494.57, 453.55, 497.56, 463.27, 493.08, 511.86, 487.02, 532.62, 470.4, 552.99, 401.26, 552.99, 399.65, 547.63, 407.15, 535.3, 389.46, 536.91, 374.46, 540.13, 356.23, 540.13, 354.09, 536.91, 341.23, 533.16, 340.15, 526.19, 342.83, 518.69, 355.7, 512.26, 360.52, 510.65, 374.46, 510.11, 375.53, 494.03, 369.1, 497.25, 361.06, 491.89, 361.59, 488.67, 354.63, 489.21, 346.05, 496.71, 343.37, 492.42, 335.33, 495.64, 333.19, 489.21, 327.83, 488.67, 323.0, 499.39, 312.82, 520.83, 304.24, 531.02, 291.91, 535.84, 273.69, 536.91, 269.4, 533.7, 261.36, 533.7, 256.0, 531.02, 254.93, 524.58, 268.33, 509.58, 277.98, 505.82, 287.09, 505.29, 301.56, 481.7, 302.1, 462.41, 294.06, 481.17, 289.77, 488.14, 277.98, 489.74, 261.36, 489.21, 254.93, 488.67, 254.93, 484.38, 244.75, 482.24, 247.96, 473.66, 260.83, 467.23, 276.37, 464.02, 283.34, 446.33, 285.48, 431.32, 287.63, 412.02, 277.98, 407.74, 260.29, 403.99, 257.61, 401.31, 255.47, 391.12, 233.8, 389.37, 220.18, 393.91, 210.65, 393.91, 199.76, 406.61, 187.51, 417.96, 178.43, 420.68, 167.99, 420.68, 163.45, 418.41, 158.01, 419.32, 148.47, 418.41, 145.3, 413.88, 146.66, 402.53]], 'area': 78969.31690000003, 'iscrowd': 0, 'image_id': 472375, 'bbox': [124.71, 196.18, 372.85, 356.81], 'category_id': 18, 'id': 2551}
+{'segmentation': [[260.4, 231.26, 215.06, 274.01, 194.33, 307.69, 195.63, 329.72, 168.42, 355.63, 120.49, 382.83, 112.71, 415.22, 159.35, 457.98, 172.31, 483.89, 229.31, 504.62, 275.95, 500.73, 288.91, 495.55, 344.62, 605.67, 395.14, 634.17, 480.0, 632.87, 480.0, 284.37, 404.21, 223.48, 336.84, 202.75, 269.47, 154.82, 218.95, 179.43, 203.4, 194.98, 190.45, 211.82, 233.2, 205.34]], 'area': 108316.66515000002, 'iscrowd': 0, 'image_id': 520301, 'bbox': [112.71, 154.82, 367.29, 479.35], 'category_id': 18, 'id': 3186}
+{'segmentation': [[200.61, 253.97, 273.19, 318.49, 302.43, 336.64, 357.87, 340.67, 402.23, 316.48, 470.78, 331.6, 521.19, 321.52, 583.69, 323.53, 598.81, 287.24, 600.83, 236.84, 584.7, 190.46, 580.66, 169.29, 531.27, 121.91, 472.8, 93.69, 420.38, 89.65, 340.74, 108.81, 295.37, 119.9, 263.11, 141.07, 233.88, 183.41, 213.72, 229.78, 200.61, 248.93]], 'area': 75864.53530000002, 'iscrowd': 0, 'image_id': 579321, 'bbox': [200.61, 89.65, 400.22, 251.02], 'category_id': 18, 'id': 3419}
+...
+...
+
+********************categories********************
+ Content Type: list
+ Total Length: 80
+ First 5 record:
+
+{'supercategory': 'person', 'id': 1, 'name': 'person'}
+{'supercategory': 'vehicle', 'id': 2, 'name': 'bicycle'}
+{'supercategory': 'vehicle', 'id': 3, 'name': 'car'}
+{'supercategory': 'vehicle', 'id': 4, 'name': 'motorcycle'}
+{'supercategory': 'vehicle', 'id': 5, 'name': 'airplane'}
+...
+...
+
+```
+
+#### 3.2.4 Result Description
+
+`instances_val2017.json` has 5 keys:
+
+```
+'info', 'licenses', 'images', 'annotations', 'categories'
+```
+Among them,
+
+- `info`: a dictionary with 6 key-value pairs; the output shows the first 5;
+- `licenses`: a list with 8 elements; the output shows the first 5;
+- `images`: a list with 5,000 elements; the output shows the first 5;
+- `annotations`: a list with 36,781 elements; the output shows the first 5;
+- `categories`: a list with 80 elements; the output shows the first 5.
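+
+If you want to verify this structure yourself, a few lines of standard-library Python are enough. The following is a minimal inspection sketch (independent of coco_tools; the path is assumed to match the commands above):
+
+```python
+import json
+
+with open('./annotations/instances_val2017.json', 'r') as f:
+    coco = json.load(f)
+
+for key, value in coco.items():
+    if isinstance(value, list):
+        print(f'{key}: list with {len(value)} elements')
+    else:  # 'info' is a dictionary
+        print(f'{key}: dict with {len(value)} key-value pairs')
+```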
+
+### 3.3 Collect Statistics on Image Information
+
+Using `json_ImgSta.py`, you can quickly extract the image information from `instances_val2017.json`, save it as a CSV table, and generate statistical graphs.
+
+#### 3.3.1 Command Demonstration
+
+Run the following command to collect statistics on the images recorded in `instances_val2017.json`:
+
+```
+python ./coco_tools/json_ImgSta.py \
+    --json_path=./annotations/instances_val2017.json \
+    --csv_path=./img_sta/images.csv \
+    --png_shape_path=./img_sta/images_shape.png \
+    --png_shapeRate_path=./img_sta/images_shapeRate.png
+```
+
+#### 3.3.2 Parameter Description
+
+| Parameter Name         | Description                                                                                                                  | Default Value |
+| ---------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ------------- |
+| `--json_path`          | Path of the json file whose statistics are to be collected.                                                                     |               |
+| `--csv_path`           | (Optional) Save path for the statistics table.                                                                                  | `None`        |
+| `--png_shape_path`     | (Optional) Save path for the png image. The image content is the two-dimensional distribution of all image shapes.              | `None`        |
+| `--png_shapeRate_path` | (Optional) Save path for the png image. The image content is the one-dimensional distribution of the shape ratio (width/height) of all images. | `None`        |
+| `--image_keyname`      | (Optional) Key corresponding to the images in the json file.                                                                    | `'images'`    |
+| `--Args_show`          | (Optional) Whether to print the input parameter information.                                                                    | `True`        |
+
+#### 3.3.3 Result Presentation
+
+After the preceding command is executed, the following information is displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+json_path = ./annotations/instances_val2017.json
+csv_path = ./img_sta/images.csv
+png_shape_path = ./img_sta/images_shape.png
+png_shapeRate_path = ./img_sta/images_shapeRate.png
+image_keyname = images
+Args_show = True
+
+json read...
+
+make dir: ./img_sta
+png save to ./img_sta/images_shape.png
+png save to ./img_sta/images_shapeRate.png
+csv save to ./img_sta/images.csv
+```
+
+Part of the table:
+
+
+|   | license | file_name        | coco_url                                               | height | width | date_captured       | flickr_url                                                     | id     | shape_rate |
+| --- | --------- | ------------------ | -------------------------------------------------------- | -------- | ------- | --------------------- | ---------------------------------------------------------------- | -------- | ------------ |
+| 0 | 4       | 000000397133.jpg | http://images.cocodataset.org/val2017/000000397133.jpg | 427    | 640   | 2013-11-14 17:02:52 | http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg | 397133 | 1.5        |
+| 1 | 1       | 000000037777.jpg | http://images.cocodataset.org/val2017/000000037777.jpg | 230    | 352   | 2013-11-14 20:55:31 | http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg | 37777  | 1.5        |
+| 2 | 4       | 000000252219.jpg | http://images.cocodataset.org/val2017/000000252219.jpg | 428    | 640   | 2013-11-14 22:32:02 | http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg | 252219 | 1.5        |
+| 3 | 1       | 000000087038.jpg | http://images.cocodataset.org/val2017/000000087038.jpg | 480    | 640   | 2013-11-14 23:11:37 | http://farm8.staticflickr.com/7355/8825114508_b0fa4d7168_z.jpg | 87038  | 1.3        |
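+
+Judging from the sample rows, the `shape_rate` column is the image width divided by its height, rounded to one decimal place (e.g. 640 / 427 ≈ 1.5 for image 397133). A minimal sketch of that computation, using only the standard library rather than the exact `json_ImgSta.py` code:
+
+```python
+import json
+
+with open('./annotations/instances_val2017.json', 'r') as f:
+    images = json.load(f)['images']
+
+for img in images[:4]:
+    # shape_rate as it appears in the csv table above
+    shape_rate = round(img['width'] / img['height'], 1)
+    print(img['file_name'], img['height'], img['width'], shape_rate)
+```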
+
+Contents of the saved images:
+
+Two-dimensional distribution of all image shapes:
+![image.png](./assets/1650011491220-image.png)
+
+One-dimensional distribution of shape ratio (width/height) of all images:
+![image.png](./assets/1650011634205-image.png)
+
+### 3.4 Collect Statistics on Object Detection Annotation Boxes
+
+Using `json_AnnoSta.py`, you can quickly extract the annotation information from `instances_val2017.json`, save it as a CSV table, and generate statistical graphs.
+
+
+#### 3.4.1 Command Demonstration
+
+Run the following command to collect statistics on the annotations in `instances_val2017.json`:
+
+```
+python ./coco_tools/json_AnnoSta.py \
+    --json_path=./annotations/instances_val2017.json \
+    --csv_path=./anno_sta/annos.csv \
+    --png_shape_path=./anno_sta/annos_shape.png \
+    --png_shapeRate_path=./anno_sta/annos_shapeRate.png \
+    --png_pos_path=./anno_sta/annos_pos.png \
+    --png_posEnd_path=./anno_sta/annos_posEnd.png \
+    --png_cat_path=./anno_sta/annos_cat.png \
+    --png_objNum_path=./anno_sta/annos_objNum.png \
+    --get_relative=True
+```
+
+#### 3.4.2 Parameter Description
+
+| Parameter Name         | Description                                                                                                                                          | Default Value   |
+| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------- |
+| `--json_path`          | Path of the json file whose statistics are to be collected.                                                                                             |                 |
+| `--csv_path`           | (Optional) Save path for the statistics table.                                                                                                          | `None`          |
+| `--png_shape_path`     | (Optional) Save path for the png image. The image content is the two-dimensional distribution of the shapes of all object detection boxes.              | `None`          |
+| `--png_shapeRate_path` | (Optional) Save path for the png image. The image content is the one-dimensional distribution of the shape ratio (width/height) of all object detection boxes. | `None`          |
+| `--png_pos_path`       | (Optional) Save path for the png image. The image content is the two-dimensional distribution of the upper left corner coordinates of all object detection boxes. | `None`          |
+| `--png_posEnd_path`    | (Optional) Save path for the png image. The image content is the two-dimensional distribution of the lower right corner coordinates of all object detection boxes. | `None`          |
+| `--png_cat_path`       | (Optional) Save path for the png image. The image content is the quantity distribution of objects in each category.                                     | `None`          |
+| `--png_objNum_path`    | (Optional) Save path for the png image. The image content is the distribution of the number of annotated objects in a single image.                     | `None`          |
+| `--get_relative`       | (Optional) Whether to also generate the relative versions of the statistics, i.e. box shapes and upper left/lower right corner coordinates divided by the image width and height. | `None`          |
+| `--image_keyname`      | (Optional) Key corresponding to the images in the json file.                                                                                            | `'images'`      |
+| `--anno_keyname`       | (Optional) Key corresponding to the annotations in the json file.                                                                                       | `'annotations'` |
+| `--Args_show`          | (Optional) Whether to print the input parameter information.                                                                                            | `True`          |
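+
+The relative values controlled by `--get_relative` are the box coordinates divided by the image width and height. A minimal sketch of that computation, assuming the standard COCO `bbox` layout (`[x, y, width, height]`, with `(x, y)` the upper left corner); this is illustrative, not the exact `json_AnnoSta.py` implementation:
+
+```python
+import json
+
+with open('./annotations/instances_val2017.json', 'r') as f:
+    coco = json.load(f)
+
+# Map image id -> (width, height) so each box can be normalized.
+img_size = {img['id']: (img['width'], img['height']) for img in coco['images']}
+
+for anno in coco['annotations'][:5]:
+    x, y, w, h = anno['bbox']
+    img_w, img_h = img_size[anno['image_id']]
+    upper_left_rel = (x / img_w, y / img_h)
+    lower_right_rel = ((x + w) / img_w, (y + h) / img_h)
+    print(upper_left_rel, lower_right_rel)
+```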
+
+#### 3.4.3 Result Presentation
+
+After the preceding command is executed, the following information is displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+json_path = ./annotations/instances_val2017.json
+csv_path = ./anno_sta/annos.csv
+png_shape_path = ./anno_sta/annos_shape.png
+png_shapeRate_path = ./anno_sta/annos_shapeRate.png
+png_pos_path = ./anno_sta/annos_pos.png
+png_posEnd_path = ./anno_sta/annos_posEnd.png
+png_cat_path = ./anno_sta/annos_cat.png
+png_objNum_path = ./anno_sta/annos_objNum.png
+get_relative = True
+image_keyname = images
+anno_keyname = annotations
+Args_show = True
+
+json read...
+
+make dir: ./anno_sta
+png save to ./anno_sta/annos_shape.png
+png save to ./anno_sta/annos_shape_Relative.png
+png save to ./anno_sta/annos_shapeRate.png
+png save to ./anno_sta/annos_pos.png
+png save to ./anno_sta/annos_pos_Relative.png
+png save to ./anno_sta/annos_posEnd.png
+png save to ./anno_sta/annos_posEnd_Relative.png
+png save to ./anno_sta/annos_cat.png
+png save to ./anno_sta/annos_objNum.png
+csv save to ./anno_sta/annos.csv
+```
+
+Part of the table:
+
+![image.png](./assets/1650025881244-image.png)
+
+Two-dimensional distribution of the shapes of all object detection boxes:
+
+![image.png](./assets/1650025909461-image.png)
+
+Two-dimensional distribution of relative proportions of all object detection box shapes in the image:
+
+![image.png](./assets/1650026052596-image.png)
+
+One-dimensional distribution of shape ratio (width/height) of all object detection boxes:
+
+![image.png](./assets/1650026072233-image.png)
+
+Two-dimensional distribution of coordinates in the upper left corner of all object detection boxes:
+
+![image.png](./assets/1650026247150-image.png)
+
+Two-dimensional distribution of relative proportional values of coordinates at the upper left corner of all object detection boxes:
+
+![image.png](./assets/1650026289987-image.png)
+
+Two-dimensional distribution of coordinates at the lower right corner of all object detection boxes:
+
+![image.png](./assets/1650026457254-image.png)
+
+Two-dimensional distribution of relative proportional values of coordinates at the lower right corner of all object detection boxes:
+
+![image.png](./assets/1650026487732-image.png)
+
+Distribution of the number of objects in each category:
+
+![image.png](./assets/1650026546304-image.png)
+
+Distribution of the number of annotated objects in a single image:
+
+![image.png](./assets/1650026559309-image.png)
+
+### 3.5 Collect Image Information and Generate json
+
+Using `json_Img2Json.py`, you can quickly collect information from the images in the `test2017` directory, with a training set json file as reference, and generate a json file for the test set.
+
+#### 3.5.1 Command Demonstration
+
+Run the following command to collect information from the images in `test2017` and generate the test set json file:
+
+```
+python ./coco_tools/json_Img2Json.py \
+    --test_image_path=./test2017 \
+    --json_train_path=./annotations/instances_val2017.json \
+    --json_test_path=./test.json
+```
+
+#### 3.5.2 Parameter Description
+
+
+| Parameter Name      | Description                                                       | Default Value        |
+| ------------------- | ----------------------------------------------------------------- | ------------ |
+| `--test_image_path` | Path of the image directory to collect statistics on              |               |
+| `--json_train_path` | Path of the training set json file used as reference              |               |
+| `--json_test_path`  | Save path for the generated test set json file                    |               |
+| `--image_keyname`   | (Optional) Key corresponding to the images in the json file       | `'images'`    |
+| `--cat_keyname`     | (Optional) Key corresponding to the categories in the json file   | `'categories'`|
+| `--Args_show`       | (Optional) Whether to print the input parameter information       | `True`        |
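+
+Conceptually, the script reads the size of every image in the test directory and copies the category list from the reference json. The following is a minimal sketch of that logic (assuming Pillow is installed; the field layout matches the output shown below, but this is not the script's exact code):
+
+```python
+import json
+import os
+
+from PIL import Image
+
+with open('./annotations/instances_val2017.json', 'r') as f:
+    ref = json.load(f)
+
+images = []
+for i, name in enumerate(sorted(os.listdir('./test2017'))):
+    with Image.open(os.path.join('./test2017', name)) as im:
+        width, height = im.size
+    images.append({'id': i, 'width': width, 'height': height, 'file_name': name})
+
+with open('./test.json', 'w') as f:
+    json.dump({'images': images, 'categories': ref['categories']}, f)
+```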
+
+#### 3.5.3 Result Presentation
+
+After the preceding command is executed, the following information is displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+test_image_path = ./test2017
+json_train_path = ./annotations/instances_val2017.json
+json_test_path = ./test.json
+Args_show = True
+
+----------------------------------------------Get Test----------------------------------------------
+
+json read...
+
+test image read...
+100%|█████████████████████████████████████| 40670/40670 [06:48<00:00, 99.62it/s]
+
+ total test image: 40670
+```
+
+The information of the generated json file is as follows:
+
+```
+------------------------------------------------Args------------------------------------------------
+json_path = ./test.json
+show_num = 5
+Args_show = True
+
+------------------------------------------------Info------------------------------------------------
+json read...
+json keys: dict_keys(['images', 'categories'])
+
+**********************images**********************
+ Content Type: list
+ Total Length: 40670
+ First 5 record:
+
+{'id': 0, 'width': 640, 'height': 427, 'file_name': '000000379269.jpg'}
+{'id': 1, 'width': 640, 'height': 360, 'file_name': '000000086462.jpg'}
+{'id': 2, 'width': 640, 'height': 427, 'file_name': '000000176710.jpg'}
+{'id': 3, 'width': 640, 'height': 426, 'file_name': '000000071106.jpg'}
+{'id': 4, 'width': 596, 'height': 640, 'file_name': '000000251918.jpg'}
+...
+...
+
+********************categories********************
+ Content Type: list
+ Total Length: 80
+ First 5 record:
+
+{'supercategory': 'person', 'id': 1, 'name': 'person'}
+{'supercategory': 'vehicle', 'id': 2, 'name': 'bicycle'}
+{'supercategory': 'vehicle', 'id': 3, 'name': 'car'}
+{'supercategory': 'vehicle', 'id': 4, 'name': 'motorcycle'}
+{'supercategory': 'vehicle', 'id': 5, 'name': 'airplane'}
+...
+...
+```
+
+### 3.6 json File Splitting
+
+Using `json_Split.py`, you can split `instances_val2017.json` into two subsets, e.g. a training set and a validation set.
+
+#### 3.6.1 Command Demonstration
+
+Run the following command to split the `instances_val2017.json` file:
+
+```
+python ./coco_tools/json_Split.py \
+    --json_all_path=./annotations/instances_val2017.json \
+    --json_train_path=./instances_val2017_train.json \
+    --json_val_path=./instances_val2017_val.json
+```
+
+#### 3.6.2 Parameter Description
+
+
+| Parameter Name       | Description                                                                                                    | Default Value  |
+| -------------------- | ----------------------------------------------------------------------------------------------------------------- | -------------- |
+| `--json_all_path`    | Path of the json file to be split                                                                                   |                |
+| `--json_train_path`  | Save path for the generated train set json file                                                                     |                |
+| `--json_val_path`    | Save path for the generated val set json file                                                                       |                |
+| `--val_split_rate`   | (Optional) Proportion of samples assigned to the val set during the split                                           | `0.1`          |
+| `--val_split_num`    | (Optional) Number of val set samples during the split; if set, `--val_split_rate` is ignored                        | `None`         |
+| `--keep_val_inTrain` | (Optional) Whether to also keep the val samples in the train set during the split                                   | `False`        |
+| `--image_keyname`    | (Optional) Key corresponding to the images in the json file                                                         | `'images'`     |
+| `--anno_keyname`     | (Optional) Key corresponding to the annotations in the json file                                                    | `'annotations'`|
+| `--Args_show`        | (Optional) Whether to print the input parameter information                                                         | `True`         |
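+
+The core of the split is partitioning the image list and assigning each annotation to the subset that contains its image. Below is a minimal sketch under the default settings (`val_split_rate=0.1`, `keep_val_inTrain=False`); here the last 10% of images are taken as the val set, while the actual script may select them differently:
+
+```python
+import json
+
+with open('./annotations/instances_val2017.json', 'r') as f:
+    coco = json.load(f)
+
+val_num = int(len(coco['images']) * 0.1)  # 5000 images -> 500 val images
+train_images, val_images = coco['images'][:-val_num], coco['images'][-val_num:]
+
+val_ids = {img['id'] for img in val_images}
+train_annos = [a for a in coco['annotations'] if a['image_id'] not in val_ids]
+val_annos = [a for a in coco['annotations'] if a['image_id'] in val_ids]
+
+for path, imgs, annos in [('./instances_val2017_train.json', train_images, train_annos),
+                          ('./instances_val2017_val.json', val_images, val_annos)]:
+    with open(path, 'w') as f:
+        json.dump(dict(coco, images=imgs, annotations=annos), f)
+```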
+
+#### 3.6.3 Result Presentation
+
+After the preceding command is executed, the following information is displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+json_all_path = ./annotations/instances_val2017.json
+json_train_path = ./instances_val2017_train.json
+json_val_path = ./instances_val2017_val.json
+val_split_rate = 0.1
+val_split_num = None
+keep_val_inTrain = False
+image_keyname = images
+anno_keyname = annotations
+Args_show = True
+
+-----------------------------------------------Split------------------------------------------------
+
+json read...
+
+image total 5000, train 4500, val 500
+anno total 36781, train 33119, val 3662
+```
+
+### 3.7 json File Merging
+
+Using `json_Merge.py`, you can merge two json files into one.
+
+#### 3.7.1 Command Demonstration
+
+Run the following command to merge `instances_train2017.json` and `instances_val2017.json`:
+
+```
+python ./coco_tools/json_Merge.py \
+    --json1_path=./annotations/instances_train2017.json \
+    --json2_path=./annotations/instances_val2017.json \
+    --save_path=./instances_trainval2017.json
+```
+
+#### 3.7.2 Parameter Description
+
+
+| Parameter Name | Description                                                   | Default Value               |
+| -------------- | ---------------------------------------------------------------- | --------------------------- |
+| `--json1_path` | Path of the first json file to merge                              |                             |
+| `--json2_path` | Path of the second json file to merge                             |                             |
+| `--save_path`  | Save path for the generated json file                             |                             |
+| `--merge_keys` | (Optional) Keys whose contents are merged                         | `['images', 'annotations']` |
+| `--Args_show`  | (Optional) Whether to print the input parameter information       | `True`                      |
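+
+As the log below suggests, the keys listed in `--merge_keys` have their lists concatenated, while the remaining keys (`info`, `licenses`, `categories`) are carried over unchanged. The following is a minimal sketch of that behavior, not the exact `json_Merge.py` implementation; note that plain concatenation assumes the two files use disjoint image and annotation ids, which holds for the official COCO train/val split:
+
+```python
+import json
+
+def merge_coco(json1_path, json2_path, save_path,
+               merge_keys=('images', 'annotations')):
+    with open(json1_path, 'r') as f:
+        data1 = json.load(f)
+    with open(json2_path, 'r') as f:
+        data2 = json.load(f)
+
+    merged = dict(data1)  # non-merged keys come from the first file
+    for key in merge_keys:
+        merged[key] = data1[key] + data2[key]
+
+    with open(save_path, 'w') as f:
+        json.dump(merged, f)
+
+merge_coco('./annotations/instances_train2017.json',
+           './annotations/instances_val2017.json',
+           './instances_trainval2017.json')
+```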
+
+#### 3.7.3 Result Presentation
+
+After the preceding command is executed, the following information is displayed:
+
+```
+------------------------------------------------Args------------------------------------------------
+json1_path = ./annotations/instances_train2017.json
+json2_path = ./annotations/instances_val2017.json
+save_path = ./instances_trainval2017.json
+merge_keys = ['images', 'annotations']
+Args_show = True
+
+-----------------------------------------------Merge------------------------------------------------
+
+json read...
+
+json merge...
+info
+licenses
+images merge!
+annotations merge!
+categories
+
+json save...
+
+finish!
+```

+ 19 - 0
docs/data/dataset_en.md

@@ -0,0 +1,19 @@
+# Remote Sensing Open Source Dataset
+
+PaddleRS has collected and summarized the most commonly used **open source** deep learning datasets in the field of remote sensing, providing the following information for each dataset: dataset description, image information, annotation information, source address, and AI Studio backup link. By task type, these datasets can be divided into **image classification, image segmentation, change detection, object detection, object tracking, multi-label classification, image generation**, and other types. Currently, the collected remote sensing datasets include:
+
+* 32 image classification datasets;
+* 40 object detection datasets;
+* 70 image segmentation datasets;
+* 28 change detection datasets;
+* 7 instance segmentation datasets;
+* 3 multi-label classification datasets;
+* 9 object tracking datasets;
+* 3 image caption datasets;
+* 8 image generation datasets.
+
+Visit [Remote Sensing Data Set Summary](./dataset_summary.md) for more information.
+
+## Specially Contributed Datasets
+
+* Sample data of typical urban roads in China in 2020 (CHN6-CUG), provided by Professor [Qiqi Zhu](http://grzy.cug.edu.cn/zhuqiqi) of China University of Geosciences. Please refer to [here](http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/content/1733.htm) for more details and download links.

+ 218 - 0
docs/data/dataset_summary_en.md

@@ -0,0 +1,218 @@
+| <span style="white-space:nowrap;">Index<br>(aistudio<br>link)</span> | <span style="white-space:nowrap;">&emsp;Dataset Name&emsp;<br>(source link)</span>| <span style="white-space:nowrap;">Task Type</span> | <span style="white-space:nowrap;">Image Size</span> | <span style="white-space:nowrap;">Image<br>Channels</span> | <span style="white-space:nowrap;">Image<br>Number</span> | <span style="white-space:nowrap;">Label<br>Category</span> | <span style="white-space:nowrap;">Image<br>Format</span> | <span style="white-space:nowrap;">Label<br>Format</span>  | <span style="white-space:nowrap;">Spatial<br>Resolution</span> | <span style="white-space:nowrap;">Spectral<br>Resolution&emsp;</span> | <span style="white-space:nowrap;">Image Type</span> | <span style="white-space:nowrap;">Image Source</span> | <span style="white-space:nowrap;">Release Time</span> | <span style="white-space:nowrap;">Release Agency</span> | <span style="white-space:nowrap;">Source Link</span> | <span style="white-space:nowrap;">Aistudio Link</span> |
+| ------------------------------------------------------------ | ------------------------------------------------------------ | ---------- | -------------------------------------- | --------- | -------- | ------ | -------- | --------------- | ------------------ | ------------ | ------------------ | ---------------------------------------------------- | -------- | --------------------------------------------------------- | ------------------------------------------------------------ | -------------------------------------------------------- |
+| [1-1](https://aistudio.baidu.com/aistudio/datasetdetail/51628) | [UCMerced   LandUse](http://weegee.vision.ucmerced.edu/datasets/landuse.html) | Image Classification   | 256 * 256                              | 3         | 2100     | 21     | tif      | folder name     | 0.3m               | __           | Satellite image           | USGS National Map                                    | 2010     | University of California, Merced                          | http://weegee.vision.ucmerced.edu/datasets/landuse.html      | https://aistudio.baidu.com/aistudio/datasetdetail/51628  |
+| [1-2](https://aistudio.baidu.com/aistudio/datasetdetail/51733) | [WHU RS19](http://captain.whu.edu.cn/repository.html)        | Image Classification   | 600 * 600                              | 3         | 1005     | 19     | jpg      | folder name     | max 0.5m           | __           | Satellite image           | GoogleEarth                                          | 2012     | Wuhan University                                                  | http://captain.whu.edu.cn/repository.html                    | https://aistudio.baidu.com/aistudio/datasetdetail/51733  |
+| [1-3](https://aistudio.baidu.com/aistudio/datasetdetail/52117) | [RSSCN7](https://sites.google.com/site/qinzoucn/documents)   | Image Classification   | 400 * 400                              | 3         | 2800     | 7      | jpg      | folder name     | __                 | __           | Satellite image           | GoogleEarth                                          | 2015     | Wuhan University                                                  | https://sites.google.com/site/qinzoucn/documents             | https://aistudio.baidu.com/aistudio/datasetdetail/52117  |
+| [1-4](https://aistudio.baidu.com/aistudio/datasetdetail/52227) | [RS_C11](https://www.researchgate.net/publication/271647282_RS_C11_Database/comments) | Image Classification   | 512 * 512                              | 3         | 1232     | 11     | tif      | folder name     | 0.2m               | __           | Satellite image           | GoogleEarth                                          | 2016     | Chinese Academy of Sciences                                                    | https://www.researchgate.net/publication/271647282_RS_C11_Database/comments | https://aistudio.baidu.com/aistudio/datasetdetail/52227  |
+| [1-5](https://aistudio.baidu.com/aistudio/datasetdetail/51873) | [NWPU RESISC45](https://gcheng-nwpu.github.io/#Datasets)     | Image Classification   | 256 * 256                              | 3         | 31500    | 45     | jpg      | folder name     | 0.2~ 30m           | __           | Satellite image           | GoogleEarth                                          | 2016     | Northwestern Polytechnical University                                              | https://gcheng-nwpu.github.io/#Datasets                      | https://aistudio.baidu.com/aistudio/datasetdetail/51873  |
+| [1-6](https://aistudio.baidu.com/aistudio/datasetdetail/52025) | [AID](https://captain-whu.github.io/AID/)                    | Image Classification   | 600 * 600                              | 3         | 10000    | 30     | jpg      | folder name     | 0.5~ 8m            | __           | Satellite image           | GoogleEarth                                          | 2017     | Wuhan University                                                  | https://captain-whu.github.io/AID/                           | https://aistudio.baidu.com/aistudio/datasetdetail/52025  |
+| [1-7](https://aistudio.baidu.com/aistudio/datasetdetail/52359) | [RSD46 WHU](https://github.com/RSIA-LIESMARS-WHU/RSD46-WHU)  | Image Classification   | 256 * 256                              | 3         | 117000   | 46     | png      | folder name     | 0.5~ 2m            | __           | Satellite image           | GoogleEarth, Tianditu                                | 2017     | Wuhan University                                                  | https://github.com/RSIA-LIESMARS-WHU/RSD46-WHU               | https://aistudio.baidu.com/aistudio/datasetdetail/52359  |
+| [1-8](https://aistudio.baidu.com/aistudio/datasetdetail/55324) | [GID](https://x-ytong.github.io/project/GID.html)            | Image Classification   | 56 * 56                                | 3, 4      | 30000    | 15     | tif      | folder name     | __                 | __           | Satellite image           | GF2                                                  | 2018     | Wuhan University                                                  | https://x-ytong.github.io/project/GID.html                   | https://aistudio.baidu.com/aistudio/datasetdetail/55324  |
+| [1-9](https://aistudio.baidu.com/aistudio/datasetdetail/52411) | [PatternNet](https://sites.google.com/view/zhouwx/dataset#h.p_Tgef10WTuEFr) | Image Classification   | 256 * 256                              | 3         | 30400    | 38     | jpg      | folder name     | 0.062~ 4.693m      | __           | Satellite image           | GoogleMap                                            | 2018     | Wuhan University                                                  | https://sites.google.com/view/zhouwx/dataset#h.p_Tgef10WTuEFr | https://aistudio.baidu.com/aistudio/datasetdetail/52411  |
+| [1-10](https://aistudio.baidu.com/aistudio/datasetdetail/88155) | [Satellite   Images of Hurricane Damage](https://ieee-dataport.org/open-access/detecting-damaged-buildings-post-hurricane-satellite-imagery-based-customized) | Image Classification   | 128 * 128                              | 3         | 23000    | 2      | jpeg     | folder name     | __                 | __           | Satellite image           | GeoEye1 etc                                            | 2018     | University of Washington                                  | https://ieee-dataport.org/open-access/detecting-damaged-buildings-post-hurricane-satellite-imagery-based-customized | https://aistudio.baidu.com/aistudio/datasetdetail/88155  |
+| [1-11](https://aistudio.baidu.com/aistudio/datasetdetail/88597) | [How to make   high resolution remote sensing image dataset](https://blog.csdn.net/u012193416/article/details/79472533) | Image Classification   | 256 * 256                              | 3         | 533      | 5      | jpg      | folder name     | 2.38866m           | __           | Satellite image           | GoogleEarth                                          | 2018     | __                                                        | https://blog.csdn.net/u012193416/article/details/79472533    | https://aistudio.baidu.com/aistudio/datasetdetail/88597  |
+| [1-12](https://aistudio.baidu.com/aistudio/datasetdetail/52650) | [EuroSAT](https://github.com/phelber/eurosat)                | Image Classification   | 64 * 64                                | 3, 13     | 27000    | 10     | jpg, tif | folder name     | 10m                | __           | Satellite image           | Sentinel2                                            | 2018     | University of Kaiserslautern, Germany                                        | https://github.com/phelber/eurosat                           | https://aistudio.baidu.com/aistudio/datasetdetail/52650  |
+| [1-13](https://aistudio.baidu.com/aistudio/datasetdetail/135146) | [HistAerial   Dataset](http://eidolon.univ-lyon2.fr/~remi1/HistAerialDataset/) | Image Classification   | 25 * 25, 50 * 50 , 100 * 100           | 1         | 42000    | 7      | png      | folder name     | __                 | __           | Aerial image           | Aerial image                                             | 2019     | Univ Lyon                                                 | http://eidolon.univ-lyon2.fr/~remi1/HistAerialDataset/       | https://aistudio.baidu.com/aistudio/datasetdetail/135146 |
+| [1-14](https://aistudio.baidu.com/aistudio/datasetdetail/51798) | [OPTIMAL 31](http://crabwq.github.io/)                       | Image Classification   | 256 * 256                              | 3         | 1860     | 31     | jpg      | folder name     | __                 | __           | Satellite image           | GoogleEarth                                          | 2019     | Northwestern Polytechnical University                                              | http://crabwq.github.io/                                     | https://aistudio.baidu.com/aistudio/datasetdetail/51798  |
+| [1-15](https://aistudio.baidu.com/aistudio/datasetdetail/76927) | [WiDSDatathon2019](https://www.kaggle.com/c/widsdatathon2019/data) | Image Classification   | 256 * 256                              | 3         | 11000    | 2      | jpg      | csv             | 3m                 | __           | Satellite image           | Planet                                               | 2019     | Stanford                                                  | https://www.kaggle.com/c/widsdatathon2019/data               | https://aistudio.baidu.com/aistudio/datasetdetail/76927  |
+| [1-16](https://aistudio.baidu.com/aistudio/datasetdetail/76417) | [CLRS](https://github.com/lehaifeng/CLRS)                    | Image Classification   | 256 * 256                              | 3         | 15000    | 25     | tif      | folder name     | 0.26~ 8.85m        | __           | Satellite image           | GoogleEarth, BingMap, GoogleMap, Tianditu            | 2020     | Central South University                                                  | https://github.com/lehaifeng/CLRS                            | https://aistudio.baidu.com/aistudio/datasetdetail/76417  |
+| [1-17](https://aistudio.baidu.com/aistudio/datasetdetail/52728) | [SenseEarth   Classify](https://rs.sensetime.com/competition/index.html#/info) | Image Classification   | 100 * 100~12655 * 12655                | 3         | 70000    | 51     | jpg      | txt             | 0.2~ 153m          | __           | Satellite image           | GoogleEarth                                          | 2020     | Sensetime                                                  | https://rs.sensetime.com/competition/index.html#/info        | https://aistudio.baidu.com/aistudio/datasetdetail/52728  |
+| [1-18](https://aistudio.baidu.com/aistudio/datasetdetail/86229) | [TG1HRSSC](http://www.msadc.cn/main/setsubDetail?id=1369487569196158978) | Image Classification   | 512 * 512                              | 1, 54, 52 | 204      | 9      | tif      | folder name     | 5m, 10m, 20m       | 0.4~ 2.5μm   | Satellite image           | Tiangong-1                                             | 2021     | Engineering and Technology Center for Space Applications, Chinese Academy of Sciences                          | http://www.msadc.cn/main/setsubDetail?id=1369487569196158978 | https://aistudio.baidu.com/aistudio/datasetdetail/86229  |
+| [1-19](https://aistudio.baidu.com/aistudio/datasetdetail/86451) | [NaSC TG2](http://www.msadc.cn/main/setsubDetail?id=1370312964720037889) | Image Classification   | 128 * 128                              | 3, 14     | 20000    | 10     | jpg, tif | folder name     | __                 | 0.40~ 1.04µm | Satellite image           | Tiangong-2                                             | 2021     | Engineering and Technology Center for Space Applications, Chinese Academy of Sciences                          | http://www.msadc.cn/main/setsubDetail?id=1370312964720037889 | https://aistudio.baidu.com/aistudio/datasetdetail/86451  |
+| [1-20](https://aistudio.baidu.com/aistudio/datasetdetail/139361) | [S2UC Dataset](https://www.scidb.cn/en/detail?dataSetId=14e27d8c51ec40079b84591e9bb24df6) | Image Classification   | 224 * 224                              | 3         | 1714     | 2      | npy      | npy             | __                 | __           | Satellite image           | GoogleEarth                                          | 2021     | __                                                        | https://www.scidb.cn/en/detail?dataSetId=14e27d8c51ec40079b84591e9bb24df6 | https://aistudio.baidu.com/aistudio/datasetdetail/139361 |
+| [1-21](https://aistudio.baidu.com/aistudio/datasetdetail/52534) | [SAT 4](http://csc.lsu.edu/~saikat/deepsat/)                 | Image Classification   | 28 * 28                                | 4         | 500000   | 4      | mat      | mat             | 1~ 6m              | __           | Satellite image           | NAIPdataset                                          | 2015     | Louisiana State University                                        | http://csc.lsu.edu/~saikat/deepsat/                          | https://aistudio.baidu.com/aistudio/datasetdetail/52534  |
+| [1-22](https://aistudio.baidu.com/aistudio/datasetdetail/52534) | [SAT 6](http://csc.lsu.edu/~saikat/deepsat/)                 | Image Classification   | 28 * 28                                | 4         | 405000   | 6      | mat      | mat             | 1~ 6m              | __           | Satellite image           | NAIPdataset                                          | 2015     | Louisiana State University                                        | http://csc.lsu.edu/~saikat/deepsat/                          | https://aistudio.baidu.com/aistudio/datasetdetail/52534  |
+| [1-23](https://aistudio.baidu.com/aistudio/datasetdetail/51921) | [SIRI WHU   google](http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html) | Image Classification   | 200 * 200                              | 3         | 3400     | 12     | tif, dat | folder name     | 2m                 | __           | Satellite image           | GoogleEarth                                          | 2016     | Wuhan University                                                  | http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html | https://aistudio.baidu.com/aistudio/datasetdetail/51921  |
+| [1-24](https://aistudio.baidu.com/aistudio/datasetdetail/51921) | [SIRI WHU USGS](http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html) | Image Classification   | 10000 * 9000                           | 3         | 1        | 4      | tif      | folder name     | 2foot              | __           | Satellite image           | USGS                                                 | 2016     | Wuhan University                                                  | http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/e-code.html | https://aistudio.baidu.com/aistudio/datasetdetail/51921  |
+| [1-25](https://aistudio.baidu.com/aistudio/datasetdetail/52487) | [RSI CB128](https://github.com/lehaifeng/RSI-CB)             | Image Classification   | 256 * 256                              | 3         | 36000    | 45     | tif      | folder name     | 0.3~ 3m            | __           | Satellite image           | GoogleEarth, BingMap                                 | 2017     | Central South University                                                  | https://github.com/lehaifeng/RSI-CB                          | https://aistudio.baidu.com/aistudio/datasetdetail/52487  |
+| [1-26](https://aistudio.baidu.com/aistudio/datasetdetail/52487) | [RSI CB256](https://github.com/lehaifeng/RSI-CB)             | Image Classification   | 256 * 256                              | 3         | 24000    | 35     | tif      | folder name     | 0.3~ 3m            | __           | Satellite image           | GoogleEarth, BingMap                                 | 2017     | Central South University                                                  | https://github.com/lehaifeng/RSI-CB                          | https://aistudio.baidu.com/aistudio/datasetdetail/52487  |
+| [1-27](https://aistudio.baidu.com/aistudio/datasetdetail/58013) | [Multi View   Datasets CV BrCT](http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/) | Image Classification   | 500 * 475                              | 3         | 48342    | 8      | png, tif | folder name     | __                 | __           | Aerial image, Satellite image | Aerial image, Satellite image                                   | 2020     | Federal University of Minas Gerais                        | http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/ | https://aistudio.baidu.com/aistudio/datasetdetail/58013  |
+| [1-28](https://aistudio.baidu.com/aistudio/datasetdetail/58760) | [Multi View   Datasets AiRound](http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/) | Image Classification   | 500 * 475                              | 3         | 13980    | 11     | tif      | folder name     | __                 | __           | Satellite image           | Sentinel2 etc                                          | 2020     | Federal University of Minas Gerais                        | http://www.patreo.dcc.ufmg.br/2020/07/22/multi-view-datasets/ | https://aistudio.baidu.com/aistudio/datasetdetail/58760  |
+| 1-29                                                         | [braziliancoffeescenes](http://patreo.dcc.ufmg.br/2017/11/12/brazilian-coffee-scenes-dataset) | Image Classification   | 64 * 64                                | 3         | 2876     | __     | __       | __              | __                 | __           | Satellite image           | SPOT sensor                                          | 2015     | Federal University of Minas Gerais                                            | http://patreo.dcc.ufmg.br/2017/11/12/brazilian-coffee-scenes-dataset |                                                          |
+| 1-30                                                         | [Planet:   Understanding the Amazon from Space](https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data) | Image Classification   | 256 * 256                              | 3, 4      | 40480    | __     | __       | __              | 3m                 | __           | Satellite image           | Planet sensor                                        | 2017     | Planet                                                    | https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data |                                                          |
+| 1-31                                                         | [rscupcls](http://rscup.bjxintong.com.cn/#/theme/1)          | Image Classification   | 512 * 512                              | 3         | 197121   | 45     | jpg      | folder name     | __                 | __           | __                 | __                                                   | 2019     | __                                                        | http://rscup.bjxintong.com.cn/#/theme/1                      |                                                          |
+| [1-32](https://aistudio.baidu.com/aistudio/datasetdetail/78849) | [MSTAR](https://www.kaggle.com/atreyamajumdar/mstar-dataset-8-classes) | Image Classification   | 368 * 368                              | 1         | 9466     | 8      | jpg      | folder name     | 0.3m               | __           | SAR                | STARLOS SAR                                          | 1996     | Defense Advanced Research Projects Agency                 | https://www.kaggle.com/atreyamajumdar/mstar-dataset-8-classes | https://aistudio.baidu.com/aistudio/datasetdetail/78849  |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [2-1](https://aistudio.baidu.com/aistudio/datasetdetail/53806) | [TAS](http://ai.stanford.edu/~gaheitz/Research/TAS/)         | Object Detection   | 792 * 636                              | 3         | 30       | 1      | .m       | HBB             | __                 | __           | Satellite image           | GoogleEarth                                          | 2008     | Stanford University                                                | http://ai.stanford.edu/~gaheitz/Research/TAS/                | https://aistudio.baidu.com/aistudio/datasetdetail/53806  |
+| [2-2](https://aistudio.baidu.com/aistudio/datasetdetail/53461) | [OIRDS](https://sourceforge.net/projects/oirds/)             | Object Detection   | 256 * 256~640 * 640                    | 3         | 900      | 5      | tif      | OBB             | 0.15m              | __           | Satellite image           | USGS, DARPA, VIVID                                   | 2009     | Raytheon Company                                                  | https://sourceforge.net/projects/oirds/                      | https://aistudio.baidu.com/aistudio/datasetdetail/53461  |
+| [2-3](https://aistudio.baidu.com/aistudio/datasetdetail/77674) | [SZTAKI INRIA   Building Detection Benchmark](http://web.eee.sztaki.hu/remotesensing/building_benchmark.html) | Object Detection   | 700± * 700±                            | 3         | 9        | 1      | jpg      | OBB             | __                 | __           | Aerial image, Satellite image | Aerial image, Satellite image                                   | 2012     | MTA SZTAKI                                                 | http://web.eee.sztaki.hu/remotesensing/building_benchmark.html | https://aistudio.baidu.com/aistudio/datasetdetail/77674  |
+| [2-4](https://aistudio.baidu.com/aistudio/datasetdetail/53318) | [UCAS_AOD](https://onedrive.hyper.ai/home/UCAS-AOD)          | Object Detection   | 1000± * 1000±                          | 3         | 976      | 2      | png      | OBB             | __                 | __           | Aerial image           | GoogleEarth                                          | 2014     | Chinese Academy of Sciences                                                    | https://onedrive.hyper.ai/home/UCAS-AOD                      | https://aistudio.baidu.com/aistudio/datasetdetail/53318  |
+| [2-5](https://aistudio.baidu.com/aistudio/datasetdetail/52812) | [NWPUVHR 10](https://gcheng-nwpu.github.io/#Datasets)        | Object Detection   | 500 * 500~1100 * 1100                  | 3         | 1510     | 10     | jpg      | HBB             | 0.08~ 2m           | __           | Satellite image           | GoogleEarth, Vaihingen                               | 2014     | Northwestern Polytechnical University                                              | https://gcheng-nwpu.github.io/#Datasets                      | https://aistudio.baidu.com/aistudio/datasetdetail/52812  |
+| [2-6](https://aistudio.baidu.com/aistudio/datasetdetail/53383) | [VEDAI](https://downloads.greyc.fr/vedai/)                   | Object Detection   | 512 * 512~1024 * 1024                  | 4         | 1210     | 9      | png      | OBB             | 0.125m             | __           | Satellite image           | UtahAGRC                                             | 2015     | University of Caen                                                 | https://downloads.greyc.fr/vedai/                            | https://aistudio.baidu.com/aistudio/datasetdetail/53383  |
+| [2-7](https://aistudio.baidu.com/aistudio/datasetdetail/54106) | [HRSC2016](https://sites.google.com/site/hrsc2016/)          | Object Detection   | 1100± * 1100±                          | 3         | 1061     | 27     | png      | OBB             | __                 | __           | Satellite image           | GoogleEarth                                          | 2016     | Northwestern Polytechnical University                                              | https://sites.google.com/site/hrsc2016/                      | https://aistudio.baidu.com/aistudio/datasetdetail/54106  |
+| [2-8](https://aistudio.baidu.com/aistudio/datasetdetail/54185) | [DLR3k](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52777) | Object Detection   | 5616 * 3744                            | 3         | 14235    | 7      | jpg      | OBB             | 0.13m              | __           | Aerial image           | Aerial image (CanonEos1DsMarkIII)                        | 2016     | German Aerospace Center                                          | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52777 | https://aistudio.baidu.com/aistudio/datasetdetail/54185  |
+| [2-9](https://aistudio.baidu.com/aistudio/datasetdetail/52980) | [RSOD](https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset-)   | Object Detection   | 1000± * 1000±                          | 3         | 976      | 4      | jpg      | HBB             | 0.3~ 3m            | __           | Satellite image           | GoogleEarth, Tianditu                                | 2017     | Wuhan University                                                  | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset-           | https://aistudio.baidu.com/aistudio/datasetdetail/52980  |
+| [2-10](https://aistudio.baidu.com/aistudio/datasetdetail/53186) | [TGRS HRRSD](https://github.com/CrazyStoneonRoad/TGRS-HRRSD-Dataset) | Object Detection   | 152 * 152~10569 * 10569                | 3         | 21761    | 13     | jpg      | HBB             | 0.15~ 1.2m         | __           | Satellite image           | GoogleEarth, BaiduMap                                | 2017     | Chinese Academy of Sciences                                                    | https://github.com/CrazyStoneonRoad/TGRS-HRRSD-Dataset       | https://aistudio.baidu.com/aistudio/datasetdetail/53186  |
+| [2-11](https://aistudio.baidu.com/aistudio/datasetdetail/77233) | [ships in   satellite imagery](https://www.kaggle.com/rhammell/ships-in-satellite-imagery) | Object Detection   | 80 * 80                                | 3         | 4000     | 1      | png      | HBB             | 3m                 | __           | Satellite image           | Planet                                               | 2017     | PlanetTeam                                                | https://www.kaggle.com/rhammell/ships-in-satellite-imagery   | https://aistudio.baidu.com/aistudio/datasetdetail/77233  |
+| [2-12](https://aistudio.baidu.com/aistudio/datasetdetail/53622) | [xView](https://challenge.xviewdataset.org/data-download)    | Object Detection   | 3000± * 3000±                          | 3         | 1129     | 60     | tif      | HBB             | 0.3m               | __           | Satellite image           | WorldView3                                           | 2018     | DIUx                                                      | https://challenge.xviewdataset.org/data-download             | https://aistudio.baidu.com/aistudio/datasetdetail/53622  |
+| [2-13](https://aistudio.baidu.com/aistudio/datasetdetail/53714) | [LEVIR](http://levir.buaa.edu.cn/Code.htm)                   | Object Detection   | 800 * 600                              | 3         | 22000    | 3      | jpg      | HBB             | 0.2~ 1.0m          | __           | Satellite image           | GoogleEarth                                          | 2018     | Beijing University of Aeronautics and Astronautics                                          | http://levir.buaa.edu.cn/Code.htm                            | https://aistudio.baidu.com/aistudio/datasetdetail/53714  |
+| [2-14](https://aistudio.baidu.com/aistudio/datasetdetail/53895) | [MASATI](https://www.iuii.ua.es/datasets/masati/)            | Object Detection   | 512± * 512±                            | 3         | 7389     | 7      | png      | HBB             | 0.08~ 2m           | __           | Satellite image           | BingMaps                                             | 2018     | University of Alicante                                              | https://www.iuii.ua.es/datasets/masati/                      | https://aistudio.baidu.com/aistudio/datasetdetail/53895  |
+| [2-15](https://aistudio.baidu.com/aistudio/datasetdetail/54674) | [ITCVD](https://research.utwente.nl/en/datasets/itcvd-dataset) | Object Detection   | 5616 * 3744                            | 3         | 135      | 1      | jpg      | HBB             | 0.1m               | __           | Aerial image           | Aerial image                                             | 2018     | University of Twente Research Information                 | https://research.utwente.nl/en/datasets/itcvd-dataset        | https://aistudio.baidu.com/aistudio/datasetdetail/54674  |
+| [2-16](https://aistudio.baidu.com/aistudio/datasetdetail/53045) | [DIOR](http://www.escience.cn/people/JunweiHan/DIOR.html)    | Object Detection   | 800 * 800                              | 3         | 23463    | 20     | jpg      | HBB, OBB        | 0.5~ 30m           | __           | Satellite image           | GoogleEarth                                          | 2019     | Northwestern Polytechnical University                                              | http://www.escience.cn/people/JunweiHan/DIOR.html            | https://aistudio.baidu.com/aistudio/datasetdetail/53045  |
+| [2-17](https://aistudio.baidu.com/aistudio/datasetdetail/73458) | [iSAID](https://captain-whu.github.io/iSAID/index.html)      | Object Detection   | 800 * 800~13000 * 13000                | 3         | 2806     | 15     | png      | OBB             | __                 | __           | Satellite image           | GoogleEarth, JL1, GF2                                | 2019     | Wuhan University                                                  | https://captain-whu.github.io/iSAID/index.html               | https://aistudio.baidu.com/aistudio/datasetdetail/73458  |
+| [2-18](https://aistudio.baidu.com/aistudio/datasetdetail/57457) | [Bridge   Dataset](http://www.patreo.dcc.ufmg.br/2019/07/10/bridge-dataset/) | Object Detection   | 4800 * 2843                            | 3         | 500      | 1      | jpg      | HBB             | 0.5m               | __           | Satellite image           | GoogleEarth, OpenStreetMap                           | 2019     | Federal University of Minas Gerais                        | http://www.patreo.dcc.ufmg.br/2019/07/10/bridge-dataset/     | https://aistudio.baidu.com/aistudio/datasetdetail/57457  |
+| [2-19](https://aistudio.baidu.com/aistudio/datasetdetail/70215) | [RarePlanes](https://www.cosmiqworks.org/RarePlanes/)        | Object Detection   | 512 * 512                              | 3         | 1507     | 10     | png      | HBB             | 0.3~ 1.5m          | __           | Satellite image           | WorldView3                                           | 2020     | In-Q-Tel                                                  | https://www.cosmiqworks.org/RarePlanes/                      | https://aistudio.baidu.com/aistudio/datasetdetail/70215  |
+| [2-20](https://aistudio.baidu.com/aistudio/datasetdetail/131179) | [Aircraft Target Recognition - Training Data Set](https://www.rsaicp.com/portal/dataDetail?id=34) | Object Detection   | 4096 * 4096                            | 3         | 430      | 11     | png      | OBB             | 0.5~ 1m            | __           | Satellite image           | Domestically developed Chinese satellite series                                 | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=34               | https://aistudio.baidu.com/aistudio/datasetdetail/131179 |
+| [2-21](https://aistudio.baidu.com/aistudio/datasetdetail/134218) | [Intelligent Ship Detection - Training Data Set](https://www.rsaicp.com/portal/dataDetail?id=35) | Object Detection   | 20000 * 20000                          | 3         | 25       | __     | png      | OBB             | 5m                 | __           | Satellite image           | Domestically developed Chinese satellite series                                 | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=35               | https://aistudio.baidu.com/aistudio/datasetdetail/134218 |
+| [2-22](https://aistudio.baidu.com/aistudio/datasetdetail/78453) | [FAIR1M](http://gaofen-challenge.com)                        | Object Detection   | 1000 * 1000~10000 * 10000              | 3         | 15000    | 37     | tif      | OBB             | 0.3~ 0.8m          | __           | Satellite image           | GF, GoogleEarth                                      | 2021     | Chinese Academy of Sciences                                                    | http://gaofen-challenge.com                                  | https://aistudio.baidu.com/aistudio/datasetdetail/78453  |
+| [2-23](https://aistudio.baidu.com/aistudio/datasetdetail/137865) | [VISO-Detection](https://satvideodt.github.io/)              | Object Detection   | 1000 * 1000                            | 3         | 32825    | 4      | jpg      | HBB             | 0.5~ 1.1m          | __           | Satellite image           | JL1                                                  | 2021     | National University of Defense Technology                                              | https://satvideodt.github.io/                                | https://aistudio.baidu.com/aistudio/datasetdetail/137865 |
+| [2-24](https://aistudio.baidu.com/aistudio/datasetdetail/54054) | [VisDrone2019 DET](https://github.com/VisDrone/VisDrone-Dataset) | Object Detection   | 1000± * 1000±                          | 3         | 10209    | 10     | jpg      | HBB             | __                 | __           | Aerial image           | Aerial image                                             | 2018     | Tianjin University                                                  | https://github.com/VisDrone/VisDrone-Dataset                 | https://aistudio.baidu.com/aistudio/datasetdetail/54054  |
+| [2-25](https://aistudio.baidu.com/aistudio/datasetdetail/137265) | [VisDrone2019-VID](https://github.com/VisDrone/VisDrone-Dataset) | Object Detection   | 1000± * 1000±                          | 3         | 40000+   | 10     | jpg      | HBB             | __                 | __           | Aerial image           | Aerial image                                             | 2018     | Tianjin University                                                  | https://github.com/VisDrone/VisDrone-Dataset                 | https://aistudio.baidu.com/aistudio/datasetdetail/137265 |
+| [2-26](https://aistudio.baidu.com/aistudio/datasetdetail/137691) | [DroneCrowd](https://github.com/VisDrone/VisDrone-Dataset) | Object Detection | 640 * 512 | 3 | 1807 | 1 | jpg | __ | __ | __ | Aerial image | Aerial image | 2020 | Tianjin University | https://github.com/VisDrone/VisDrone-Dataset | https://aistudio.baidu.com/aistudio/datasetdetail/137691 |
+| [2-27](https://aistudio.baidu.com/aistudio/datasetdetail/53125) | [DOTA1.0](https://captain-whu.github.io/DOTA/index.html) | Object Detection | 800 * 800~4000 * 4000 | 3 | 2806 | 15 | png | OBB | __ | __ | Satellite image | GoogleEarth, GF2, JL1 | 2018 | Wuhan University | https://captain-whu.github.io/DOTA/index.html | https://aistudio.baidu.com/aistudio/datasetdetail/53125 |
+| [2-28](https://aistudio.baidu.com/aistudio/datasetdetail/53125) | [DOTA1.5](https://captain-whu.github.io/DOTA/index.html)     | Object Detection   | 800 * 800~4000 * 4000                  | 3         | 2806     | 16     | png      | OBB             | __                 | __           | Satellite image           | GoogleEarth, GF2, JL1                                | 2019     | Wuhan University                                                  | https://captain-whu.github.io/DOTA/index.html                | https://aistudio.baidu.com/aistudio/datasetdetail/53125  |
+| [2-29](https://aistudio.baidu.com/aistudio/datasetdetail/53125) | [DOTA2.0](https://captain-whu.github.io/DOTA/index.html) | Object Detection | 800 * 800~20000 * 20000 | 3 | 11268 | 18 | png | OBB | __ | __ | Satellite image, Aerial image | GoogleEarth, GF2, Aerial image | 2021 | Wuhan University | https://captain-whu.github.io/DOTA/index.html | https://aistudio.baidu.com/aistudio/datasetdetail/53125 |
+| 2-30 | [COWC](https://gdo152.llnl.gov/cowc/) | Object Detection | 2000 * 2000~19000 * 19000 | __ | 53 | __ | __ | Point | 0.15m | __ | Satellite image | Utah | 2016 | Lawrence Livermore National Laboratory | https://gdo152.llnl.gov/cowc/ |  |
+| 2-31 | [Functional Map of the World Challenge](https://github.com/fMoW/dataset) | Object Detection | __ | 4, 8 | 31 | __ | __ | Point | __ | __ | Satellite image | DigitalGlobe | 2016 | Johns Hopkins University Applied Physics Laboratory | https://github.com/fMoW/dataset |  |
+| 2-32 | [CARPK](https://lafi.github.io/LPN/) | Object Detection | __ | __ | 1573 | __ | png | __ | __ | __ | Aerial image | Aerial image | 2017 | National Taiwan University | https://lafi.github.io/LPN/ |  |
+| 2-33 | [MAFAT Challenge](https://competitions.codalab.org/competitions/19854) | Object Detection | __ | __ | 4216 | __ | __ | HBB | 0.05~0.15m | __ | Aerial image | Aerial image | 2018 | yuvalsh | https://competitions.codalab.org/competitions/19854 |  |
+| 2-34                                                         | [rscupdet](http://rscup.bjxintong.com.cn/#/theme/2)          | Object Detection   | 1024 * 1024                            | 3         | 2423     | __     | png      | OBB             | __                 | __           | Satellite image           | __                                                   | 2019     | __                                                        | http://rscup.bjxintong.com.cn/#/theme/2                      |                                                          |
+| [2-35](https://aistudio.baidu.com/aistudio/datasetdetail/54806) | [SSDD](https://zhuanlan.zhihu.com/p/58404659) | Object Detection | 500 * 500 | 1 | 1160 | 1 | jpg | HBB, OBB | 1~15m | __ | SAR | RadarSat2, TerraSARX, Sentinel1 | 2017 | Naval Aeronautical University | https://zhuanlan.zhihu.com/p/58404659 | https://aistudio.baidu.com/aistudio/datasetdetail/54806 |
+| [2-36](https://aistudio.baidu.com/aistudio/datasetdetail/77017) | [OpenSARShip](http://opensar.sjtu.edu.cn/)                   | Object Detection   | 1200 * 900                             | 1         | 41       | 1      | tif      | Chip            | __                 | __           | SAR                | Sentinel1                                            | 2017     | Shanghai Jiao Tong University                                             | http://opensar.sjtu.edu.cn/                                  | https://aistudio.baidu.com/aistudio/datasetdetail/77017  |
+| [2-37](https://aistudio.baidu.com/aistudio/datasetdetail/54512) | [HRSID](https://github.com/chaozhong2010/HRSID) | Object Detection | 800 * 800 | 1 | 5604 | 1 | png | HBB | 0.5~3m | __ | SAR | Sentinel1B, TerraSARX, TEMX | 2020 | University of Electronic Science and Technology of China | https://github.com/chaozhong2010/HRSID | https://aistudio.baidu.com/aistudio/datasetdetail/54512 |
+| [2-38](https://aistudio.baidu.com/aistudio/datasetdetail/54270) | [AIR-SARShip-1.0](https://radars.ac.cn/web/data/getData?newsColumnId=abd5c1b2-fe65-47f7-8ebf-990273a91a48) | Object Detection | 3000 * 3000 | 1 | 31 | 1 | tiff | HBB | 1~3m | __ | SAR | GF3 | 2019 | Journal of Radars | https://radars.ac.cn/web/data/getData?newsColumnId=abd5c1b2-fe65-47f7-8ebf-990273a91a48 | https://aistudio.baidu.com/aistudio/datasetdetail/54270 |
+| [2-39](https://aistudio.baidu.com/aistudio/datasetdetail/54270) | [AIR-SARShip-2.0](https://radars.ac.cn/web/data/getData?newsColumnId=1e6ecbcc-266d-432c-9c8a-0b9a922b5e85) | Object Detection | 1000 * 1000 | 1 | 300 | 1 | tiff | HBB | 1~3m | __ | SAR | GF3 | 2020 | Journal of Radars | https://radars.ac.cn/web/data/getData?newsColumnId=1e6ecbcc-266d-432c-9c8a-0b9a922b5e85 | https://aistudio.baidu.com/aistudio/datasetdetail/54270 |
+| 2-40 | [GF2021 Segmentation Dataset of Offshore Aquaculture in High-Resolution SAR Images](http://sw.chreos.org/challenge/dataset/4) | Object Detection | 600 * 600, 1024 * 1024, 2048 * 2048 | 1 | 300+ | 6 | tif | HBB | 1m | __ | SAR | GF3 | 2021 | Chinese Academy of Sciences | http://sw.chreos.org/challenge/dataset/4 |  |
+|  |  | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ |  |  |
+| [3-1](https://aistudio.baidu.com/aistudio/datasetdetail/100009) | [SPARCS](https://www.usgs.gov/core-science-systems/nli/landsat/spatial-procedures-automated-removal-cloud-and-shadow-sparcs) | Image Segmentation   | 1000 * 1000                            | 10        | 80       | 7      | tif      | png             | 30m                | __           | Satellite image           | Landsat8                                             | 2014     | University of Tennessee Knoxville                         | https://www.usgs.gov/core-science-systems/nli/landsat/spatial-procedures-automated-removal-cloud-and-shadow-sparcs | https://aistudio.baidu.com/aistudio/datasetdetail/100009 |
+| [3-2](https://aistudio.baidu.com/aistudio/datasetdetail/55102) | [Zurich Summer](https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0) | Image Segmentation | 1000± * 1000± | 4 | 20 | 8 | tif | tif | 0.62m | __ | Satellite image | QuickBird | 2015 | The University of Edinburgh | https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0 | https://aistudio.baidu.com/aistudio/datasetdetail/55102 |
+| [3-3](https://aistudio.baidu.com/aistudio/datasetdetail/57293) | [ERM PAIW](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52776) | Image Segmentation | 4000± * 4000± | 3 | 41 | 1 | png, jpg | tif | __ | __ | Aerial image | Aerial image | 2015 | German Aerospace Center (DLR) | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52776 | https://aistudio.baidu.com/aistudio/datasetdetail/57293 |
+| [3-4](https://aistudio.baidu.com/aistudio/datasetdetail/57162) | [HD Maps](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52773) | Image Segmentation | 4000± * 4000± | 3 | 20 | 5 | png | png | __ | __ | Aerial image | Aerial image | 2016 | German Aerospace Center (DLR) | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-52773 | https://aistudio.baidu.com/aistudio/datasetdetail/57162 |
+| [3-5](https://aistudio.baidu.com/aistudio/datasetdetail/55424) | [BDCI2017](https://www.datafountain.cn/competitions/270)     | Image Segmentation   | 8000± * 8000±                          | 3         | 5        | 5      | png      | png             | __                 | __           | Satellite image           | __                                                   | 2017     | BDCI                                                      | https://www.datafountain.cn/competitions/270                 | https://aistudio.baidu.com/aistudio/datasetdetail/55424  |
+| [3-6](https://aistudio.baidu.com/aistudio/datasetdetail/56140) | [Learning Aerial Image Segmentation From Online Maps](https://zenodo.org/record/1154821) | Image Segmentation | 3000± * 3000± | 3 | 1671 | 2 | png | png | __ | __ | Satellite image, Aerial image | GoogleMaps, OpenStreetMap | 2017 | ETH Zürich | https://zenodo.org/record/1154821 | https://aistudio.baidu.com/aistudio/datasetdetail/56140 |
+| 3-7 | [Inria Aerial Image Labeling Dataset](https://project.inria.fr/aerialimagelabeling/) | Image Segmentation | 5000 * 5000 | 3 | 180 | 1 | tif | tif | 0.3m | __ | Aerial image | Aerial image | 2017 | Inria Sophia Antipolis-Méditerranée | https://project.inria.fr/aerialimagelabeling/ |  |
+| [3-8](https://aistudio.baidu.com/aistudio/datasetdetail/55589) | [WHDLD](https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0) | Image Segmentation   | 256 * 256                              | 3         | 4940     | 6      | png      | png             | __                 | __           | Satellite image           | UCMerced                                             | 2018     | Wuhan University                                                  | https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0 | https://aistudio.baidu.com/aistudio/datasetdetail/55589  |
+| [3-9](https://aistudio.baidu.com/aistudio/datasetdetail/55005) | [DLRSD](https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0) | Image Segmentation | 256 * 256 | 3 | 2100 | 17 | tif | png | 1 foot | __ | Satellite image | USGS National Map | 2018 | Wuhan University | https://sites.google.com/view/zhouwx/dataset#h.p_hQS2jYeaFpV0 | https://aistudio.baidu.com/aistudio/datasetdetail/55005 |
+| [3-10](https://aistudio.baidu.com/aistudio/datasetdetail/136777) | [Multi-Sensor Land-Cover Classification](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-51180) | Image Segmentation | 5957 * 8149, 6031 * 5596 | 4, 1 | 2 | 4 | tif | tif | __ | __ | Satellite image | Sentinel1B, Sentinel2A | 2018 | German Aerospace Center (DLR) | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-51180 | https://aistudio.baidu.com/aistudio/datasetdetail/136777 |
+| [3-11](https://aistudio.baidu.com/aistudio/datasetdetail/55222) | [Aeroscapes](https://github.com/ishann/aeroscapes)           | Image Segmentation   | 720 * 720                              | 3         | 3269     | 11     | jpg      | png             | __                 | __           | Aerial image           | Aerial image                                             | 2018     | Carnegie Mellon University                                | https://github.com/ishann/aeroscapes                         | https://aistudio.baidu.com/aistudio/datasetdetail/55222  |
+| [3-12](https://aistudio.baidu.com/aistudio/datasetdetail/74274) | [AIRS](https://www.airs-dataset.com/)                        | Image Segmentation   | 10000 * 10000                          | 3         | 190      | 1      | tif      | tif             | 0.075m             | __           | Aerial image           | LINZ Data Service                                    | 2018     | University of Tokyo                                       | https://www.airs-dataset.com/                                | https://aistudio.baidu.com/aistudio/datasetdetail/74274  |
+| [3-13](https://aistudio.baidu.com/aistudio/datasetdetail/98991) | [RIT-18](https://github.com/rmkemker/RIT-18) | Image Segmentation | 9393 * 5642, 8833 * 6918, 12446 * 7654 | 7 | 3 | 18 | mat | npy | 0.047m | __ | Aerial image | Tetracam MicroMCA6 | 2018 | Rochester Institute of Technology | https://github.com/rmkemker/RIT-18 | https://aistudio.baidu.com/aistudio/datasetdetail/98991 |
+| [3-14](https://aistudio.baidu.com/aistudio/datasetdetail/79283) | [DroneDeploy](https://github.com/dronedeploy/dd-ml-segmentation-benchmark) | Image Segmentation | 6000± * 6000± | 3 | 55 | 7 | tif | png | 0.1m | __ | Aerial image | Drones | 2019 | DroneDeploy | https://github.com/dronedeploy/dd-ml-segmentation-benchmark | https://aistudio.baidu.com/aistudio/datasetdetail/79283 |
+| [3-15](https://aistudio.baidu.com/aistudio/datasetdetail/74848) | [RoadTracer](https://github.com/mitroadmaps/roadtracer/) | Image Segmentation | 4096 * 4096 | 3 | 3000 | 1 | png | png | 0.6m | __ | Satellite image | GoogleEarth, OSM | 2019 | MIT | https://github.com/mitroadmaps/roadtracer/ | https://aistudio.baidu.com/aistudio/datasetdetail/74848 |
+| [3-16](https://aistudio.baidu.com/aistudio/datasetdetail/121515) | [Bijie Landslide Dataset](http://gpcv.whu.edu.cn/data/Bijie_pages.html) | Image Segmentation   | 200± * 200±                            | 3         | 771      | 1      | png      | png             | 0.68m              | __           | Satellite image           | TripleSat                                            | 2019     | Wuhan University                                                  | http://gpcv.whu.edu.cn/data/Bijie_pages.html                 | https://aistudio.baidu.com/aistudio/datasetdetail/121515 |
+| [3-17](https://aistudio.baidu.com/aistudio/datasetdetail/135648) | [GF2 Dataset for 3DFGC](http://gpcv.whu.edu.cn/data/3DFGC_pages.html) | Image Segmentation | 1417 * 2652, 1163 * 2120 | 4 | 11 | 5 | tif | tif | 4m | __ | Satellite image | GF2 | 2019 | Wuhan University | http://gpcv.whu.edu.cn/data/3DFGC_pages.html | https://aistudio.baidu.com/aistudio/datasetdetail/135648 |
+| [3-18](https://aistudio.baidu.com/aistudio/datasetdetail/134732) | [Semantic Drone Dataset](http://dronedataset.icg.tugraz.at)  | Image Segmentation   | 6000 * 4000                            | 3         | 400      | 22     | png      | png             | __                 | __           | Aerial image           | Aerial image                                             | 2019     | Graz University of Technology                             | http://dronedataset.icg.tugraz.at                            | https://aistudio.baidu.com/aistudio/datasetdetail/134732 |
+| [3-19](https://aistudio.baidu.com/aistudio/datasetdetail/136174) | [WHU Cloud Dataset](http://gpcv.whu.edu.cn/data/WHU_Cloud_Dataset.html) | Image Segmentation | 512 * 512 | 3 | 730 | 1 | tif | tif | __ | __ | Satellite image | Landsat 8 | 2020 | Wuhan University | http://gpcv.whu.edu.cn/data/WHU_Cloud_Dataset.html | https://aistudio.baidu.com/aistudio/datasetdetail/136174 |
+| [3-20](https://aistudio.baidu.com/aistudio/datasetdetail/76629) | [Land Cover from Aerial Imagery (landcover.ai)](https://landcover.ai/) | Image Segmentation | 9000 * 9500, 4200 * 4700 | 3 | 41 | 3 | tif | tif | 0.25~0.5m | __ | Aerial image | Aerial image | 2020 | Linux Polska | https://landcover.ai/ | https://aistudio.baidu.com/aistudio/datasetdetail/76629 |
+| [3-21](https://aistudio.baidu.com/aistudio/datasetdetail/55774) | [UAVid](https://www.uavid.nl/) | Image Segmentation | 4096 * 2160, 3840 * 2160 | 3 | 300 | 8 | png | png | __ | __ | Aerial image | Aerial image | 2020 | University of Twente | https://www.uavid.nl/ | https://aistudio.baidu.com/aistudio/datasetdetail/55774 |
+| [3-22](https://aistudio.baidu.com/aistudio/datasetdetail/51568) | [AI + Remote sensing image](https://www.datafountain.cn/competitions/457) | Image Segmentation | 256 * 256 | 3 | 100000 | 8, 17 | tif | tif, png | 0.1~4m | __ | Satellite image | GF1, GF2, GF6, GJ2, BJ2, Aerial image | 2020 | Organizing Committee of the National Artificial Intelligence Competition | https://www.datafountain.cn/competitions/457 | https://aistudio.baidu.com/aistudio/datasetdetail/51568 |
+| [3-23](https://aistudio.baidu.com/aistudio/datasetdetail/56051) | [BDCI2020](https://www.datafountain.cn/competitions/475)     | Image Segmentation   | 256 * 256                              | 3         | 145981   | 7      | jpg      | png             | __                 | __           | Satellite image           | __                                                   | 2020     | BDCI                                                      | https://www.datafountain.cn/competitions/475                 | https://aistudio.baidu.com/aistudio/datasetdetail/56051  |
+| [3-24](https://aistudio.baidu.com/aistudio/datasetdetail/70361) | [mini Inria Aerial Image Labeling Dataset](https://tianchi.aliyun.com/competition/entrance/531872/introduction) | Image Segmentation | 512 * 512 | 3 | 32500 | 1 | jpg | csv | 0.3m | __ | Aerial image | Aerial image | 2021 | Tianchi Competition | https://tianchi.aliyun.com/competition/entrance/531872/introduction | https://aistudio.baidu.com/aistudio/datasetdetail/70361 |
+| [3-25](https://aistudio.baidu.com/aistudio/datasetdetail/121200) | [LoveDA](https://github.com/Junjue-Wang/LoveDA)              | Image Segmentation   | 1024 * 1024                            | 3         | 5987     | 7      | png      | png             | 0.3m               | __           | Satellite image           | GoogleEarth                                          | 2021     | Wuhan University                                                  | https://github.com/Junjue-Wang/LoveDA                        | https://aistudio.baidu.com/aistudio/datasetdetail/121200 |
+| [3-26](https://aistudio.baidu.com/aistudio/datasetdetail/129225) | [MiniFrance-DFC22](https://www.grss-ieee.org/community/technical-committees/2022-ieee-grss-data-fusion-contest/) | Image Segmentation | 2000± * 2000± | 3 | 2322 | 15 | tif | tif | __ | __ | Aerial image | Aerial image | 2022 | IADF TC | https://www.grss-ieee.org/community/technical-committees/2022-ieee-grss-data-fusion-contest/ | https://aistudio.baidu.com/aistudio/datasetdetail/129225 |
+| [3-27](https://aistudio.baidu.com/aistudio/datasetdetail/102929) | [Remote sensing building extraction dataset_M](https://aistudio.baidu.com/aistudio/datasetdetail/102929) | Image Segmentation | 512 * 512 | 3 | 447686 | 1 | __ | __ | __ | __ | __ | __ | 2022 | Chongqing Jiaotong University | https://aistudio.baidu.com/aistudio/datasetdetail/102929 | https://aistudio.baidu.com/aistudio/datasetdetail/102929 |
+| [3-28](https://aistudio.baidu.com/aistudio/datasetdetail/56961) | [Massachusetts Roads](https://www.cs.toronto.edu/~vmnih/data/) | Image Segmentation   | 1500 * 1500                            | 3         | 804      | 1      | png      | png             | 1m                 | __           | Aerial image           | Aerial image                                             | 2013     | University of Toronto                                     | https://www.cs.toronto.edu/~vmnih/data/                      | https://aistudio.baidu.com/aistudio/datasetdetail/56961  |
+| [3-29](https://aistudio.baidu.com/aistudio/datasetdetail/57019) | [Massachusetts Buildings](https://www.cs.toronto.edu/~vmnih/data/) | Image Segmentation | 1500 * 1500 | 3 | 151 | 1 | png | png | 1m | __ | Aerial image | Aerial image | 2013 | University of Toronto | https://www.cs.toronto.edu/~vmnih/data/ | https://aistudio.baidu.com/aistudio/datasetdetail/57019 |
+| [3-30](https://aistudio.baidu.com/aistudio/datasetdetail/55681) | [DeepGlobe Land Cover Classification Challenge](http://deepglobe.org/challenge.html) | Image Segmentation | 2448 * 2448 | 3 | 803 | 7 | jpg | png | 0.5m | __ | Satellite image | DigitalGlobe | 2018 | CVPR | http://deepglobe.org/challenge.html | https://aistudio.baidu.com/aistudio/datasetdetail/55681 |
+| [3-31](https://aistudio.baidu.com/aistudio/datasetdetail/55682) | [DeepGlobe Road Detection Challenge](http://deepglobe.org/challenge.html) | Image Segmentation | 1024 * 1024 | 3, 14 | 6226 | 1 | jpg | png | 0.5m | __ | Satellite image | DigitalGlobe | 2018 | CVPR | http://deepglobe.org/challenge.html | https://aistudio.baidu.com/aistudio/datasetdetail/55682 |
+| [3-32](https://aistudio.baidu.com/aistudio/datasetdetail/56341) | [WHU Building Dataset, Satellite dataset Ⅰ (global cities)](https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html) | Image Segmentation | 512 * 512 | 3 | 204 | 1 | tif | tif | 0.3~2.5m | __ | Satellite image | QuickBird, WorldView series, IKONOS, ZY3 | 2019 | Wuhan University | https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html | https://aistudio.baidu.com/aistudio/datasetdetail/56341 |
+| [3-33](https://aistudio.baidu.com/aistudio/datasetdetail/56356) | [WHU Building Dataset, Satellite dataset Ⅱ (East Asia)](https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html) | Image Segmentation | 512 * 512 | 3 | 17388 | 1 | tif | tif | 0.45m | __ | Satellite image | __ | 2019 | Wuhan University | https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html | https://aistudio.baidu.com/aistudio/datasetdetail/56356 |
+| [3-34](https://aistudio.baidu.com/aistudio/datasetdetail/56502) | [WHU Building Dataset, Aerial imagery dataset](https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html) | Image Segmentation | 512 * 512 | 3 | 8189 | 1 | tif | tif | 0.3m | __ | Aerial image | __ | 2019 | Wuhan University | https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html | https://aistudio.baidu.com/aistudio/datasetdetail/56502 |
+| [3-35](https://aistudio.baidu.com/aistudio/datasetdetail/98275) | [ORSSD](https://github.com/rmcong/ORSSD-dataset)             | Image Segmentation   | 500± * 500±                            | 3         | 800      | 8      | jpg      | png             | __                 | __           | Satellite image           | GoogleEarth                                          | 2019     | Beijing Jiaotong University                                              | https://github.com/rmcong/ORSSD-dataset                      | https://aistudio.baidu.com/aistudio/datasetdetail/98275  |
+| [3-36](https://aistudio.baidu.com/aistudio/datasetdetail/98372) | [EORSSD](https://github.com/rmcong/EORSSD-dataset) | Image Segmentation | 500± * 500± | 3 | 2000 | 8 | jpg | png | __ | __ | Satellite image | GoogleEarth | 2020 | Beijing Jiaotong University | https://github.com/rmcong/EORSSD-dataset | https://aistudio.baidu.com/aistudio/datasetdetail/98372 |
+| [3-37](https://aistudio.baidu.com/aistudio/datasetdetail/56236) | [38-Cloud: A Cloud Segmentation Dataset](https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset) | Image Segmentation | 384 * 384 | 4 | 8400 | 1 | tif | tif | 30m | __ | Satellite image | Landsat8 | 2018 | Simon Fraser University | https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset | https://aistudio.baidu.com/aistudio/datasetdetail/56236 |
+| [3-38](https://aistudio.baidu.com/aistudio/datasetdetail/56839) | [95-Cloud: An Extension to 38-Cloud Dataset](https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset) | Image Segmentation | 384 * 384 | 4 | 34701 | 1 | tif | tif | 30m | __ | Satellite image | Landsat8 | 2020 | Simon Fraser University | https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset | https://aistudio.baidu.com/aistudio/datasetdetail/56839 |
+| [3-39](https://aistudio.baidu.com/aistudio/datasetdetail/55153) | [Potsdam](https://www2.isprs.org/commissions/comm2/wg4/benchmark/semantic-labeling/) | Image Segmentation | 6000 * 6000 | 3 | 38 | 6 | tif | tif | 0.05m | __ | Aerial image | Aerial image | 2012 | ISPRS | https://www2.isprs.org/commissions/comm2/wg4/benchmark/semantic-labeling/ | https://aistudio.baidu.com/aistudio/datasetdetail/55153 |
+| [3-40](https://aistudio.baidu.com/aistudio/datasetdetail/55408) | [Vaihingen](https://www2.isprs.org/commissions/comm2/wg4/benchmark/semantic-labeling/) | Image Segmentation   | 2000± * 2000±                          | 3         | 33       | 6      | tif      | tif             | 0.09m              | __           | Aerial image           | Aerial image                                             | 2012     | ISPRS                                                     | https://www2.isprs.org/commissions/comm2/wg4/benchmark/semantic-labeling/ | https://aistudio.baidu.com/aistudio/datasetdetail/55408  |
+| [3-41](https://aistudio.baidu.com/aistudio/datasetdetail/54878) | [GID Fine Land-cover Classification (15 classes)](https://x-ytong.github.io/project/GID.html) | Image Segmentation | 7200 * 6800 | 3, 4 | 10 | 15 | tif | tif | 0.8~10m | __ | Satellite image | GF2 | 2018 | Wuhan University | https://x-ytong.github.io/project/GID.html | https://aistudio.baidu.com/aistudio/datasetdetail/54878 |
+| [3-42](https://aistudio.baidu.com/aistudio/datasetdetail/54934) | [GID Large-scale Classification (5 classes)](https://x-ytong.github.io/project/GID.html) | Image Segmentation | 7200 * 6800 | 3, 4 | 150 | 5 | tif | tif | 0.8~10m | __ | Satellite image | GF2, GF1, JL1, ZY3, Sentinel2A, GoogleEarth | 2018 | Wuhan University | https://x-ytong.github.io/project/GID.html | https://aistudio.baidu.com/aistudio/datasetdetail/54934 |
+| [3-43](https://aistudio.baidu.com/aistudio/datasetdetail/75675) | [UDD5](https://github.com/MarcWong/UDD) | Image Segmentation | 4096± * 2160± | 3 | 160 | 5 | png | png | __ | __ | Aerial image | DJI Phantom 4 | 2018 | Peking University | https://github.com/MarcWong/UDD | https://aistudio.baidu.com/aistudio/datasetdetail/75675 |
+| [3-44](https://aistudio.baidu.com/aistudio/datasetdetail/75675) | [UDD6](https://github.com/MarcWong/UDD) | Image Segmentation | 4096± * 2160± | 3 | 141 | 6 | png | png | __ | __ | Aerial image | DJI Phantom 4 | 2018 | Peking University | https://github.com/MarcWong/UDD | https://aistudio.baidu.com/aistudio/datasetdetail/75675 |
+| [3-45](https://aistudio.baidu.com/aistudio/datasetdetail/57579) | [BH POOLS](http://www.patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/) | Image Segmentation   | 3840 * 2160                            | 3         | 200      | 1      | jpg      | png             | __                 | __           | Satellite image           | GoogleEarth                                          | 2020     | Federal University of Minas Gerais                        | http://www.patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/ | https://aistudio.baidu.com/aistudio/datasetdetail/57579  |
+| [3-46](https://aistudio.baidu.com/aistudio/datasetdetail/57579) | [BH WATERTANKS](http://www.patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/) | Image Segmentation   | 3840 * 2160                            | 3         | 200      | 1      | jpg      | png             | __                 | __           | Satellite image           | GoogleEarth                                          | 2020     | Federal University of Minas Gerais                        | http://www.patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/ | https://aistudio.baidu.com/aistudio/datasetdetail/57579  |
+| 3-47                                                         | [BDCI2017](https://www.datafountain.cn/competitions/270)     | Image Segmentation   | 8000± * 8000±                          | 3         | 5        | 5      | png      | png             | __                 | __           | __                 | __                                                   | 2017     | __                                                        | https://www.datafountain.cn/competitions/270                 |                                                          |
+| 3-48 | [DeepGlobe Building Extraction Challenge](http://deepglobe.org/challenge.html) | Image Segmentation | 650 * 650 | 8 | 254586 | 1 | tif | tif | 1.24m | __ | Satellite image | WorldView3 | 2018 | CVPR | http://deepglobe.org/challenge.html |  |
+| 3-49                                                         | [DLR-SkyScapes](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58694) | Image Segmentation   | 5616 * 3744                            | 3         | 16       | 31     | __       | __              | 0.13m              | __           | Aerial image           | Aerial image                                             | 2019     | German Aerospace Center                                   | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58694 |                                                          |
+| [3-50](https://aistudio.baidu.com/aistudio/datasetdetail/135044) | [Barley Remote Sensing Dataset](https://tianchi.aliyun.com/competition/entrance/231717/information) | Image Segmentation | 47161 * 50141, 77470 * 40650 | 3 | 2 | 4 | png | png | __ | __ | Aerial image | Aerial image | 2019 | __ | https://tianchi.aliyun.com/competition/entrance/231717/information | https://aistudio.baidu.com/aistudio/datasetdetail/135044 |
+| 3-51                                                         | [rscupseg](http://rscup.bjxintong.com.cn/#/theme/3)          | Image Segmentation   | 7200 * 6800                            | 4         | 20       | 6      | tif      | tif             | 4m                 | __           | Satellite image           | GF2                                                  | 2019     | __                                                        | http://rscup.bjxintong.com.cn/#/theme/3                      |                                                          |
+| 3-52                                                         | [2020 Digital China Innovation Competition](https://tianchi.aliyun.com/competition/entrance/231767/introduction) | Image Segmentation   | 4000± * 4000±                          | 3         | 8        | 1      | tif      | tif             | 0.8m               | __           | Satellite image           | GF2                                                  | 2020     | __                                                        | https://tianchi.aliyun.com/competition/entrance/231767/introduction |                                                          |
+| 3-53 | ["Huawei Cloud Cup" 2020 AI Innovation Application Competition](https://competition.huaweicloud.com/information/1000041322/circumstance?track=107) | Image Segmentation | 10391 * 33106, 34612 * 29810 | 3 | 2 | 1 | png | png | 0.8m | __ | Satellite image | BJ2 | 2020 | __ | https://competition.huaweicloud.com/information/1000041322/circumstance?track=107 |  |
+| 3-54 | [2021 National Digital Ecological Innovation Competition](https://tianchi.aliyun.com/competition/entrance/531860/rankingList) | Image Segmentation | 256 * 256 | 4 | 16017 | __ | tif | tif | 0.8~2m | __ | Satellite image | GF | 2021 | Zhejiang University | https://tianchi.aliyun.com/competition/entrance/531860/rankingList |  |
+| 3-55                                                         | [LRSNY](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9333652) | Image Segmentation   | 1000 * 1000                            | 3         | 1368     | __     | __       | __              | 0.5m               | __           | Satellite image           | __                                                   | 2021     | IEEE                                                      | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9333652 |                                                          |
+| 3-56 | [GF2021 Sea Ice Target Monitoring Dataset in Visible-Light Images from the HY-1 (Ocean-1) Satellite](http://sw.chreos.org/challenge/dataset/5) | Image Segmentation | 512 * 512 | 3 | 2500+ | 1 | png | png | 50m | __ | Satellite image | HY1 | 2021 | Chinese Academy of Sciences | http://sw.chreos.org/challenge/dataset/5 |  |
+| 3-57 | [rsipacseg](http://rsipac.whu.edu.cn/subject_one) | Image Segmentation | 512 * 512 | 4 | 70000 | 9, 47 | tif | png | 0.8~2m | __ | __ | __ | 2021 | Wuhan University | http://rsipac.whu.edu.cn/subject_one |  |
+| 3-58 | [2021 National Artificial Intelligence Innovation Contest - Cultivated Land Recognition with Remote Sensing Images](http://www.aiinnovation.com.cn/#/AIcaict/trackDetail) | Image Segmentation | 256 * 256 | 3 | 50000 | 1 | png | png | 1m | __ | __ | __ | 2021 | __ | http://www.aiinnovation.com.cn/#/AIcaict/trackDetail |  |
+| [3-59](https://aistudio.baidu.com/aistudio/datasetdetail/82020) | [Salinas scene](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 512 * 217 | 224 | 1 | 16 | mat | mat | 3.7m | 10nm | hyperspectral | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | 2011 | __ | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/82020 |
+| [3-60](https://aistudio.baidu.com/aistudio/datasetdetail/82020) | [Salinas-A scene](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 83 * 86 | 224 | 1 | 6 | mat | mat | 3.7m | 10nm | hyperspectral | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | 2011 | __ | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/82020 |
+| [3-61](https://aistudio.baidu.com/aistudio/datasetdetail/81260) | [Pavia Centre scene](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 1096 * 1096 | 102 | 1 | 9 | mat | mat | 1.3m | 4.3nm | hyperspectral | Reflective Optics System Imaging Spectrometer (ROSIS) | 2011 | Pavia University | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/81260 |
+| [3-62](https://aistudio.baidu.com/aistudio/datasetdetail/81260) | [Pavia University scene](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 610 * 610 | 103 | 1 | 9 | mat | mat | 1.3m | 4.3nm | hyperspectral | Reflective Optics System Imaging Spectrometer (ROSIS) | 2011 | Pavia University | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/81260 |
+| [3-63](https://aistudio.baidu.com/aistudio/datasetdetail/83251) | [Washington DC Mall](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 1280 * 307 | 191 | 1 | 7 | tif | tif | __ | 10nm | hyperspectral | Airborne hyperspectral data (HYDICE) | 2013 | Spectral Information Technology | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/83251 |
+| [3-64](https://aistudio.baidu.com/aistudio/datasetdetail/81800) | [Kennedy Space Center](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation | 512 * 614 | 176 | 1 | 13 | mat | mat | 18m | 10nm | hyperspectral | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | 2014 | Center for Space Research, The University of Texas at Austin | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/81800 |
+| [3-65](https://aistudio.baidu.com/aistudio/datasetdetail/82578) | [Botswana](http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes) | Image Segmentation   | 1476 * 256                             | 145       | 1        | 14     | mat      | mat             | 30m                | __           | hyperspectral             | EO1                                                  | 2014     | __                                                        | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes | https://aistudio.baidu.com/aistudio/datasetdetail/82578  |
+| [3-66](https://aistudio.baidu.com/aistudio/datasetdetail/80970) | [Indian Pines](https://purr.purdue.edu/publications/1947/1) | Image Segmentation | 145 * 145, 614 * 1848, 2678 * 614 | 224 | 3 | 16 | tif | tif | __ | 10nm | hyperspectral | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | 2015 | Purdue University | https://purr.purdue.edu/publications/1947/1 | https://aistudio.baidu.com/aistudio/datasetdetail/80970 |
+| [3-67](https://aistudio.baidu.com/aistudio/datasetdetail/80840) | [HyRANK](https://www2.isprs.org/commissions/comm3/wg4/hyrank/) | Image Segmentation | 250± * 1000± | 176 | 5 | 14 | tif | tif | 30m | __ | hyperspectral | EO1 | 2018 | National Technical University of Athens | https://www2.isprs.org/commissions/comm3/wg4/hyrank/ | https://aistudio.baidu.com/aistudio/datasetdetail/80840 |
+| [3-68](https://aistudio.baidu.com/aistudio/datasetdetail/100218) | [Xiong'an hyperspectral dataset](http://www.hrs-cas.com/a/share/shujuchanpin/2019/0501/1049.html) | Image Segmentation | 3750 * 1580 | 250 | 1 | 19 | img | img | 0.5m | 2.4nm | hyperspectral | Airborne hyperspectral data (full-spectrum multimodal imaging spectrometer of the GF airborne system) | 2019 | Chinese Academy of Sciences | http://www.hrs-cas.com/a/share/shujuchanpin/2019/0501/1049.html | https://aistudio.baidu.com/aistudio/datasetdetail/100218 |
+| [3-69](https://aistudio.baidu.com/aistudio/datasetdetail/126831) | [AIR-PolSAR-Seg](https://radars.ac.cn/web/data/getData?newsColumnId=1e6ecbcc-266d-432c-9c8a-0b9a922b5e85) | Image Segmentation | 512 * 512 | 4 | 500 | 6 | tiff | png | 8m | __ | SAR | __ | 2021 | Journal of Radars | https://radars.ac.cn/web/data/getData?newsColumnId=1e6ecbcc-266d-432c-9c8a-0b9a922b5e85 | https://aistudio.baidu.com/aistudio/datasetdetail/126831 |
+| 3-70 | [GF2021 Segmentation Dataset of Nearshore Aquaculture Farms in High-Resolution SAR Images](http://sw.chreos.org/challenge/dataset/3) | Image Segmentation | 512 * 512, 1024 * 1024, 2048 * 2048 | 1 | 6000+ | 1 | png | png | 1~3m | __ | SAR | HS1, GF3 | 2021 | Chinese Academy of Sciences | http://sw.chreos.org/challenge/dataset/3 |  |
+|  |  | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ | __ |  |  |
+| [4-1](https://aistudio.baidu.com/aistudio/datasetdetail/77781) | [SZTAKI INRIA AirChange](http://web.eee.sztaki.hu/remotesensing/airchange_benchmark.html) | Change Detection | 952 * 640 | 3 | 26 | 1 | bmp | bmp | 1.5m | __ | Satellite image | __ | 2009 | MTA SZTAKI | http://web.eee.sztaki.hu/remotesensing/airchange_benchmark.html | https://aistudio.baidu.com/aistudio/datasetdetail/77781 |
+| [4-2](https://aistudio.baidu.com/aistudio/datasetdetail/79596) | [AIST Building Change Detection (fixed scale)](https://github.com/gistairc/ABCDdataset) | Change Detection | 160 * 160 | 6 | 8506 | 1 | tif | csv | 0.4m | __ | Aerial image | Aerial image | 2017 | AIST | https://github.com/gistairc/ABCDdataset | https://aistudio.baidu.com/aistudio/datasetdetail/79596 |
+| [4-3](https://aistudio.baidu.com/aistudio/datasetdetail/79596) | [AIST Building Change Detection (resized)](https://github.com/gistairc/ABCDdataset) | Change Detection | 128 * 128 | 3 | 8446 | 1 | tif | csv | 0.5m | __ | Aerial image | Aerial image | 2017 | AIST | https://github.com/gistairc/ABCDdataset | https://aistudio.baidu.com/aistudio/datasetdetail/79596 |
+| [4-4](https://aistudio.baidu.com/aistudio/datasetdetail/79035) | [WHU Building Change Detection Dataset](https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html) | Change Detection | 32207 * 15354 | 3 | 2 | 1 | tif | tif, shp | 0.2m | __ | Aerial image | Aerial image | 2018 | Wuhan University | https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html | https://aistudio.baidu.com/aistudio/datasetdetail/79035 |
+| [4-5](https://aistudio.baidu.com/aistudio/datasetdetail/78676) | [Season varying](https://paperswithcode.com/dataset/cdd-dataset-season-varying) | Change Detection | 256 * 256 | 3 | 1600 | 1 | jpg | jpg | 0.03~1m | __ | Satellite image | GoogleEarth (DigitalGlobe) | 2018 | GosNIIAS | https://paperswithcode.com/dataset/cdd-dataset-season-varying | https://aistudio.baidu.com/aistudio/datasetdetail/78676 |
+| [4-6](https://aistudio.baidu.com/aistudio/datasetdetail/72898) | [Onera Satellite Change Detection](https://rcdaudt.github.io/oscd/) | Change Detection   | 600 * 600                              | 13        | 48       | 1      | tif      | tif, png        | 10m                | __           | Satellite image           | Sentinel2                                            | 2018     | Université Paris-Saclay                                   | https://rcdaudt.github.io/oscd/                              | https://aistudio.baidu.com/aistudio/datasetdetail/72898  |
+| [4-7](https://aistudio.baidu.com/aistudio/datasetdetail/70452) | [Multi-temporal Scene WuHan](http://sigma.whu.edu.cn/newspage.php?q=2019_03_26) | Change Detection   | 7200 * 6000                            | 4         | 380      | 9      | tif      | jpg             | 1m                 | __           | Satellite image           | IKONOS sensor                                        | 2019     | Wuhan University                                                  | http://sigma.whu.edu.cn/newspage.php?q=2019_03_26            | https://aistudio.baidu.com/aistudio/datasetdetail/70452  |
+| [4-8](https://aistudio.baidu.com/aistudio/datasetdetail/139421) | [DSIFN Dataset](https://github.com/GeoZcx/A-deeply-supervised-image-fusion-network-for-change-detection-in-remote-sensing-images/tree/master/dataset) | Change Detection   | 512 * 512                              | 3         | 380      | 9      | tif      | png             | __                 | __           | Satellite image           | GoogleEarth                                          | 2019     | Wuhan University                                                  | https://github.com/GeoZcx/A-deeply-supervised-image-fusion-network-for-change-detection-in-remote-sensing-images/tree/master/dataset | https://aistudio.baidu.com/aistudio/datasetdetail/139421 |
+| [4-9](https://aistudio.baidu.com/aistudio/datasetdetail/86660) | [High Resolution Semantic Change](https://rcdaudt.github.io/hrscd/) | Change Detection   | 10000 * 10000                          | 3         | 291      | 6      | png      | png             | __                 | __           | Aerial image           | IGN's BD ORTHO database                              | 2020     | ETH Zürich                                                | https://rcdaudt.github.io/hrscd/                             | https://aistudio.baidu.com/aistudio/datasetdetail/86660  |
+| [4-10](https://aistudio.baidu.com/aistudio/datasetdetail/73203) | [xBD](https://xview2.org/dataset)                            | Change Detection   | 1024 * 1024                            | 3, 4, 8   | 22068    | 4      | png      | json            | __                 | __           | Satellite image           | DigitalGlobe                                         | 2020     | MIT                                                       | https://xview2.org/dataset                                   | https://aistudio.baidu.com/aistudio/datasetdetail/73203  |
+| [4-11](https://aistudio.baidu.com/aistudio/datasetdetail/75099) | [Google   dataset](https://github.com/daifeng2016/Change-Detection-Dataset-for-High-Resolution-Satellite-Imagery) | Change Detection   | 1000 * 1000~5000 * 5000                | 3         | 20       | 1      | tif      | png             | 0.55m              | __           | Satellite image           | GoogleEarth                                          | 2020     | IEEE                                                      | https://github.com/daifeng2016/Change-Detection-Dataset-for-High-Resolution-Satellite-Imagery | https://aistudio.baidu.com/aistudio/datasetdetail/75099  |
+| [4-12](https://aistudio.baidu.com/aistudio/datasetdetail/75459) | [LEVIR CD](https://justchenhao.github.io/LEVIR/)             | Change Detection   | 1024 * 1024                            | 3         | 1274     | 1      | png      | png             | 0.5m               | __           | Satellite image           | GoogleEarth                                          | 2020     | Beijing University of Aeronautics and Astronautics                                          | https://justchenhao.github.io/LEVIR/                         | https://aistudio.baidu.com/aistudio/datasetdetail/75459  |
+| [4-13](https://aistudio.baidu.com/aistudio/datasetdetail/53484) | [SenseEarth Change Detection](https://rs.sensetime.com/competition/index.html#/info) | Change Detection   | 512 * 512                              | 3         | 7630     | 6      | png      | png             | 0.5~ 3m            | __           | Satellite image           | __                                                   | 2020     | SenseTime                                                 | https://rs.sensetime.com/competition/index.html#/info        | https://aistudio.baidu.com/aistudio/datasetdetail/53484  |
+| [4-14](https://aistudio.baidu.com/aistudio/datasetdetail/87088) | [SEmantic Change detectiON Data](http://www.captain-whu.com/project/SCD/) | Change Detection   | 512 * 512                              | 3         | 9324     | 1      | png      | png             | __                 | __           | Aerial image           | Aerial image                                             | 2020     | Wuhan University                                                  | http://www.captain-whu.com/project/SCD/                      | https://aistudio.baidu.com/aistudio/datasetdetail/87088  |
+| [4-15](https://aistudio.baidu.com/aistudio/datasetdetail/104199) | [Sentinel 2 Multitemporal Cities Pairs](https://zenodo.org/record/4280482) | Change Detection   | 600 * 600                              | 14        | 3042     | __     | npy      | self-supervised | 10m                | __           | Satellite image           | Sentinel2                                            | 2020     | Wageningen University                                     | https://zenodo.org/record/4280482                            | https://aistudio.baidu.com/aistudio/datasetdetail/104199 |
+| [4-16](https://aistudio.baidu.com/aistudio/datasetdetail/98596) | [Sun Yat-sen University CD](https://github.com/liumency/SYSU-CD) | Change Detection   | 256 * 256                              | 3         | 40000    | 1      | png      | png             | 0.5m               | __           | Aerial image           | Aerial image                                             | 2021     | Sun Yat-sen University                                                  | https://github.com/liumency/SYSU-CD                          | https://aistudio.baidu.com/aistudio/datasetdetail/98596  |
+| [4-17](https://aistudio.baidu.com/aistudio/datasetdetail/133833) | [S2Looking](https://www.rsaicp.com/portal/dataDetail?id=30)  | Change Detection   | 1024 * 1024                            | 3         | 8000     | 1      | png      | png             | __                 | __           | Aerial image           | Domestically developed satellite series                                 | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=30               | https://aistudio.baidu.com/aistudio/datasetdetail/133833 |
+| [4-18](https://aistudio.baidu.com/aistudio/datasetdetail/132093) | [LEVIR-CD2](https://www.rsaicp.com/portal/dataDetail?id=27)  | Change Detection   | 1024 * 1024                            | 3         | 890      | 1      | png      | png             | __                 | __           | Satellite image           | GoogleEarth                                          | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=27               | https://aistudio.baidu.com/aistudio/datasetdetail/132093 |
+| 4-19                                                         | [[Fei Yueyun 2017\] Guangdong Government Data Innovation Competition](https://tianchi.aliyun.com/competition/entrance/231615/information) | Change Detection   | 3000~8106 * 15106                      | 4         | 8        | 1      | tif      | tif             | 0.65m              | __           | Satellite image           | quickbird                                            | 2017     | __                                                        | https://tianchi.aliyun.com/competition/entrance/231615/information |                                                          |
+| 4-20                                                         | [WHU dataset](https://ieeexplore.ieee.org/document/8444434)  | Change Detection   | 512 * 512                              | 3         | 8189     | 1      | __       | __              | 0.25~ 0.3m, 2.7m   | __           | Satellite image, Aerial image | QuickBird, WorldView series, IKONOS, ZY3, Aerial image | 2018     | Wuhan University                                                  | https://ieeexplore.ieee.org/document/8444434                 |                                                          |
+| 4-21                                                         | [Hi-UCD](https://arxiv.org/abs/2011.03247)                   | Change Detection   | 1024 * 1024                            | 3         | 2586     | 9      | __       | __              | 0.1m               | __           | __                 | __                                                   | 2020     | Wuhan University                                                  | https://arxiv.org/abs/2011.03247                             |                                                          |
+| 4-22                                                         | [rscupcd](http://rscup.bjxintong.com.cn/#/theme/4)           | Change Detection   | 960 * 960                              | 4         | 44       | 1      | tif      | tif             | __                 | __           | __                 | __                                                   | 2020     | __                                                        | http://rscup.bjxintong.com.cn/#/theme/4                      |                                                          |
+| 4-23                                                         | [gf2021 Building census and change detection data sets in high resolution visible light images](http://sw.chreos.org/challenge/dataset/5) | Change Detection   | 512 * 512                              | 3         | 5000     | 1      | png      | png             | __                 | __           | Satellite image           | GF2, JL1                                             | 2021     | Chinese Academy of Sciences                                                    | http://sw.chreos.org/challenge/dataset/5                     |                                                          |
+| 4-24                                                         | [WH-MAVS](http://sigma.whu.edu.cn/newspage.php?q=2021_06_27) | Change Detection   | 200*200                                | 3         | 47134    | 15     | __       | __              | 1.2m               | __           | Satellite image           | GoogleEarth                                          | 2021     | Wuhan University                                                  | http://sigma.whu.edu.cn/newspage.php?q=2021_06_27            |                                                          |
+| 4-25                                                         | [PRCV2021cd](https://captain-whu.github.io/PRCV2021_RS/dataset.html) | Change Detection   | 512 * 512                              | 3         | 16000    | 1      | png      | png             | __                 | __           | __                 | __                                                   | 2021     | PRCV                                                      | https://captain-whu.github.io/PRCV2021_RS/dataset.html       |                                                          |
+| [4-26](https://aistudio.baidu.com/aistudio/datasetdetail/126838) | [rsipac2021cd](http://rsipac.whu.edu.cn/subject_two)         | Change Detection   | 512 * 512                              | 3         | 6388     | 1      | tif      | png             | 1~ 2m              | __           | __                 | __                                                   | 2021     | Wuhan University                                                  | http://rsipac.whu.edu.cn/subject_two                         | https://aistudio.baidu.com/aistudio/datasetdetail/126838 |
+| [4-27](https://aistudio.baidu.com/aistudio/datasetdetail/89523) | [Change Detection Dataset](https://gitlab.citius.usc.es/hiperespectral/ChangeDetectionDataset) | Change Detection   | 984 * 740, 600 * 500, 390 * 200        | 224       | 6        | 3, 5   | mat      | mat             | __                 | __           | hyperspectral             | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS), EO-1 Hyperion          | 2019     | __                                                        | https://gitlab.citius.usc.es/hiperespectral/ChangeDetectionDataset | https://aistudio.baidu.com/aistudio/datasetdetail/89523  |
+| [4-28](https://aistudio.baidu.com/aistudio/datasetdetail/98802) | [River Change Dataset](http://crabwq.github.io/)             | Change Detection   | 463 * 241                              | 196       | 2        | 1      | mat      | mat             | 30m                | 10nm         | hyperspectral             | EO-1 Hyperion                                        | 2019     | Northwestern Polytechnical University                                              | http://crabwq.github.io/                                     | https://aistudio.baidu.com/aistudio/datasetdetail/98802  |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [5-1](https://aistudio.baidu.com/aistudio/datasetdetail/87506) | [Dstl Satellite Imagery Feature Detection](https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data) | Instance Segmentation   | 13348 * 3392, 837 * 848, 134 * 136     | 3, 16     | 57       | 10     | tif      | json            | 0.31m, 1.24m, 7.5m | __           | Satellite image           | WorldView3                                           | 2017     | Defence Science & Technology Laboratory                   | https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data | https://aistudio.baidu.com/aistudio/datasetdetail/87506  |
+| [5-2](https://aistudio.baidu.com/aistudio/datasetdetail/54858) | [Mapping Challenge](https://www.crowdai.org/challenges/mapping-challenge) | Instance Segmentation   | 300 * 300                              | 3         | 341058   | 1      | jpg      | json            | __                 | __           | Satellite image           | GoogleMap                                            | 2018     | crowdAI                                                   | https://www.crowdai.org/challenges/mapping-challenge         | https://aistudio.baidu.com/aistudio/datasetdetail/54858  |
+| [5-3](https://aistudio.baidu.com/aistudio/datasetdetail/76145) | [Open AI   Tanzania Building Footprint Segmentation Challenge](https://competitions.codalab.org/competitions/20100) | Instance Segmentation   | 40000± * 40000±                        | 3         | 13       | 3      | tif      | json            | __                 | __           | Aerial image           | Aerial image                                             | 2018     | jordan                                                    | https://competitions.codalab.org/competitions/20100          | https://aistudio.baidu.com/aistudio/datasetdetail/76145  |
+| [5-4](https://aistudio.baidu.com/aistudio/datasetdetail/105196) | [Sentinel 2 Cloud Mask Catalogue](https://zenodo.org/record/4172871) | Instance Segmentation   | 1022 * 1022                            | 13        | 513      | 3      | npy      | npy, shp        | 20m                | __           | Satellite image           | Sentinel2                                            | 2020     | __                                                        | https://zenodo.org/record/4172871                            | https://aistudio.baidu.com/aistudio/datasetdetail/105196 |
+| [5-5](https://aistudio.baidu.com/aistudio/datasetdetail/131700) | [CASIA-aircraft](https://www.rsaicp.com/portal/dataDetail?id=16) | Instance Segmentation   | 399 * 399                              | 3         | 40696    | 3      | jpg      | xml             | __                 | __           | Satellite image           | GoogleEarth                                          | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=16               | https://aistudio.baidu.com/aistudio/datasetdetail/131700 |
+| [5-6](https://aistudio.baidu.com/aistudio/datasetdetail/131586) | [CASIA-Ship](https://www.rsaicp.com/portal/dataDetail?id=14) | Instance Segmentation   | 1680 * 913                             | 3         | 791      | 1      | jpg      | json            | __                 | __           | Satellite image           | GoogleEarth                                          | 2021     | Chinese Academy of Sciences                                                    | https://www.rsaicp.com/portal/dataDetail?id=14               | https://aistudio.baidu.com/aistudio/datasetdetail/131586 |
+| 5-7                                                          | [Airbus Ship Detection Challenge](https://www.kaggle.com/c/airbus-ship-detection) | Instance Segmentation   | 768 * 768                              | 3         | 40000+   | 1      | jpg      | csv             | __                 | __           | Aerial image           | Aerial image                                             | 2018     | Airbus                                                    | https://www.kaggle.com/c/airbus-ship-detection               |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [6-1](https://aistudio.baidu.com/aistudio/datasetdetail/76804) | [MLRSNet](https://data.mendeley.com/datasets/7j9bv9vwsx/2)   | Multi-label Classification | 256 * 256                              | 3         | 109161   | 46     | jpg      | csv             | 0.1~ 10m           | __           | Satellite image           | GoogleEarth                                          | 2020     | China University of Geosciences                                              | https://data.mendeley.com/datasets/7j9bv9vwsx/2              | https://aistudio.baidu.com/aistudio/datasetdetail/76804  |
+| 6-2                                                          | [Planet: Understanding the Amazon from Space](https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data) | Multi-label Classification | 256 * 256                              | 4         | 150000+  | 17     | tif, jpg | csv             | 3~ 3.7m            | __           | Satellite image           | Planet                                               | 2017     | Planet                                                    | https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data |                                                          |
+| 6-3                                                          | [Big Earth Net](https://bigearth.net/)                       | Multi-label Classification | 120 * 120, 60 * 60, 20 * 20            | 3         | 590326   | 43     | tif      | json            | 10m, 20m, 60m      | __           | Satellite image           | Sentinel1, Sentinel2                                 | 2019     | Technical University of Berlin                                              | https://bigearth.net/                                        |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [7-1](https://aistudio.baidu.com/aistudio/datasetdetail/91853) | [UAV123](https://cemse.kaust.edu.sa/ivul/uav123)             | Object Tracking   | 720 * 1280                             | 3         | 110000   | 12     | jpg      | txt             | __                 | __           | Aerial image           | UAV (DJI S1000), UAV simulator                       | 2017     | King Abdullah University of Science and Technology        | https://cemse.kaust.edu.sa/ivul/uav123                       | https://aistudio.baidu.com/aistudio/datasetdetail/91853  |
+| [7-2](https://aistudio.baidu.com/aistudio/datasetdetail/138620) | [Dim aircraft target detection and tracking data set in infrared images against ground/air background](http://www.csdata.org/p/387/) | Object Tracking   | 256 * 256                              | 1         | 16177    | 1      | bmp      | txt             | 10~ 100m           | __           | Aerial image           | Aerial image                                             | 2019     | National University of Defense Technology                                              | http://www.csdata.org/p/387/                                 | https://aistudio.baidu.com/aistudio/datasetdetail/138620 |
+| [7-3](https://aistudio.baidu.com/aistudio/datasetdetail/138957) | [Infrared dim-small moving target detection dataset in complex background](https://www.scidb.cn/en/detail?dataSetId=808025946870251520) | Object Tracking   | 640 * 512                              | 1         | 150185   | 1      | png      | json            | __                 | __           | Aerial image           | Aerial image                                             | 2021     | National University of Defense Technology                                              | https://www.scidb.cn/en/detail?dataSetId=808025946870251520  | https://aistudio.baidu.com/aistudio/datasetdetail/138957 |
+| [7-4](https://aistudio.baidu.com/aistudio/datasetdetail/137459) | [VisDrone2019-SOT](https://github.com/VisDrone/VisDrone-Dataset) | Object Tracking   | 3840 * 2160                            | 3         | 1393000  | 10     | jpg      | txt             | 1~ 1.2m            | __           | Aerial image           | Aerial image                                             | 2018     | Tianjin University                                                  | https://github.com/VisDrone/VisDrone-Dataset                 | https://aistudio.baidu.com/aistudio/datasetdetail/137459 |
+| [7-5](https://aistudio.baidu.com/aistudio/datasetdetail/137018) | [VisDrone2019-MOT](https://github.com/VisDrone/VisDrone-Dataset) | Object Tracking   | 3840 * 2160                            | 3         | 39988    | 10     | jpg      | txt             | 1~ 1.2m            | __           | Aerial image           | Aerial image                                             | 2018     | Tianjin University                                                  | https://github.com/VisDrone/VisDrone-Dataset                 | https://aistudio.baidu.com/aistudio/datasetdetail/137018 |
+| [7-6](https://aistudio.baidu.com/aistudio/datasetdetail/138105) | [VISO-SOT](https://satvideodt.github.io/)                    | Object Tracking   | 1000 * 1000                            | 3         | 32825    | 4      | jpg      | txt             | 0.5~ 1.1m          | __           | Satellite image           | JL1                                                  | 2021     | National University of Defense Technology                                              | https://satvideodt.github.io/                                | https://aistudio.baidu.com/aistudio/datasetdetail/138105 |
+| [7-7](https://aistudio.baidu.com/aistudio/datasetdetail/138362) | [VISO-MOT](https://satvideodt.github.io/)                    | Object Tracking   | 1000 * 1000                            | 3         | 32825    | 4      | jpg      | txt             | 0.5~ 1.1m          | __           | Satellite image           | JL1                                                  | 2021     | National University of Defense Technology                                              | https://satvideodt.github.io/                                | https://aistudio.baidu.com/aistudio/datasetdetail/138362 |
+| 7-8                                                          | [rscuptrc](http://rscup.bjxintong.com.cn/#/theme/5)          | Object Tracking   | 1000± * 1000±                          | 3         | 6000     | 4      | jpg      | txt             | __                 | __           | __                 | __                                                   | 2019     | __                                                        | http://rscup.bjxintong.com.cn/#/theme/5                      |                                                          |
+| 7-9                                                          | [gf2021 Multi-target tracking data set in high resolution optical satellite video](http://sw.chreos.org/challenge/dataset/6) | Object Tracking   | 1920 * 1080                            | 3         | 20000±   | 2      | jpg      | txt             | 1~ 1.2m            | __           | Satellite image           | JL1                                                  | 2021     | Chinese Academy of Sciences                                                    | http://sw.chreos.org/challenge/dataset/6                     |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [8-1](https://aistudio.baidu.com/aistudio/datasetdetail/90740) | [UCM caption](https://github.com/201528014227051/RSICD_optimal) | Image Caption   | 256 * 256                              | 3         | 2100     | 21     | tif      | json            | 1 foot             | __           | Satellite image           | USGS National Map                                    | 2016     | Chinese Academy of Sciences                                                    | https://github.com/201528014227051/RSICD_optimal             | https://aistudio.baidu.com/aistudio/datasetdetail/90740  |
+| [8-2](https://aistudio.baidu.com/aistudio/datasetdetail/91126) | [Sydney   caption](https://github.com/201528014227051/RSICD_optimal) | Image Caption   | 500 * 500                              | 3         | 613      | 7      | tif      | json            | 0.5m               | __           | Satellite image           | GoogleEarth                                          | 2016     | Chinese Academy of Sciences                                                    | https://github.com/201528014227051/RSICD_optimal             | https://aistudio.baidu.com/aistudio/datasetdetail/91126  |
+| [8-3](https://aistudio.baidu.com/aistudio/datasetdetail/90307) | [RSICD](https://github.com/201528014227051/RSICD_optimal)    | Image Caption   | 224 * 224                              | 3         | 10921    | 30     | jpg      | json            | __                 | __           | Satellite image           | GoogleEarth, BaiduMap, MapABC, Tianditu              | 2017     | IEEE                                                      | https://github.com/201528014227051/RSICD_optimal             | https://aistudio.baidu.com/aistudio/datasetdetail/90307  |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+|                                                              |                                                              | __         | __                                     | __        | __       | __     | __       | __              | __                 | __           | __                 | __                                                   | __       | __                                                        |                                                              |                                                          |
+| [9-1](https://aistudio.baidu.com/aistudio/datasetdetail/103972) | [Aerial to Map](https://github.com/phillipi/pix2pix)         | Image Generation   | 600 * 600                              | 3         | 2094     | __     | jpg      | jpg             | __                 | __           | Satellite image           | GoogleMaps                                           | 2017     | UC Berkeley                                               | https://github.com/phillipi/pix2pix                          | https://aistudio.baidu.com/aistudio/datasetdetail/103972 |
+| [9-2](https://aistudio.baidu.com/aistudio/datasetdetail/134292) | [SateHaze1k](https://www.dropbox.com/s/k2i3p7puuwl2g59/Haze1k.zip?dl=0) | Image Generation   | 512 * 512                              | 3         | 1200     | __     | png      | png             | __                 | __           | Satellite image           | GF-2, GF-3                                           | 2017     | Tsinghua University                                                  | https://www.dropbox.com/s/k2i3p7puuwl2g59/Haze1k.zip?dl=0    | https://aistudio.baidu.com/aistudio/datasetdetail/134292 |
+| [9-3](https://aistudio.baidu.com/aistudio/datasetdetail/136652) | [WHU MVS](http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html) | Image Generation   | 768 * 384                              | 1         | 28400    | __     | png      | __              | 0.1m               | __           | Aerial image           | Aerial image                                             | 2020     | Wuhan University                                                  | http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html      | https://aistudio.baidu.com/aistudio/datasetdetail/136652 |
+| [9-4](https://aistudio.baidu.com/aistudio/datasetdetail/136652) | [WHU Stereo](http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html) | Image Generation   | 768 * 384                              | 1         | 21868    | __     | png      | __              | 0.1m               | __           | Aerial image           | Aerial image                                             | 2020     | Wuhan University                                                  | http://gpcv.whu.edu.cn/data/WHU_MVS_Stereo_dataset.html      | https://aistudio.baidu.com/aistudio/datasetdetail/136652 |
+| [9-5](https://aistudio.baidu.com/aistudio/datasetdetail/136567) | [WHU TCL SatMVS 1.0](http://gpcv.whu.edu.cn/data/whu_tlc.html) | Image Generation   | 5120 * 5120                            | 1         | 300      | __     | tif, jpg | __              | 2.1m, 2.5m         | __           | Satellite image           | ZY3                                                  | 2021     | Wuhan University                                                  | http://gpcv.whu.edu.cn/data/whu_tlc.html                     | https://aistudio.baidu.com/aistudio/datasetdetail/136567 |
+| [9-6](https://aistudio.baidu.com/aistudio/datasetdetail/136567) | [WHU TCL SatMVS 2.0](http://gpcv.whu.edu.cn/data/whu_tlc.html) | Image Generation   | 768 * 384                              | 1         | 5011     | __     | tif      | __              | 2.1m, 2.5m         | __           | Satellite image           | ZY3                                                  | 2021     | Wuhan University                                                  | http://gpcv.whu.edu.cn/data/whu_tlc.html                     | https://aistudio.baidu.com/aistudio/datasetdetail/136567 |
+| 9-7                                                          | [DLR-ACD](https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58354/) | Image Generation   | 3619 * 5226                            | 3         | 33       | 1      | __       | __              | 0.045~ 0.15m       | __           | Aerial image           | Aerial image                                             | 2019     | German Aerospace Center                                   | https://www.dlr.de/eoc/en/desktopdefault.aspx/tabid-12760/22294_read-58354/ |                                                          |
+| 9-8                                                          | [SEN12MS-CR](https://mediatum.ub.tum.de/1554803)             | Image Generation   | 256 * 256                              | 13, 2     | 122218   | __     | __       | __              | __                 | __           | Satellite image           | Sentinel1, Sentinel2                                 | 2020     | TUM                                                       | https://mediatum.ub.tum.de/1554803                           |                                                          |

+ 107 - 0
docs/data/rs_data_en.md

@@ -0,0 +1,107 @@
+# Introduction to Remote Sensing Data
+
+## 1 Definition of Remote Sensing and Remote Sensing Images
+
+In a broad sense, remote sensing refers to "remote perception", that is, the remote detection and perception of objects or natural phenomena without direct contact. Remote sensing in the narrow sense generally refers to electromagnetic wave remote sensing technology, that is, the process of detecting electromagnetic wave reflection characteristics by using sensors on a certain platform (such as aircraft or satellite) and extracting information from it. The image data from this process is known as remote sensing imagery and generally includes satellite and aerial imagery. Remote sensing data are widely used in GIS tasks such as spatial analysis, as well as computer vision (CV) fields including scene classification, image segmentation and object detection.
+
+Compared with aerial images, satellite images cover a wider area, so they are used more widely. Common satellite images may be taken by commercial satellites or come from open databases of agencies such as NASA and ESA.
+
+## 2 Characteristics of Remote Sensing Images
+
+Remote sensing technology has the characteristics of macroscopy, multi-band observation, periodicity and economy. Macroscopy means that the higher the remote sensing platform, the wider its perspective and the larger the ground area that can be observed synchronously. Multi-band observation means that the sensor can detect and record information in different bands such as ultraviolet, visible light, near infrared and microwave. Periodicity means that a remote sensing satellite revisits and re-images the same area at a fixed interval, allowing repeated observation of that area within a short time. Economy means that remote sensing can obtain information over a large surface area without spending excessive manpower and material resources.
+
+These characteristics of remote sensing technology give remote sensing images the following properties:
+
+1. Large scale. A remote sensing image can cover a vast surface area.
+2. Multispectral. Compared with natural images, remote sensing images often have a larger number of bands.
+3. Rich sources. Different sensors and different satellites provide a variety of data sources.
+
+## 3 Definition of Raster Image and Imaging Principle of Remote Sensing Image
+
+In order to introduce the imaging principle of remote sensing image, the concept of raster should be introduced first. Raster is a pixel-based data format that effectively represents continuous surfaces. The information in the raster is stored in a grid structure, and each information cell or pixel has the same size and shape, but different values. Digital photographs, orthophoto images and satellite images can all be stored in this format.
+
+Raster formats are ideal for analysis that concentrates on spatial and temporal changes because each data value has a grid-based accessible location. This allows us to access the same geographic location in two or more different grids and compare their values.
+
+When an earth observation satellite takes a picture, the sensor records the DN (Digital Number) value of electromagnetic waves of different wavelengths in each grid pixel. The irradiance and reflectance of ground objects can be inversely calculated from the DN value. The relationship between them is shown in the following formulas, where $gain$ and $bias$ refer to the gain and offset of the sensor, respectively; $L$ is the irradiance, also known as the radiant brightness value; $\rho$ is the reflectance of ground objects; $d_{s}$, $E_{0}$ and $\theta$ denote the Earth-Sun distance in astronomical units, the solar irradiance and the solar zenith angle, respectively.
+
+$$
+L = gain * DN + bias \\
+\rho = \pi Ld^{2}_{s}/(E_{0}\cos{\theta})
+$$
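+
+As a worked example with entirely made-up numbers (real $gain$ and $bias$ values come from the sensor metadata): taking $gain=0.1$, $bias=2$, $DN=500$, $d_{s}=1$, $E_{0}=1500$ and $\theta=30°$ gives
+
+$$
+L = 0.1 \times 500 + 2 = 52 \\
+\rho = \pi \times 52 \times 1^{2}/(1500 \times \cos{30°}) \approx 0.13
+$$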
+
+The electromagnetic spectrum is the arrangement of electromagnetic waves in order of wavelength (or, equivalently, frequency, wave number or energy). The human eye perceives only a small range of wavelengths in the electromagnetic spectrum, known as visible light, in the range of 0.38 to 0.76μm. That is because our vision evolved to be most sensitive where the sun emits the most light, and is broadly limited to the wavelengths that make up what we call red, green and blue. Satellite sensors, however, can sense a much wider range of the electromagnetic spectrum, which allows us to perceive far more of the spectrum with their help.
+
+![band](../images/band.jpg)
+
+The electromagnetic spectrum is so wide that it is impractical to use a single sensor to collect information at all wavelengths at once. In practice, different sensors give priority to collecting information from different parts of the spectrum. Each part of the spectrum captured by a sensor is called a band. Bands vary in width and can be combined into different types of composite images, each emphasizing different physical properties. Meanwhile, most remote sensing images are 16-bit images rather than traditional 8-bit images, so they can represent finer spectral information.
+
+## 4 Classification of Remote Sensing Images
+
+Remote sensing images cover wide areas, contain many bands and come from rich sources, so they can be classified in many ways. For example, by spatial resolution, remote sensing images can be divided into low-resolution, medium-resolution and high-resolution images. By the number of bands, they can be divided into multispectral images, hyperspectral images, panchromatic images and other types. This document is intended to provide a quick guide for developers who do not have a background in remote sensing, so only a few common types of remote sensing images are described.
+
+### 4.1 RGB Image
+
+RGB images are similar to common natural images in daily life. The features displayed in RGB images are also in line with human visual common sense (for example, trees are green, cement is gray, etc.), and the three channels represent red, green and blue respectively. The figure below shows an RGB remote sensing image:
+
+![rgb](../images/rgb.jpg)
+
+Since the data processing pipelines of most CV tasks are designed for natural images, RGB remote sensing datasets are widely used in the CV field.
+
+### 4.2 MSI/HSI Image
+
+MSI (Multispectral Image) and HSI (Hyperspectral Image) usually consist of several to hundreds of bands. The two are distinguished by their spectral resolution (*spectral resolution refers to the width of the wavelength range in the electromagnetic spectrum that the sensor can record; the wider the wavelength range, the lower the spectral resolution*). Usually, spectral resolution on the order of 1/10 of the wavelength is called multispectral. MSI has fewer but wider bands and higher spatial resolution, while HSI has more but narrower bands and higher spectral resolution.
+
+In practice, specific bands of MSI/HSI are often selected according to application requirements. For example, the transmittance of the mid-infrared band is 60%-70%; it captures both the reflected and the emitted spectra of ground objects and can therefore be used to detect high-temperature targets such as fires. The red-edge band (*the point between 0.67 and 0.76μm where the reflectance of green plants increases fastest, which is also the inflection point of the first-derivative spectrum in this region*) is sensitive to the growth status of green plants. It can effectively monitor vegetation growth and be used to study plant nutrients, health monitoring, vegetation identification, physiological and biochemical parameters and other information.
+
+The following takes an image of Beijing Daxing Airport taken by the Tiangong-1 hyperspectral imager as an example to briefly introduce the concepts of band combination, spectral curve and band selection commonly used in MSI/HSI processing. In the Tiangong-1 hyperspectral dataset, bands with low signal-to-noise ratio or low information entropy were removed based on band-wise evaluations of these two metrics, and some further bands were removed based on visual inspection of the imagery. A total of 54 visible/near-infrared bands, 52 shortwave-infrared bands and the panchromatic band were retained.
+
+**Band Combination**
+
+Band combination refers to selecting three bands of an MSI/HSI to fill the R, G and B channels, yielding a color image (*an image synthesized from the real R, G and B bands is called a true-color image; otherwise it is called a false-color image*). Different band combinations can highlight different features of ground objects. The following figure shows the visual effects of several different combinations:
+
+![Figure 3](../images/band_combination.jpg)
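+
+As a sketch of how such composites can be produced with general-purpose tooling (GDAL is assumed to be installed, and the band indices below are purely illustrative; the correct indices depend on the sensor), `gdal_translate` can copy three chosen bands into the R, G and B channels of an output image:
+
+```shell
+# Write input bands 4, 3 and 2 into the R, G and B channels of the output;
+# with a near-infrared band mapped to R, this yields a false-color composite.
+gdal_translate -b 4 -b 3 -b 2 multispectral.tif composite.tif
+```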
+
+**Spectral Curve Interpretation**
+
+Spectral information often reflects the characteristics of ground objects, and different bands reflect different characteristics. A spectral curve can be drawn by taking the wavelength or frequency of the electromagnetic wave as the horizontal axis and the reflectance as the vertical axis. Taking the spectral curve of vegetation as an example, as shown in the figure below, the reflectance of vegetation is greater than 40% around the 0.8μm band, significantly greater than the roughly 10% around the 0.6μm band, so more radiant energy is reflected back during imaging. As a result, vegetation appears brighter in the 0.8μm image.
+
+![band_mean](../images/band_mean.jpg)
+
+**Band Selection**
+
+MSI/HSI may contain a large number of bands. On the one hand, not all bands are suitable for the task at hand; on the other hand, too many bands may impose a heavy resource burden. In practical applications, a subset of MSI/HSI bands can be selected according to the requirements of the task, and methods such as PCA and wavelet transform can also be used to reduce the dimensionality of MSI/HSI, so as to reduce redundancy and save computing resources.
+
+### 4.3 SAR Image
+
+Synthetic Aperture Radar (SAR) refers to active side-looking radar systems. The imaging geometry of SAR belongs to the slant projection type, so SAR images and optical images differ greatly in imaging mechanism, geometric characteristics, radiometric characteristics and other aspects.
+
+The information in the different bands of an optical image comes from the reflected energy of electromagnetic waves of different wavelengths, while a SAR image records the echo information of different polarizations (*that is, the vibration directions of the transmitted and received electromagnetic waves*) as complex numbers in binary form. From the recorded complex data, the corresponding amplitude and phase information can be extracted. Humans cannot directly interpret phase information, but amplitude information can be perceived intuitively and used to obtain intensity images, as shown in the figure below:
+
+![sar](../images/sar.jpg)
+
+Due to the special imaging mechanism of SAR images, their resolution is relatively low and their signal-to-noise ratio is also low, so the amplitude information contained in a SAR image is far from the level of detail of an optical image. This is why SAR images are rarely used in the CV field. At present, SAR images are mainly used for ground subsidence detection and inversion and for 3D reconstruction based on phase information. It is worth mentioning that, thanks to its long wavelength and its ability to penetrate clouds and, to a certain extent, the surface, SAR has unique advantages in some application scenarios.
+
+### 4.4 RGBD Image
+
+The difference between an RGBD image and an RGB image is the extra D channel, namely depth. A depth image is similar to a grayscale image, except that each of its pixel values represents the actual distance between the sensor and the object. Generally, the RGB data and the depth data in an RGBD image are registered with each other. A depth image provides height information that the RGB image does not have, which can help distinguish ground objects with similar spectral characteristics in some downstream tasks.
+
+## 5 Preprocessing of Remote Sensing Images
+
+Compared with natural images, the preprocessing of remote sensing images is rather complicated. Specifically, it can be divided into the following steps:
+
+1. **Radiometric Calibration**: The DN values are converted into physical quantities such as radiance (radiant brightness) or reflectance.
+2. **Atmospheric Correction**: The radiation error caused by atmospheric influence is eliminated and the real surface reflectance of surface objects is retrieved. This step together with radiometric calibration is called **Radiometric Correction**.
+3. **Orthorectification**: Tilt correction and projection difference correction are performed simultaneously, and the image is resampled into an orthophoto.
+4. **Image Registration**: Match and overlay two or more images taken at different times, from different sensors (imaging equipment) or under different conditions (weather, illumination, camera position and angle, etc.).
+5. **Image Fusion**: The image data of the same object collected by multiple source channels are synthesized into high quality image.
+6. **Image Clipping**: The large remote sensing image is cut into small blocks to extract regions of interest.
+7. **Define Projection**: Define projection information (a geographic coordinate system) on the data.
+
+It should be noted that in practical application, the above steps are not all necessary, and some of them can be performed selectively according to needs.
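+
+As a command-line sketch of steps 6 and 7 (GDAL is assumed to be installed; file names, pixel offsets and the EPSG code are illustrative):
+
+```shell
+# Step 6: clip a 512 x 512 block whose top-left corner lies at pixel (1024, 1024)
+gdal_translate -srcwin 1024 1024 512 512 scene.tif block.tif
+
+# Step 7: define a projection on data that lacks one
+gdal_edit.py -a_srs EPSG:4326 block.tif
+```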
+
+## Reference Material
+
+- [Remote Sensing in Wikipedia](https://en.wikipedia.org/wiki/Remote_sensing).
+- [Introduction to Surveying and Mapping by Ning Jinsheng et al.](https://book.douban.com/subject/3116967/).
+- [Principles and Applications of Remote Sensing by Sun Jia-liu.](https://book.douban.com/subject/3826668/).
+- [Steps of Remote Sensing Image Preprocessing](https://blog.csdn.net/qq_35093027/article/details/119808941).

+ 3 - 3
docs/data/tools.md

@@ -73,8 +73,8 @@ python match.py --im1_path [path to temporal image 1] --im2_path [path to temporal image 2
 
 - `im1_path`: File path of the first temporal image. This image must contain geographic information and is used as the reference image during registration.
 - `im2_path`: File path of the second temporal image. The geographic information of this image will not be used. This image is registered to the first temporal image during registration.
-- `im1_bands`: Bands of the first temporal image used for registration, specified as three channels (representing R, G and B) or a single channel. Default is [1, 2, 3].
-- `im2_bands`: Bands of the second temporal image used for registration, specified as three channels (representing R, G and B) or a single channel. Default is [1, 2, 3].
+- `im1_bands`: Bands of the first temporal image used for registration, specified as three channels (representing R, G and B) or a single channel. Default is `[1, 2, 3]`.
+- `im2_bands`: Bands of the second temporal image used for registration, specified as three channels (representing R, G and B) or a single channel. Default is `[1, 2, 3]`.
 - `save_path`: Output file path of the second temporal image after registration.
 
 ### split
@@ -89,7 +89,7 @@ python split.py --image_path {input image path} [--mask_path {ground-truth label path
 
 - `image_path`: Path of the image to be split.
 - `mask_path`: Path of the ground-truth label image to be split together. Default is `None`.
-- `block_size`: Size of the split image blocks. Default is 512.
+- `block_size`: Size of the split image blocks. Default is `512`.
 - `save_dir`: Directory to save the split results. Default is `output`.
 
 ### coco_tools

+ 152 - 0
docs/data/tools_en.md

@@ -0,0 +1,152 @@
+# Remote Sensing Image Processing Toolkit
+
+PaddleRS provides a rich set of remote sensing image processing tools in the `tools` directory, including:
+
+- `coco2mask.py`: Convert COCO annotation files to .png files.
+- `mask2shape.py`: Convert .png format raster labels from model inference output to .shp vector format.
+- `geojson2mask.py`: Convert GeoJSON format labels to .tif raster format.
+- `match.py`: Perform spatial registration of two images.
+- `split.py`: Split large-scale image data into tiles.
+- `coco_tools/`: A collection of COCO tools for processing COCO format annotation files.
+- `prepare_dataset/`: A collection of scripts for preprocessing datasets.
+- `extract_ms_patches.py`: Extract multi-scale image blocks from entire remote sensing images.
+
+## Usage
+
+First, please make sure you have downloaded PaddleRS to your local machine. Navigate to the `tools` directory:
+
+```shell
+cd tools
+```
+
+### coco2mask
+
+The main function of `coco2mask.py` is to convert images and their corresponding COCO-format segmentation labels into images and labels in .png format, stored separately in the `img` and `gt` directories. Sample data in this format can be found in the [Chinese Typical City Building Instance Dataset](https://www.scidb.cn/detail?dataSetId=806674532768153600&dataSetType=journal). The masks are saved as single-channel pseudocolor images. The usage is as follows:
+
+```shell
+python coco2mask.py --raw_dir {input directory path} --save_dir {output directory path}
+```
+
+Among them:
+
+- `raw_dir`: Directory where the raw data is stored. Images are stored in the `images` subdirectory, and labels are saved in the `xxx.json` format.
+- `save_dir`: Directory where the output results are saved. Images are saved in the `img` subdirectory, and .png format labels are saved in the `gt` subdirectory.
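+
+For example, a typical invocation might look like this (directory names are illustrative):
+
+```shell
+python coco2mask.py --raw_dir ./building_dataset --save_dir ./building_dataset_out
+```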
+
+### mask2shape
+
+The main function of `mask2shape.py` is to convert the segmentation results in .png format into shapefile format (vector graphics). The usage is as follows:
+
+```shell
+python mask2shape.py --srcimg_path {path to the original image with geographic information} --mask_path {input segmentation label path} [--save_path {output vector graphics path}] [--ignore_index {index values to be ignored}]
+```
+
+Among them:
+
+- `srcimg_path`: Path to the original image with geographic information, which is used to provide the shapefile with projection and coordinate system information.
+- `mask_path`: Path to the .png format segmentation result obtained by the model inference.
+- `save_path`: Path to save the shapefile. The default value is `output`.
+- `ignore_index`: Index value to be ignored in the shapefile, such as the background class in segmentation tasks. The default value is `255`.
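+
+For instance (file names are illustrative):
+
+```shell
+python mask2shape.py --srcimg_path scene.tif --mask_path pred.png --save_path output --ignore_index 255
+```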
+
+### geojson2mask
+
+The main function of `geojson2mask.py` is to convert the GeoJSON-formatted labels to a .tif raster format. The usage is as follows:
+
+```shell
+python geojson2mask.py --srcimg_path {path to the original image with geographic information} --geojson_path {input segmentation label path} --save_path {output path}
+```
+
+Among them:
+
+- `srcimg_path`: Path to the original image file that contains the geospatial information.
+- `geojson_path`: Path to the GeoJSON format label file.
+- `save_path`: Path to save the converted raster file.
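+
+A typical call might be (file names are illustrative):
+
+```shell
+python geojson2mask.py --srcimg_path scene.tif --geojson_path labels.geojson --save_path labels.tif
+```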
+
+### match
+
+The main function of `match.py` is to perform spatial registration on two temporal remote sensing images. The usage is as follows:
+
+```shell
+python match.py --im1_path [path to temporal image 1] --im2_path [path to temporal image 2] --save_path [output path for registered temporal image 2] [--im1_bands 1 2 3] [--im2_bands 1 2 3]
+```
+
+Among them:
+
+- `im1_path`: File path of the first temporal image. This image must contain geospatial information and will be used as the reference image during the registration process.
+- `im2_path`: File path of the second temporal image. The geospatial information of this image will not be used. This image will be registered to the first temporal image during the registration process.
+- `im1_bands`: Bands of the first temporal image used for registration, specified as three channels (representing R, G, and B) or a single channel. Default is `[1, 2, 3]`.
+- `im2_bands`: Bands of the second temporal image used for registration, specified as three channels (representing R, G, and B) or a single channel. Default is `[1, 2, 3]`.
+- `save_path`: Output file path of the second temporal image after registration.
+
+### split
+
+The main function of `split.py` is to divide large remote sensing images into image blocks, which can be used as input for training. The usage is as follows:
+
+```shell
+python split.py --image_path {input image path} [--mask_path {Ground-truth label path}] [--block_size {image block size}] [--save_dir {output directory}]
+```
+
+Where:
+
+- `image_path`: Path of the image to be split.
+- `mask_path`: Path of the label image to be split together. Default is `None`.
+- `block_size`: Size of the split image blocks. Default is `512`.
+- `save_dir`: Directory to save the split results. Default is `output`.
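+
+For example (the file paths below are illustrative):
+
+```shell
+python split.py --image_path data/raw/scene.tif --mask_path data/raw/scene_gt.png --block_size 512 --save_dir output
+```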
+
+### coco_tools
+
+There are six tools included in the `coco_tools` directory, each with the following functions:
+
+- `json_InfoShow.py`:    Print basic information of each dictionary in a json file.
+- `json_ImgSta.py`:      Generate statistical table and graph of image information in a json file.
+- `json_AnnoSta.py`:     Generate statistical table and graph of annotation information in a json file.
+- `json_Img2Json.py`:    Generate a json file by counting images in a test set.
+- `json_Split.py`:       Split the content of a json file into train set and val set.
+- `json_Merge.py`:       Merge multiple json files into one.
+
+For detailed usage instructions, please refer to [coco_tools Usage Instructions](coco_tools.md).
+
+### prepare_dataset
+
+The `prepare_dataset` directory contains a series of data preprocessing scripts, mainly used to preprocess open-source remote sensing datasets that have been downloaded locally so that they meet the training, validation, and testing requirements of PaddleRS.
+
+Before executing the script, you can use the `--help` option to get help information. For example:
+
+```shell
+python prepare_dataset/prepare_levircd.py --help
+```
+
+The following command-line options are common to these scripts:
+
+- `--in_dataset_dir`: Path to the downloaded original dataset on your local machine. Example: `--in_dataset_dir downloads/LEVIR-CD`.
+- `--out_dataset_dir`: Path to the processed dataset. Example: `--out_dataset_dir data/levircd`.
+- `--crop_size`: For datasets that support image cropping, specify the size of the cropped image block. Example: `--crop_size 256`.
+- `--crop_stride`: For datasets that support image cropping, specify the step size of the sliding window during cropping. Example: `--crop_stride 256`.
+- `--seed`: Random seed. It can be used to fix the pseudo-random number sequence generated by the random number generator, so as to obtain a fixed dataset partitioning result. Example: `--seed 1919810`.
+- `--ratios`: For datasets that support random subset partitioning, specify the sample ratios of each subset that needs to be partitioned. Example: `--ratios 0.7 0.2 0.1`.
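+
+Putting these options together, an illustrative invocation (reusing the example values above) looks as follows:
+
+```shell
+python prepare_dataset/prepare_levircd.py \
+    --in_dataset_dir downloads/LEVIR-CD \
+    --out_dataset_dir data/levircd \
+    --crop_size 256 \
+    --crop_stride 256
+```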
+
+You can refer to [this document](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/intro/data_prep.md) to see which preprocessing scripts for datasets are provided by PaddleRS.
+
+### extract_ms_patches
+
+The main function of `extract_ms_patches.py` is to extract image patches containing objects of interest at different scales from the entire remote sensing image using a quadtree. The extracted image patches can be used as training samples for models. The usage is as follows:
+
+```shell
+python extract_ms_patches.py --im_paths {one or more input image paths} --mask_path {Ground-truth label path} [--save_dir {output directory}] [--min_patch_size {minimum patch size}] [--bg_class {background class ID}] [--target_class {target class ID}] [--max_level {maximum level of scale}] [--include_bg] [--nonzero_ratio {threshold of the ratio of nonzero pixels}] [--visualize]
+```
+
+Where:
+
+- `im_paths`: Path of the source image(s). Multiple paths can be specified.
+- `mask_path`: Path to the ground-truth label.
+- `save_dir`: Directory to save the extracted image blocks. Default is `output`.
+- `min_patch_size`: Minimum size of the extracted image blocks, i.e., the side length (in pixels) of the image region covered by a leaf node in the quadtree. Default is `256`.
+- `bg_class`: Class ID of the background class. Default is `0`.
+- `target_class`: Class ID of the target class. If it is `None`, it means that all classes except the background class are target classes. Default is `None`.
+- `max_level`: Maximum level of scale to retrieve. If it is `None`, it means that there is no limit to the level. Default is `None`.
+- `include_bg`: If specified, also save the image blocks that only contain the background class and do not contain the target class.
+- `nonzero_ratio`: Specify a threshold. For any source image, if the ratio of nonzero pixels in the image block is less than this threshold, the image block will be discarded. If it is `None`, no filtering will be performed. Default is `None`.
+- `visualize`: If specified, after the program is executed, the image `./vis_quadtree.png` will be generated, which visualizes the nodes in the quadtree. An example is shown in the following figure:
+
+<div align="center">
+<img src="https://user-images.githubusercontent.com/21275753/189264850-f94b3d7b-c631-47b1-9833-0800de2ccf54.png"  width = "400" />  
+</div>

+ 1 - 1
docs/dev/dev_guide.md

@@ -4,7 +4,7 @@
 
 - [新增遥感专用模型](#1-新增遥感专用模型)
 
-- [新增数据预处理数据增强函数或算子](#2-新增数据预处理数据增强函数或算子)
+- [新增数据预处理数据增强函数或算子](#2-新增数据预处理数据增强函数或算子)
 
 - [新增遥感影像处理工具](#3-新增遥感影像处理工具)
 

+ 100 - 0
docs/dev/dev_guide_en.md

@@ -0,0 +1,100 @@
+# PaddleRS Development Guide
+
+## 0 Catalog
+
+- [Add Remote Sensing Special Model](#1-add-remote-sensing-special-model)
+
+- [Add Data Preprocessing/Data Augmentation Function or Operator](#2-add-data-preprocessingdata-augmentation-function-or-operator)
+
+- [Add Remote Sensing Image Processing Tools](#3-add-remote-sensing-image-processing-tools)
+
+## 1 Add Remote Sensing Special Model
+
+### 1.1 Write Model Definitions
+
+First, find the subdirectory (package) corresponding to the task in `paddlers/rs_models`. The mapping between the task and the subdirectory is as follows:
+
+- Change Detection: `cd`;
+- Scene Classification: `clas`;
+- Object Detection: `det`;
+- Image Restoration: `res`;
+- Image Segmentation: `seg`.
+
+Create a new file in the subdirectory and name it `{model name lowercase}.py`. Write the complete model definition in the file.
+
+The new model must be a subclass of `paddle.nn.Layer`. For the tasks of image segmentation, object detection, scene classification, and image restoration, follow the relevant specifications formulated in the development kits [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), and [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), respectively. **For change detection, scene classification, and image segmentation tasks, the `num_classes` argument must be passed during model construction to specify the number of output classes. For image restoration tasks, the `rs_factor` argument must be passed during model construction to specify the super-resolution scaling ratio (for non-super-resolution models, this argument is set to `None`).** For the change detection task, the model definition should follow the same specification as the segmentation model, with the following differences:
+
+- The `forward()` method accepts three input parameters, namely `self`, `t1`, and `t2`, where `t1` and `t2` represent the input images of the first and second temporal phases, respectively.
+- For a multi-task change detection model (for example, a model that outputs both the change detection result and the building extraction results of the two phases), the class attribute `USE_MULTITASK_DECODER` needs to be set to `True`. In addition, the `OUT_TYPES` attribute should be set to the label types of the elements in the list output by the model's forward pass. Refer to the definition of the `ChangeStar` model.
+
+Note that if a common component exists in a subdirectory, it should be reused as much as possible. Examples include the contents of `paddlers/rs_models/cd/layers`, `paddlers/rs_models/cd/backbones`, and `paddlers/rs_models/seg/layers`.
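+
+The following is a minimal sketch of such a model definition for the change detection task. The class name `FooNet` and its internals are hypothetical and serve only to illustrate the required interface:
+
+```python
+import paddle
+import paddle.nn as nn
+
+
+class FooNet(nn.Layer):
+    def __init__(self, in_channels, num_classes):
+        super().__init__()
+        # A toy prediction head. A real model would define its full
+        # architecture here, reusing common components where possible.
+        self.conv = nn.Conv2D(2 * in_channels, num_classes, 3, padding=1)
+
+    def forward(self, t1, t2):
+        # t1 and t2 are the input images of the two temporal phases.
+        x = paddle.concat([t1, t2], axis=1)
+        # Return the logits in a list, following the convention of
+        # segmentation models.
+        return [self.conv(x)]
+```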
+
+### 1.2 Add docstring
+
+You have to add a docstring to the new model, with references and links to the original paper in it (you do not have to be strict about the reference format, but it should be as consistent as possible with the existing models for the task). For detailed annotation specifications, refer to the [Code Annotation Specification](docstring.md). An example is as follows:
+
+```python
+"""
+The ChangeStar implementation with a FarSeg encoder based on PaddlePaddle.
+
+The original article refers to
+    Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery"
+    (https://arxiv.org/abs/2108.07002).
+
+Note that this implementation differs from the original code in two aspects:
+1. The encoder of the FarSeg model is ResNet50.
+2. We use conv-bn-relu instead of conv-relu-bn.
+
+Args:
+    num_classes (int): Number of target classes.
+    mid_channels (int, optional): Number of channels required by the ChangeMixin module. Default: 256.
+    inner_channels (int, optional): Number of filters used in the convolutional layers in the ChangeMixin module.
+        Default: 16.
+    num_convs (int, optional): Number of convolutional layers used in the ChangeMixin module. Default: 4.
+    scale_factor (float, optional): Scaling factor of the output upsampling layer. Default: 4.0.
+"""
+```
+
+### 1.3 Write Trainer Definitions
+
+Please follow these steps:
+
+1. In the `__init__.py` file of `paddlers/rs_models/{task subdirectory}`, add a `from ... import` statement; you can refer to the existing examples in that file.
+
+2. Locate the trainer definition file corresponding to the task in the `paddlers/tasks` directory (for example, the change detection task corresponds to `paddlers/tasks/change_detector.py`).
+
+3. Append the new trainer definition to the end of the file. The trainer inherits from the related base class (such as `BaseChangeDetector`) and overrides the `__init__()` method as well as other methods as needed (a minimal sketch is given after this list). When writing the trainer's `__init__()` method, observe the following requirements:
+    - For tasks such as change detection, scene classification, object detection, and image segmentation, the first input parameter of the `__init__()` method is `num_classes`, which represents the number of model output classes. For the tasks of change detection, scene classification, and image segmentation, the second input parameter is `use_mixed_loss`, indicating whether to use the default definition of the mixed loss. The third input parameter is `losses`, which represents the loss functions used in training. For the image restoration task, the first parameter is `losses`, with the same meaning as above; the second parameter is `rs_factor`, which represents the super-resolution scaling ratio; and the third parameter is `min_max`, which represents the numeric range of the input and output images.
+    - All input parameters of `__init__()` must have default values, and **with these default values, the model should accept 3-channel RGB input.**
+    - In `__init__()` you need to update the `params` dictionary, whose key-value pairs will be used as input parameters during model construction.
+
+4. Add the class name of the new trainer to the global variable `__all__`.
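+
+Below is a minimal sketch of a trainer for the hypothetical `FooNet` model from Section 1.1. The exact base-class interface should be checked against `paddlers/tasks/change_detector.py`:
+
+```python
+from paddlers.tasks.change_detector import BaseChangeDetector
+
+
+class FooNetDetector(BaseChangeDetector):
+    def __init__(self,
+                 num_classes=2,
+                 use_mixed_loss=False,
+                 losses=None,
+                 in_channels=3,
+                 **params):
+        # Key-value pairs in `params` are passed as arguments when the
+        # underlying FooNet model is constructed.
+        params.update({'in_channels': in_channels})
+        super().__init__(
+            model_name='FooNet',
+            num_classes=num_classes,
+            use_mixed_loss=use_mixed_loss,
+            losses=losses,
+            **params)
+```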
+
+It should be noted that for the image restoration task, the forward and backward logic of the model is implemented in the trainer definition. For GANs and other models that need to use multiple networks, please follow the specifications below when writing the trainer:
+- Override the `build_net()` method to maintain all networks using a `GANAdapter`. The `GANAdapter` object takes two lists as input when it is constructed: the first list contains all generators, where the first element is the main generator; the second list contains all discriminators.
+- Override the `default_loss()` method to build the loss function. If more than one loss function is required in the training process, it is recommended to organize them in the form of a dictionary.
+- Override the `default_optimizer()` method to build one or more optimizers. When `build_net()` returns a value of type `GANAdapter`, `parameters` is a dictionary, in which `parameters['params_g']` is a list containing the state dicts of the generators in order, and `parameters['params_d']` is a list containing the state dicts of the discriminators in order. If you build more than one optimizer, wrap them with `OptimizerAdapter` on return.
+- Override the `run_gan()` method, which accepts four parameters `net`, `inputs`, `mode`, and `gan_mode`, to perform one of the subtasks in the training process, e.g., the forward computation of the generator or of the discriminator.
+- Override the `train_step()` method to implement the concrete logic of one training iteration. The usual approach is to call `run_gan()` multiple times, each time constructing different `inputs` to work in a different `gan_mode` as needed, extracting useful fields (e.g., losses) from the `outputs` dictionary returned each time, and aggregating them into the final result.
+
+See `ESRGAN` for specific examples of GAN trainers.
+
+## 2 Add Data Preprocessing/Data Augmentation Function or Operator
+
+### 2.1 Add Data Preprocessing/Data Augmentation Functions
+
+Define the new function in `paddlers/transforms/functions.py`. If the function needs to be exposed and made available to users, you must add a docstring to it.
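+
+A minimal sketch (the function `invert_image()` is hypothetical):
+
+```python
+import numpy as np
+
+
+def invert_image(image):
+    """
+    Invert the pixel values of an 8-bit image.
+
+    Args:
+        image (np.ndarray): Input image with values in [0, 255].
+
+    Returns:
+        np.ndarray: Inverted image.
+    """
+    return 255 - image
+```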
+
+### 2.2 Add Data Preprocessing/Data Augmentation Operators
+
+Define new operators in `paddlers/transforms/operators.py`. All operators inherit from `paddlers.transforms.Transform`. An operator's `apply()` method receives a dictionary `sample` as input, takes out the relevant objects stored in it, makes in-place modifications to the dictionary after processing, and finally returns the modified dictionary. Only in rare cases do you need to override the `apply()` method when defining an operator. In most cases, you just need to override the `apply_im()`, `apply_mask()`, `apply_bbox()`, and `apply_segm()` methods to process the image, the segmentation label, the bounding boxes, and the polygons, respectively.
+
+If the operator has a complicated implementation, it is recommended to define functions in `paddlers/transforms/functions.py` and call them in `apply*()` of operators.
+
+After writing the implementation of the operator, **you must write a docstring and add the class name to `__all__`**, as the sketch below illustrates.
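+
+A minimal sketch of a custom operator (the operator `RandomInvert` is hypothetical and assumes 8-bit imagery):
+
+```python
+import numpy as np
+
+from paddlers.transforms import Transform
+
+__all__ = ['RandomInvert']
+
+
+class RandomInvert(Transform):
+    """
+    Randomly invert the pixel values of the input image.
+
+    Args:
+        prob (float, optional): Probability of applying the inversion.
+            Default: 0.5.
+    """
+
+    def __init__(self, prob=0.5):
+        super().__init__()
+        self.prob = prob
+
+    def apply_im(self, image):
+        # Only the image is modified; masks, bounding boxes, and
+        # polygons are left untouched.
+        if np.random.rand() < self.prob:
+            image = 255 - image
+        return image
+```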
+
+## 3 Add Remote Sensing Image Processing Tools
+
+Remote sensing image processing tools are stored in the `tools/` directory. Each tool should be a relatively independent script that does not depend on the contents of the `paddlers/` directory, so that users can execute it without installing PaddleRS.
+
+When writing the script, use the Python standard library `argparse` to process the command-line arguments entered by the user and execute the specific logic in the `if __name__ == '__main__':` code block. If you have multiple tools that use the same function or class, define these common components in `tools/utils`.

+ 1 - 1
docs/dev/docstring.md

@@ -193,7 +193,7 @@
 - 不同模块间以**1**个空行分隔。
 - 注意首字母大写以及添加标点符号(尤其是**句号**),符合英语语法规则。
 - 在代码示例内容中可适当加空行以体现层次感。
-- 对于注释中出现的**输入参数名**、**输入参数的属性或方法**以及**文件路径**,使用反引号`\``包围。
+- 对于注释中出现的**输入参数名**、**输入参数的属性或方法**以及**文件路径**,使用反引号\`包围。
 - 每个模块的标题/子标题和具体内容之间需要有换行和缩进,`Examples:`标题与示例代码内容之间插入**1**个空行。
 - 单段描述跨行时需要使用悬挂式缩进。
 

+ 237 - 0
docs/dev/docstring_en.md

@@ -0,0 +1,237 @@
+# PaddleRS Specification for Code Annotation
+
+## 1 Specification for Docstrings
+
+A function docstring consists of five modules:
+
+- Function description;
+- Function parameters;
+- (Optional) Function return value;
+- (Optional) Exceptions that the function may throw;
+- (Optional) Usage example.
+
+A class docstring also consists of five modules:
+
+- Description of the class's functionality;
+- Parameters required to instantiate the class;
+- (Optional) The object obtained by instantiating the class;
+- (Optional) Exceptions that may be thrown when the class is instantiated;
+- (Optional) Usage example.
+
+The specifications for each module are described in detail below.
+
+### 1.1 Description of the Function/Class Functionality
+
+The goal is for the user to understand it quickly. This module can be broken down into three parts: function description + calculation formula + note.
+
+- Function Description: Describes the specific functions of the function or class. Since the user does not necessarily have the background knowledge, the necessary details need to be added.
+- (Optional) Calculation formula: If necessary, provide the calculation formula of the function. It is recommended to write formulas in LaTeX syntax.
+- (Optional) Note: If special instructions are required, they can be given in this section.
+
+Example:
+
+```python
+"""
+    Add two tensors element-wise. The equation is:
+
+        out = x + y
+
+    Note: This function supports broadcasting. If you want to know more about broadcasting, please
+        refer to :ref:`user_guide_broadcasting` .
+"""
+```
+
+### 1.2 Function Arguments/Class Construction Arguments
+
+Explain clearly the **type**, **meaning** and **default value** (if any) for each parameter.
+
+Note:
+
+- Optional parameters should be marked `optional`, for example: `name (str|None, optional)`;
+- If a parameter has multiple optional types, separate them with `|`;
+- A space should be left between the parameter name and the type;
+- A list or tuple containing objects of a certain type can be represented by `list[{type name}]` and `tuple[{type name}]`. For example, `list[int]` represents a list whose elements are of type int, and `tuple[int|float]` is equivalent to `tuple[int] | tuple[float]`;
+- When using the `list[{type name}]` and `tuple[{type name}]` notation, the default assumption is that the list or tuple is homogeneous (that is, all elements it contains have the same type). If the list or tuple is heterogeneous, this needs to be explained in the textual description;
+- If the types separated by `|` are simple types such as `int` or `Tensor`, there is no need to add spaces around the `|`. However, for multiple complex types such as `list[int]` and `tuple[float]`, a space should be added before and after the `|`;
+- For parameters that have a default value, please explain why that default value is used, not just what the parameter is and what its default value is.
+
+Example:
+
+```python
+"""
+    Args:
+        x (Tensor|np.ndarray): Input tensor or numpy array.
+        points (list[int] | tuple[int|float]): List or tuple of data points.
+        name (str|None, optional): Name for the operation. If None, the operation will not be named.
+            Default: None.
+"""
+```
+
+### 1.3 Return Value/Construct Object
+
+For a function return value, first describe the type of the return value (surrounded by `()`, with the same syntax as the parameter type description), and then explain the meaning of the return value. There is no need to specify the type of the object obtained by instantiating the class.
+
+Example 1:
+
+```python
+"""
+    Returns:
+        (tuple): When label is None, it returns (im, im_info); otherwise it returns (im, im_info, label).
+"""
+```
+
+Example 2:
+
+```python
+"""
+    Returns:
+        (N-D Tensor): A location into which the result is stored.
+"""
+```
+
+Example 3 (In the class definition):
+
+```python
+"""
+    Returns:
+        A callable object of Activation.
+"""
+```
+
+### 1.4 Exceptions That May Be Thrown
+
+You need to give the exception type and the conditions under which it is thrown.
+
+Example:
+
+```python
+"""
+    Raises:
+        ValueError: When memory() is called outside block().
+        TypeError: When init is set and is not a Variable.
+"""
+```
+
+### 1.5 Usage Example
+
+Provide as many examples as possible for various usage scenarios of the function or class, and give the expected results of executing the code in the comments.
+
+Requirement: Users can run the script by copying the sample code. Note that the necessary `import` statements need to be added.
+
+Single example:
+
+```python
+"""
+    Examples:
+
+            import paddle
+            import numpy as np
+
+            paddle.enable_imperative()
+            np_x = np.array([2, 3, 4]).astype('float64')
+            np_y = np.array([1, 5, 2]).astype('float64')
+            x = paddle.imperative.to_variable(np_x)
+            y = paddle.imperative.to_variable(np_y)
+
+            z = paddle.add(x, y)
+            np_z = z.numpy()
+            # [3., 8., 6. ]
+
+            z = paddle.add(x, y, alpha=10)
+            np_z = z.numpy()
+            # [12., 53., 24. ]
+"""
+```
+
+Multiple examples:
+
+```python
+"""
+    Examples 1:
+
+        from paddleseg.cvlibs.manager import ComponentManager
+
+        model_manager = ComponentManager()
+
+        class AlexNet: ...
+        class ResNet: ...
+
+        model_manager.add_component(AlexNet)
+        model_manager.add_component(ResNet)
+
+        # Alternatively, pass a sequence:
+        model_manager.add_component([AlexNet, ResNet])
+        print(model_manager.components_dict)
+        # {'AlexNet': <class '__main__.AlexNet'>, 'ResNet': <class '__main__.ResNet'>}
+
+    Examples 2:
+
+        # Use it as a Python decorator.
+        from paddleseg.cvlibs.manager import ComponentManager
+
+        model_manager = ComponentManager()
+
+        @model_manager.add_component
+        class AlexNet: ...
+
+        @model_manager.add_component
+        class ResNet: ...
+
+        print(model_manager.components_dict)
+        # {'AlexNet': <class '__main__.AlexNet'>, 'ResNet': <class '__main__.ResNet'>}
+"""
+```
+
+### 1.6 Grammar
+
+- Wording is accurate, using vocabulary and expressions common in deep learning.
+- The sentences are smooth and in line with English grammar.
+- The document should express the same concept consistently; for example, avoid alternating between *label* and *ground truth*.
+
+### 1.7 Other Points to Note
+
+- Different modules are separated by **1** blank line.
+- Pay attention to capitalization and punctuation rules in accordance with English grammar.
+- Blank lines can be placed appropriately in the content of the code sample for a sense of hierarchy.
+- For the **input parameter names**, **the properties or methods of input parameters**, and the **file paths** that appear in the comments, surround them with \`.
+- Line breaks and indentation are required between each module's title/subtitle and its concrete content, and **1** blank line should be inserted between the `Examples:` title and the sample code content.
+- Hanging indentation is required when a single paragraph of description spans multiple lines.
+
+## 2 Complete Docstring Example
+
+```python
+class Activation(nn.Layer):
+    """
+    The wrapper of activations.
+
+    Args:
+        act (str, optional): Activation name in lowercase. It must be one of {'elu', 'gelu',
+            'hardshrink', 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid',
+            'softmax', 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax',
+            'hsigmoid'}. Default: None, means identity transformation.
+
+    Returns:
+        A callable object of Activation.
+
+    Raises:
+        KeyError: When parameter `act` is not in the optional range.
+
+    Examples:
+
+        from paddleseg.models.common.activation import Activation
+
+        relu = Activation("relu")
+        print(relu)
+        # <class 'paddle.nn.layer.activation.ReLU'>
+
+        sigmoid = Activation("sigmoid")
+        print(sigmoid)
+        # <class 'paddle.nn.layer.activation.Sigmoid'>
+
+        not_exit_one = Activation("not_exit_one")
+        # KeyError: "not_exit_one does not exist in the current dict_keys(['elu', 'gelu', 'hardshrink',
+        # 'tanh', 'hardtanh', 'prelu', 'relu', 'relu6', 'selu', 'leakyrelu', 'sigmoid', 'softmax',
+        # 'softplus', 'softshrink', 'softsign', 'tanhshrink', 'logsigmoid', 'logsoftmax', 'hsigmoid'])"
+    """
+    ...
+```

+ 11 - 0
docs/intro/data_prep_en.md

@@ -0,0 +1,11 @@
+# Dataset Preprocessing Scripts
+
+## List of PaddleRS Supported Dataset Preprocessing Scripts
+
+| Task | Dataset Name | Dataset URL | Preprocessing Script |
+|-----|-----------|----------|----------|
+| Change Detection | LEVIR-CD | https://justchenhao.github.io/LEVIR/ | [prepare_levircd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_levircd.py) |
+| Change Detection | Season-varying | https://paperswithcode.com/dataset/cdd-dataset-season-varying | [prepare_svcd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_svcd.py) |
+| Scene Classification | UC Merced | http://weegee.vision.ucmerced.edu/datasets/landuse.html | [prepare_ucmerced.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_ucmerced.py) |
+| Object Detection | RSOD | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | [prepare_rsod](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_rsod.py) |
+| Image Segmentation | iSAID | https://captain-whu.github.io/iSAID/ | [prepare_isaid](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_isaid.py) |

+ 74 - 0
docs/intro/indices_en.md

@@ -0,0 +1,74 @@
+# Remote Sensing Indices
+
+The `paddlers.transforms.AppendIndex` operator can compute a remote sensing index and append it to the input image as the last band. When constructing an `AppendIndex` object, you need to pass in the name of the remote sensing index and a dictionary describing the band-index correspondence (the keys of the dictionary are band names, and the index numbers count from 1).
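+
+A brief usage sketch is shown below. The band order (red as band 3, NIR as band 4) is illustrative and depends on your imagery:
+
+```python
+from paddlers.transforms import AppendIndex
+
+# Compute NDVI and append it to the input image as the last band.
+append_ndvi = AppendIndex('NDVI', {'r': 3, 'n': 4})
+```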
+
+## List of PaddleRS Supported Remote Sensing Indices
+
+|Remote sensing index name|Full name|Purpose|Reference|
+|-----------|----|---|--------|
+| `'ARI'` | Anthocyanin Reflectance Index | vegetation | https://doi.org/10.1562/0031-8655(2001)074%3C0038:OPANEO%3E2.0.CO;2 |
+| `'ARI2'` | Anthocyanin Reflectance Index 2 | vegetation | https://doi.org/10.1562/0031-8655(2001)074%3C0038:OPANEO%3E2.0.CO;2 |
+| `'ARVI'` | Atmospherically Resistant Vegetation Index | vegetation | https://doi.org/10.1109/36.134076 |
+| `'AWEInsh'` | Automated Water Extraction Index | water | https://doi.org/10.1016/j.rse.2013.08.029 |
+| `'AWEIsh'` | Automated Water Extraction Index with Shadows Elimination | water | https://doi.org/10.1016/j.rse.2013.08.029 |
+| `'BAI'` | Burned Area Index | burned area | https://digital.csic.es/bitstream/10261/6426/1/Martin_Isabel_Serie_Geografica.pdf |
+| `'BI'` | Bare Soil Index | cities and towns | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.465.8749&rep=rep1&type=pdf |
+| `'BLFEI'` | Built-Up Land Features Extraction Index | cities and towns | https://doi.org/10.1080/10106049.2018.1497094 |
+| `'BNDVI'` | Blue Normalized Difference Vegetation Index | vegetation | https://doi.org/10.1016/S1672-6308(07)60027-4 |
+| `'BWDRVI'` | Blue Wide Dynamic Range Vegetation Index | vegetation | https://doi.org/10.2135/cropsci2007.01.0031 |
+| `'BaI'` | Bareness Index | cities and towns | https://doi.org/10.1109/IGARSS.2005.1525743 |
+| `'CIG'` | Chlorophyll Index Green | vegetation | https://doi.org/10.1078/0176-1617-00887 |
+| `'CSI'` | Char Soil Index | burned area | https://doi.org/10.1016/j.rse.2005.04.014 |
+| `'CSIT'` | Char Soil Index Thermal | burned area | https://doi.org/10.1080/01431160600954704 |
+| `'DBI'` | Dry Built-Up Index | cities and towns | https://doi.org/10.3390/land7030081 |
+| `'DBSI'` | Dry Bareness Index | cities and towns | https://doi.org/10.3390/land7030081 |
+| `'DVI'` | Difference Vegetation Index | vegetation | https://doi.org/10.1016/0034-4257(94)00114-3 |
+| `'EBBI'` | Enhanced Built-Up and Bareness Index | cities and towns | https://doi.org/10.3390/rs4102957 |
+| `'EVI'` | Enhanced Vegetation Index | vegetation | https://doi.org/10.1016/S0034-4257(96)00112-5 |
+| `'EVI2'` | Two-Band Enhanced Vegetation Index | vegetation | https://doi.org/10.1016/j.rse.2008.06.006 |
+| `'FCVI'` | Fluorescence Correction Vegetation Index | vegetation | https://doi.org/10.1016/j.rse.2020.111676 |
+| `'GARI'` | Green Atmospherically Resistant Vegetation Index | vegetation | https://doi.org/10.1016/S0034-4257(96)00072-7 |
+| `'GBNDVI'` | Green-Blue Normalized Difference Vegetation Index | vegetation | https://doi.org/10.1016/S1672-6308(07)60027-4 |
+| `'GLI'` | Green Leaf Index | vegetation | http://dx.doi.org/10.1080/10106040108542184 |
+| `'GRVI'` | Green Ratio Vegetation Index | vegetation | https://doi.org/10.2134/agronj2004.0314 |
+| `'IPVI'` | Infrared Percentage Vegetation Index | vegetation | https://doi.org/10.1016/0034-4257(90)90085-Z |
+| `'LSWI'` | Land Surface Water Index | water | https://doi.org/10.1016/j.rse.2003.11.008 |
+| `'MBI'` | Modified Bare Soil Index | cities and towns | https://doi.org/10.3390/land10030231 |
+| `'MGRVI'` | Modified Green Red Vegetation Index | vegetation | https://doi.org/10.1016/j.jag.2015.02.012 |
+| `'MNDVI'` | Modified Normalized Difference Vegetation Index | vegetation | https://doi.org/10.1080/014311697216810 |
+| `'MNDWI'` | Modified Normalized Difference Water Index | water | https://doi.org/10.1080/01431160600589179 |
+| `'MNLI'` | Modified Non-Linear Vegetation Index | vegetation | https://doi.org/10.1109/TGRS.2003.812910 |
+| `'MSI'` | Moisture Stress Index | vegetation | https://doi.org/10.1016/0034-4257(89)90046-1 |
+| `'NBLI'` | Normalized Difference Bare Land Index | cities and towns | https://doi.org/10.3390/rs9030249 |
+| `'NDSI'` | Normalized Difference Snow Index | snow | https://doi.org/10.1109/IGARSS.1994.399618 |
+| `'NDVI'` | Normalized Difference Vegetation Index | vegetation | https://ntrs.nasa.gov/citations/19740022614 |
+| `'NDWI'` | Normalized Difference Water Index | water | https://doi.org/10.1080/01431169608948714 |
+| `'NDYI'` | Normalized Difference Yellowness Index | vegetation | https://doi.org/10.1016/j.rse.2016.06.016 |
+| `'NIRv'` | Near-Infrared Reflectance of Vegetation | vegetation | https://doi.org/10.1126/sciadv.1602244 |
+| `'PSRI'` | Plant Senescing Reflectance Index | vegetation | https://doi.org/10.1034/j.1399-3054.1999.106119.x |
+| `'RI'` | Redness Index | vegetation | https://www.documentation.ird.fr/hor/fdi:34390 |
+| `'SAVI'` | Soil-Adjusted Vegetation Index | vegetation | https://doi.org/10.1016/0034-4257(88)90106-X |
+| `'SWI'` | Snow Water Index | snow | https://doi.org/10.3390/rs11232774 |
+| `'TDVI'` | Transformed Difference Vegetation Index | vegetation | https://doi.org/10.1109/IGARSS.2002.1026867 |
+| `'UI'` | Urban Index | cities and towns | https://www.isprs.org/proceedings/XXXI/congress/part7/321_XXXI-part7.pdf |
+| `'VIG'` | Vegetation Index Green | vegetation | https://doi.org/10.1016/S0034-4257(01)00289-9 |
+| `'WI1'` | Water Index 1 | water | https://doi.org/10.3390/rs11182186 |
+| `'WI2'` | Water Index 2 | water | https://doi.org/10.3390/rs11182186 |
+| `'WRI'` | Water Ratio Index | water | https://doi.org/10.1109/GEOINFORMATICS.2010.5567762 |
+
+## Band Name and Description
+
+|    Band name    |     Description    | Reference wavelength range (μm) |  \*Reference wavelength source  |
+|---------------|------------------|------------------|------------------|
+|     `'b'`     | Blue             | *0.450 - 0.515*  | *Landsat8*       |
+|     `'g'`     | Green            | *0.525 - 0.600*  | *Landsat8*       |
+|     `'r'`     | Red              | *0.630 - 0.680*  | *Landsat8*       |
+|    `'re1'`    | Red Edge 1       | *0.698 - 0.713*  | *Sentinel2*      |
+|    `'re2'`    | Red Edge 2       | *0.733 - 0.748*  | *Sentinel2*      |
+|    `'re3'`    | Red Edge 3       | *0.773 - 0.793*  | *Sentinel2*      |
+|     `'n'`     | NIR              | *0.845 - 0.885*  | *Landsat8*       |
+|    `'s1'`     | SWIR 1           | *1.560 - 1.660*  | *Landsat8*       |
+|    `'s2'`     | SWIR 2           | *2.100 - 2.300*  | *Landsat8*       |
+|    `'t'`      | Thermal Infrared | *10.40 - 12.50*  | *Landsat7*       |
+|    `'t1'`     | Thermal 1        | *10.60 - 11.19*  | *Landsat8*       |
+|    `'t2'`     | Thermal 2        | *11.50 - 12.51*  | *Landsat8*       |

+ 42 - 0
docs/intro/model_zoo_en.md

@@ -0,0 +1,42 @@
+# Model Zoo
+
+The base model library of PaddleRS comes from the PaddleCV development kits: [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/en/algorithm_introduction/ImageNet_models_en.md), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.4/docs/model_zoo_overview.md), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/README_en.md), and [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN/blob/develop/README.md). In addition, PaddleRS contains a series of models designed specifically for remote sensing, which can be used for remote sensing image segmentation, change detection, and so on.
+
+## List of PaddleRS Supported Models
+
+All models currently supported by PaddleRS are listed as follows (those marked \* are dedicated remote sensing models):
+
+| Task | Model | Multiband Support |
+|--------|---------|------|
+| Change Detection | \*BIT | Yes |
+| Change Detection | \*CDNet | Yes |
+| Change Detection | \*ChangeFormer | Yes |
+| Change Detection | \*ChangeStar | No |
+| Change Detection | \*DSAMNet | Yes |
+| Change Detection | \*DSIFN | No |
+| Change Detection | \*FC-EF | Yes |
+| Change Detection | \*FC-Siam-conc | Yes |
+| Change Detection | \*FC-Siam-diff | Yes |
+| Change Detection | \*FCCDN | Yes |
+| Change Detection | \*P2V-CD | Yes |
+| Change Detection | \*SNUNet | Yes |
+| Change Detection | \*STANet | Yes |
+| Scene Classification | CondenseNet V2 | Yes |
+| Scene Classification | HRNet | No |
+| Scene Classification | MobileNetV3 | No |
+| Scene Classification | ResNet50-vd | No |
+| Image Restoration | DRN | No |
+| Image Restoration | ESRGAN | Yes |
+| Image Restoration | LESRCNN | No |
+| Object Detection | Faster R-CNN | No |
+| Object Detection | PP-YOLO | No |
+| Object Detection | PP-YOLO Tiny | No |
+| Object Detection | PP-YOLOv2 | No |
+| Object Detection | YOLOv3 | No |
+| Image Segmentation | BiSeNet V2 | Yes |
+| Image Segmentation | DeepLab V3+ | Yes |
+| Image Segmentation | \*FactSeg | Yes |
+| Image Segmentation | \*FarSeg | Yes |
+| Image Segmentation | Fast-SCNN | Yes |
+| Image Segmentation | HRNet | Yes |
+| Image Segmentation | UNet | Yes |

+ 35 - 0
docs/intro/transforms_en.md

@@ -0,0 +1,35 @@
+# Data Transformation Operators
+
+## List of PaddleRS Supported Data Transformation Operators
+
+PaddleRS has organically integrated the data preprocessing/data augmentation (collectively called data transformation) strategies required by different remote sensing tasks and designed a unified set of operators. Considering the multi-band characteristics of remote sensing images, most data transformation operators of PaddleRS can process input with an arbitrary number of bands. All data transformation operators currently provided by PaddleRS are listed as follows:
+
+| Operator Name | Purpose | Task | ... |
+| -------------------- | ------------------------------------------------- | -------- | ---- |
+| AppendIndex          | Calculate the remote sensing index and add it to the input image. | All tasks  | ... |  
+| CenterCrop           | Perform center cropping on the input image. | All tasks | ... |
+| Dehaze               | Dehaze the input image. | All tasks | ... |
+| MatchRadiance        | Perform relative radiometric correction on the input images of two different temporal phases. | Change Detection | ... |
+| MixupImage           | Mix the two images (and their corresponding object detection annotations) together as a new sample. | Object Detection | ... |
+| Normalize            | Apply normalization to the input image. | All tasks | ... |
+| Pad                  | Fill the input image to the specified size. | All tasks | ... |
+| RandomBlur           | Apply random blurring to the input. | All tasks | ... |
+| RandomCrop           | Perform random cropping on the input image. | All tasks | ... |
+| RandomDistort        | Apply random color transformation to the input. | All tasks | ... |
+| RandomExpand         | Extend the input image based on random offsets. | All tasks | ... |
+| RandomHorizontalFlip | Randomly flip the input image horizontally. | All tasks | ... |
+| RandomResize         | Randomly adjust the size of the input image. | All tasks | ... |
+| RandomResizeByShort  | Randomly adjust the size of the input image while maintaining the aspect ratio unchanged (scaling factor is calculated based on the shorter edge). | All tasks | ... |
+| RandomScaleAspect    | Crop the input image and re-scale it to its original size. | All tasks | ... |
+| RandomSwap           | Randomly exchange the input images of the two phases. | Change Detection | ... |
+| RandomVerticalFlip   | Flip the input image vertically at random. | All tasks | ... |
+| ReduceDim            | Reduce the number of bands in the input image. | All tasks | ... |
+| Resize               | Resize the input image. | All tasks | ... |
+| ResizeByLong         | Resize the input image, keeping the aspect ratio unchanged (calculate the scaling factor based on the long side). | All tasks | ... |
+| ResizeByShort        | Resize the input image, keeping the aspect ratio unchanged (calculate the scaling factor according to the short edge). | All tasks | ... |
+| SelectBand           | Select the band of the input image. | All tasks | ... |
+| ...                  | ... | ... | ... |
+
+## Combinatorial Operator
+
+In the actual model training process, it is often necessary to combine a variety of data preprocessing and data augmentation strategies. PaddleRS provides `paddlers.transforms.Compose` to easily combine multiple data transformation operators so that they can be executed serially. For the specific usage of `paddlers.transforms.Compose`, please see the [API description](https://github.com/PaddlePaddle/PaddleRS/blob/develop/docs/apis/data.md).
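+
+A brief sketch (the operators and their arguments below are illustrative):
+
+```python
+from paddlers.transforms import Compose, Normalize, Resize
+
+# Operators passed to Compose are applied serially to each sample.
+transforms = Compose([
+    Resize(target_size=512),
+    Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+])
+```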

+ 2 - 2
tutorials/train/README.md

@@ -98,10 +98,10 @@ docker build -t <imageName> .  # 默认为2.4.1的cpu版本
 # 其余Tag可以参考:https://hub.docker.com/r/paddlepaddle/paddle/tags
 ```
 
-2. 启动镜像
+2. 启动容器
 
 ```shell
-docker iamges  # 查看镜像的ID
+docker images  # 查看镜像的ID
 docker run -it <imageID>
 ```
 

+ 133 - 0
tutorials/train/README_en.md

@@ -0,0 +1,133 @@
+# Tutorial - Training Model
+
+This directory curates sample code for training models with PaddleRS. Each script automatically downloads the sample data and uses a GPU to train the model.
+
+|Sample code path | Task | Model |
+|------|--------|---------|
+|change_detection/bit.py | Change Detection | BIT |
+|change_detection/cdnet.py | Change Detection | CDNet |
+|change_detection/changeformer.py | Change Detection | ChangeFormer |
+|change_detection/dsamnet.py | Change Detection | DSAMNet |
+|change_detection/dsifn.py | Change Detection | DSIFN |
+|change_detection/fc_ef.py | Change Detection | FC-EF |
+|change_detection/fc_siam_conc.py | Change Detection | FC-Siam-conc |
+|change_detection/fc_siam_diff.py | Change Detection | FC-Siam-diff |
+|change_detection/fccdn.py | Change Detection | FCCDN |
+|change_detection/p2v.py | Change Detection | P2V-CD |
+|change_detection/snunet.py | Change Detection | SNUNet |
+|change_detection/stanet.py | Change Detection | STANet |
+|classification/condensenetv2.py | Scene Classification | CondenseNet V2 |
+|classification/hrnet.py | Scene Classification | HRNet |
+|classification/mobilenetv3.py | Scene Classification | MobileNetV3 |
+|classification/resnet50_vd.py | Scene Classification | ResNet50-vd |
+|image_restoration/drn.py | Image Restoration | DRN |
+|image_restoration/esrgan.py | Image Restoration | ESRGAN |
+|image_restoration/lesrcnn.py | Image Restoration | LESRCNN |
+|object_detection/faster_rcnn.py | Object Detection | Faster R-CNN |
+|object_detection/ppyolo.py | Object Detection | PP-YOLO |
+|object_detection/ppyolo_tiny.py | Object Detection | PP-YOLO Tiny |
+|object_detection/ppyolov2.py | Object Detection | PP-YOLOv2 |
+|object_detection/yolov3.py | Object Detection | YOLOv3 |
+|semantic_segmentation/bisenetv2.py | Image Segmentation | BiSeNet V2 |
+|semantic_segmentation/deeplabv3p.py | Image Segmentation | DeepLab V3+ |
+|semantic_segmentation/factseg.py | Image Segmentation | FactSeg |
+|semantic_segmentation/farseg.py | Image Segmentation | FarSeg |
+|semantic_segmentation/fast_scnn.py | Image Segmentation | Fast-SCNN |
+|semantic_segmentation/hrnet.py | Image Segmentation | HRNet |
+|semantic_segmentation/unet.py | Image Segmentation | UNet |
+
+## Environment Preparation
+
++ [PaddlePaddle installation](https://www.paddlepaddle.org.cn/install/quick)
+  - Version requirements: PaddlePaddle>=2.2.0
+
++ PaddleRS installation
+
+The PaddleRS code will be updated as the development progresses. You can install the develop branch to use the latest features as follows:
+
+```shell
+git clone https://github.com/PaddlePaddle/PaddleRS
+cd PaddleRS
+git checkout develop
+pip install -r requirements.txt
+python setup.py install
+```
+
+If dependency downloads are slow or time out when running `python setup.py install`, you can create a `setup.cfg` file with the following content in the same directory as `setup.py`, so that downloads are accelerated through the Tsinghua PyPI mirror:
+
+```
+[easy_install]
+index-url=https://pypi.tuna.tsinghua.edu.cn/simple
+```
+
++ (Optional) GDAL installation
+
+PaddleRS supports reading of various types of satellite data. To use the full data reading functionality of PaddleRS, you need to install GDAL as follows:
+
+  - Linux / MacOS
+
+conda is recommended for installation:
+
+```shell
+conda install gdal
+```
+
+  - Windows
+
+Windows users can download GDAL wheels from [this site](https://www.lfd.uci.edu/~gohlke/pythonlibs/#gdal). Please choose the wheel according to the Python version and the platform. Take *GDAL‑3.3.3‑cp39‑cp39‑win_amd64.whl* as an example, run the following command to install:
+
+```shell
+pip install GDAL‑3.3.3‑cp39‑cp39‑win_amd64.whl
+```
+
+### *Docker Installation
+
+1. Pull the prebuilt image from Docker Hub:
+
+```shell
+docker pull paddlepaddle/paddlers:1.0.0
+```
+
+- (Optional) Build the image from scratch instead: select the PaddlePaddle base image by setting `PPTAG`. Images can be built for CPU-only or GPU environments.
+
+```shell
+git clone https://github.com/PaddlePaddle/PaddleRS
+cd PaddleRS
+docker build -t <imageName> .  # Defaults to the 2.4.1 CPU version
+# docker build -t <imageName> . --build-arg PPTAG=2.4.1-gpu-cuda10.2-cudnn7.6-trt7.0  # One of the 2.4.1 GPU versions
+# For other tags, refer to: https://hub.docker.com/r/paddlepaddle/paddle/tags
+```
+
+2. Start a container
+
+```shell
+docker images  # View the ID of an image
+docker run -it <imageID>
+```
+
+## Start Training
+
++ After PaddleRS is installed, run the following commands to launch single-GPU training. The script automatically downloads the training data. Take the DeepLab V3+ image segmentation model as an example:
+
+```shell
+# Specify the GPU device to use
+export CUDA_VISIBLE_DEVICES=0
+python tutorials/train/semantic_segmentation/deeplabv3p.py
+```
+
++ If multiple GPUs are required for training, for example two GPUs, run the following command:
+
+```shell
+python -m paddle.distributed.launch --gpus 0,1 tutorials/train/semantic_segmentation/deeplabv3p.py
+```
+
+## Visualizing Training Metrics with VisualDL
+
+Set the `use_vdl` argument passed to the `train()` method to `True`, and the training log will be automatically saved in VisualDL format to a subdirectory named `vdl_log` under the directory specified by `save_dir` (a user-specified path) during model training. You can then run the following command to start the VisualDL service and view the training metrics in your browser. Again, take DeepLab V3+ as an example:
+
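+A brief sketch of the corresponding training call (the trainer and dataset objects, as well as the argument values, are assumed to follow the tutorial scripts):
+
+```python
+# Save VisualDL logs to output/deeplabv3p/vdl_log during training.
+model.train(
+    num_epochs=10,
+    train_dataset=train_dataset,
+    eval_dataset=eval_dataset,
+    save_dir='output/deeplabv3p',
+    use_vdl=True)
+```
+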
+```shell
+# Serve on port 8001
+visualdl --logdir output/deeplabv3p/vdl_log --port 8001
+```
+
+Once the service is started, open http://localhost:8001 (or http://0.0.0.0:8001) in your browser to access the VisualDL page.