|
@@ -0,0 +1,478 @@
|
|
|
+# PaddleRS Model Construction Parameters
|
|
|
+
|
|
|
+This document describes the construction parameters of each PaddleRS model trainer, including each parameter's name, type, description, and default value.
|
|
|
+
|
|
|
+## `BIT`
|
|
|
+
|
|
|
+The BIT implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see H. Chen, et al., "Remote Sensing Image Change Detection With Transformers" (https://arxiv.org/abs/2103.00208).
|
|
|
+
|
|
|
+This implementation adopts pretrained encoders, as opposed to the original work where weights are randomly initialized.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------|
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool, optional)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list, optional)` | List of loss functions | `None` |
|
|
|
+| `att_type (str, optional)` | Spatial attention type, optional values are `'CBAM'` and `'BAM'` | `'CBAM'` |
|
|
|
+| `ds_factor (int, optional)` | Downsampling factor | `1` |
|
|
|
+| `backbone (str, optional)` | ResNet architecture to use as backbone. Currently only `'resnet18'` and `'resnet34'` are supported | `'resnet18'` |
|
|
|
+| `n_stages (int, optional)` | Number of ResNet stages used in the backbone, should be a value in `{3, 4, 5}` | `4` |
|
|
|
+| `use_tokenizer (bool, optional)` | Whether to use tokenizer | `True` |
|
|
|
+| `token_len (int, optional)` | Length of input token | `4` |
|
|
|
+| `pool_mode (str, optional)`       | Pooling strategy used to obtain input tokens when `use_tokenizer` is set to `False`. `'max'` means global max pooling, `'avg'` means global average pooling | `'max'`     |
|
|
|
+| `pool_size (int, optional)`       | Height and width of the pooled feature map when `use_tokenizer` is set to `False`                                                                | `2`         |
|
|
|
+| `enc_with_pos (bool, optional)` | Whether to add learned positional embeddings to the encoder's input feature sequence | `True` |
|
|
|
+| `enc_depth (int, optional)` | Number of attention blocks used in encoder | `1` |
|
|
|
+| `enc_head_dim (int, optional)` | Embedding dimension of each encoder head | `64` |
|
|
|
+| `dec_depth (int, optional)` | Number of attention blocks used in decoder | `8` |
|
|
|
+| `dec_head_dim (int, optional)` | Embedding dimension for each decoder head | `8` |
|
|
|
+
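As a rough illustration of how `use_tokenizer`, `token_len`, and `pool_size` interact, the sketch below computes how many tokens one image contributes to the encoder. It assumes that, when the tokenizer is disabled, the feature map is pooled to `pool_size` x `pool_size` and each pooled cell becomes one token. The helper `tokens_per_image` is hypothetical and is not part of PaddleRS.

```python
def tokens_per_image(use_tokenizer=True, token_len=4, pool_size=2):
    """Return the number of tokens derived from one image (illustrative only)."""
    if use_tokenizer:
        # The tokenizer produces a fixed number of tokens, set by token_len.
        return token_len
    # Assumption: pooling to pool_size x pool_size, one token per pooled cell.
    return pool_size * pool_size


print(tokens_per_image())                     # tokenizer enabled
print(tokens_per_image(use_tokenizer=False))  # pooled feature map instead
```

With the defaults listed in the table, both paths happen to yield four tokens, so `pool_size` only changes the sequence length once it is moved away from `2`.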
|
|
|
+## `CDNet`
|
|
|
+
|
|
|
+The CDNet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Pablo F. Alcantarilla, et al., "Street-View Change Detection with Deconvolutional Networks" (https://link.springer.com/article/10.1007/s10514-018-9734-5).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|----------------------------------------------------------------------------------------------------| ---------- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `6` |
|
|
|
+
|
|
|
+## `ChangeFormer`
|
|
|
+
|
|
|
+The ChangeFormer implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Wele Gedara Chaminda Bandara and Vishal M. Patel, "A Transformer-Based Siamese Network for Change Detection" (https://arxiv.org/pdf/2201.01293.pdf).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|--------------------------------|-----------------------------------------------------------------------------|--------------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `decoder_softmax (bool)` | Whether to use softmax as the last layer activation function of the decoder | `False` |
|
|
|
+| `embed_dim (int)` | Hidden layer dimension of the Transformer encoder | `256` |
|
|
|
+
|
|
|
+## `ChangeStar_FarSeg`
|
|
|
+
|
|
|
+The ChangeStar implementation with a FarSeg encoder based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery" (https://arxiv.org/abs/2108.07002).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|---------------------------------------------------------------------|-------------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `mid_channels (int)` | Number of channels in the middle layer of UNet | `256` |
|
|
|
+| `inner_channels (int)` | Number of channels inside the attention module | `16` |
|
|
|
+| `num_convs (int)` | Number of convolutional layers in UNet encoder and decoder | `4` |
|
|
|
+| `scale_factor (float)` | Upsampling factor to scale the size of the output segmentation mask | `4.0` |
|
|
|
+
|
|
|
+## `DSAMNet`
|
|
|
+
|
|
|
+The DSAMNet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Q. Shi, et al., "A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection" (https://ieeexplore.ieee.org/document/9467555).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-----------------------|---------------------------------------------------------------------------------------------------------------------------------|-------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `ca_ratio (int)` | Channel compression ratio in channel attention module | `8` |
|
|
|
+| `sa_kernel (int)` | Kernel size in the spatial attention module | `7` |
|
|
|
+
|
|
|
+## `DSIFN`
|
|
|
+
|
|
|
+The DSIFN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see C. Zhang, et al., "A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images" (https://www.sciencedirect.com/science/article/pii/S0924271620301532).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-----------------------|----------------------------------------------------------------------------------------------------|-------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `use_dropout (bool)` | Whether to use dropout | `False`|
|
|
|
+
|
|
|
+## `FC-EF`
|
|
|
+
|
|
|
+The FC-EF implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Rodrigo Caye Daudt, et al., "Fully convolutional siamese networks for change detection" (https://arxiv.org/abs/1810.08462).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|---------------------------------------|-------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `6` |
|
|
|
+| `use_dropout (bool)` | Whether to use dropout | `False`|
|
|
|
+
|
|
|
+## `FC-Siam-conc`
|
|
|
+
|
|
|
+The FC-Siam-conc implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Rodrigo Caye Daudt, et al., "Fully convolutional siamese networks for change detection" (https://arxiv.org/abs/1810.08462).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `use_dropout (bool)` | Whether to use dropout | `False`|
|
|
|
+
|
|
|
+## `FC-Siam-diff`
|
|
|
+
|
|
|
+The FC-Siam-diff implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Rodrigo Caye Daudt, et al., "Fully convolutional siamese networks for change detection" (https://arxiv.org/abs/1810.08462).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|--------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function |`False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)`     | Number of channels of the input image                                      | `3` |
|
|
|
+| `use_dropout (bool)` | Whether to use dropout | `False` |
|
|
|
+
|
|
|
+## `FCCDN`
|
|
|
+
|
|
|
+The FCCDN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Pan Chen, et al., "FCCDN: Feature Constraint Network for VHR Image Change Detection" (https://arxiv.org/pdf/2105.10860.pdf).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|--------------------------|---------------------------------------|-------|
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+## `P2V-CD`
|
|
|
+
|
|
|
+The P2V-CD implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see M. Lin, et al., "Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images" (https://ieeexplore.ieee.org/document/9975266).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|---------------------------------------|-------|
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False`|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `video_len (int)` | Number of input video frames | `8` |
|
|
|
+
|
|
|
+## `SNUNet`
|
|
|
+
|
|
|
+The SNUNet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see S. Fang, et al., "SNUNet-CD: A Densely Connected Siamese Network for Change Detection of VHR Images" (https://ieeexplore.ieee.org/document/9355573).
|
|
|
+
|
|
|
+| Parameter Name         | Description                                     | Default Value |
|
|
|
+|------------------------|-------------------------------------------------|------|
|
|
|
+| `in_channels (int)` | Number of channels of the input image | |
|
|
|
+| `num_classes (int)`    | Number of target classes                        |      |
|
|
|
+| `width (int, optional)` | Output channels of the first convolutional layer | `32` |
|
|
|
+
|
|
|
+## `STANet`
|
|
|
+
|
|
|
+The STANet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see H. Chen and Z. Shi, "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection" (https://www.mdpi.com/2072-4292/12/10/1662).
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `width (int)` | Number of channels in the neural network | `32` |
|
|
|
+
|
|
|
+## `CondenseNetV2`
|
|
|
+
|
|
|
+The CondenseNetV2 implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|---------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `arch (str)` | Architecture of the model, can be `'A'`, `'B'` or `'C'` | `'A'` |
|
|
|
+
|
|
|
+## `HRNet`
|
|
|
+
|
|
|
+The HRNet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+
|
|
|
+## `MobileNetV3`
|
|
|
+
|
|
|
+The MobileNetV3 implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------| --- | --- |
|
|
|
+| `num_classes (int)` | Number of target classes| `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+
|
|
|
+## `ResNet50-vd`
|
|
|
+
|
|
|
+The ResNet50-vd implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------| --- | --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+## `DRN`
|
|
|
+
|
|
|
+The DRN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `sr_factor (int)`                                                 | Scaling factor for super-resolution; the size of the original image is multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4`   |
|
|
|
+| `min_max (None \| tuple[float, float])` | Minimum and maximum image pixel values | `None` |
|
|
|
+| `scales (tuple[int])` | Scaling factor | `(2, 4)` |
|
|
|
+| `n_blocks (int)` | Number of residual blocks | `30` |
|
|
|
+| `n_feats (int)` | Number of features in the residual block | `16` |
|
|
|
+| `n_colors (int)` | Number of image channels | `3` |
|
|
|
+| `rgb_range (float)` | Range of image pixel values | `1.0` |
|
|
|
+| `negval (float)` | Negative value in nonlinear mapping | `0.2` |
|
|
|
+| `lq_loss_weight (float)`                                          | Weight of the low-quality image loss, which controls how much the reconstruction loss contributes to the overall loss when restoring the low-resolution input image to a high-resolution output image | `0.1` |
|
|
|
+| `dual_loss_weight (float)`                                        | Weight of the dual loss                                                                                                                                                                                             | `0.1` |
|
|
|
+
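The `sr_factor` description above amounts to a simple size calculation, sketched below. The helper name `sr_output_size` is hypothetical and is not part of PaddleRS; it only restates the `H` x `W` to `sr_factor * H` x `sr_factor * W` relation from the table.

```python
def sr_output_size(height, width, sr_factor=4):
    """Return the (height, width) of the super-resolved output image."""
    if sr_factor < 1:
        raise ValueError("sr_factor must be a positive integer")
    # Both spatial dimensions are multiplied by the same factor.
    return sr_factor * height, sr_factor * width


print(sr_output_size(128, 96))  # (512, 384)
```

The same relation applies to the `sr_factor` parameter of the other super-resolution trainers in this document.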
|
|
|
+
|
|
|
+## `ESRGAN`
|
|
|
+
|
|
|
+The ESRGAN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `sr_factor (int)`    | Scaling factor for super-resolution; the size of the original image is multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` |
|
|
|
+| `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` |
|
|
|
+| `use_gan (bool)`     | Whether to use a GAN (generative adversarial network) during training                                                                                                                                             | `True` |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `out_channels (int)` | Number of channels of the output image | `3` |
|
|
|
+| `nf (int)` | Number of filters in the first convolutional layer of the model | `64` |
|
|
|
+| `nb (int)` | Number of residual blocks in the model | `23` |
|
|
|
+
|
|
|
+## `LESRCNN`
|
|
|
+
|
|
|
+The LESRCNN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `sr_factor (int)`    | Scaling factor for super-resolution; the size of the original image is multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W`. | `4` |
|
|
|
+| `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used. | `None` |
|
|
|
+| `multi_scale (bool)` | Whether to train on multiple scales.                                                                                                                                                                            | `False` |
|
|
|
+| `group (int)`        | Number of groups for convolution operations. Standard convolution if set to `1`, depthwise convolution (DWConv) if set to the number of input channels.                                                        | `1` |
|
|
|
+
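To make the effect of `group` concrete, the sketch below counts the weights of a single convolution layer for a standard and a depthwise setting. The channel count (64) and kernel size (3) are illustrative assumptions, not the actual LESRCNN layer sizes, and `conv_weight_count` is a hypothetical helper. It uses the usual grouped-convolution formula and ignores bias terms.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a 2D grouped convolution (bias excluded)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees in_channels // groups input channels.
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


standard = conv_weight_count(64, 64, 3, groups=1)    # standard convolution
depthwise = conv_weight_count(64, 64, 3, groups=64)  # depthwise convolution
print(standard, depthwise)  # 36864 576
```

Under these assumptions the depthwise layer has 64 times fewer weights, which is the usual motivation for raising the group count.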
|
|
|
+## `Faster R-CNN`
|
|
|
+
|
|
|
+The Faster R-CNN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------------|------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `80` |
|
|
|
+| `backbone (str)` | Backbone network model to use | `'ResNet50'` |
|
|
|
+| `with_fpn (bool)` | Boolean indicating whether to use Feature Pyramid Network (FPN) | `True` |
|
|
|
+| `with_dcn (bool)` | Boolean indicating whether to use Deformable Convolutional Networks (DCN) | `False` |
|
|
|
+| `aspect_ratios (list)` | List of aspect ratios of candidate boxes | `[0.5, 1.0, 2.0]` |
|
|
|
+| `anchor_sizes (list)`         | List of sizes of candidate boxes expressed as base sizes on each feature map                               | `[[32], [64], [128], [256], [512]]` |
|
|
|
+| `keep_top_k (int)`            | Number of predicted boxes to keep after NMS operation                                                      | `100` |
|
|
|
+| `nms_threshold (float)` | Non-maximum suppression (NMS) threshold to use | `0.5` |
|
|
|
+| `score_threshold (float)` | Score threshold for filtering predicted boxes | `0.05` |
|
|
|
+| `fpn_num_channels (int)` | Number of channels for each pyramid layer in the FPN network | `256` |
|
|
|
+| `rpn_batch_size_per_im (int)` | Total number of positive and negative samples per image in the RPN network                                 | `256` |
|
|
|
+| `rpn_fg_fraction (float)` | Fraction of foreground samples in RPN network | `0.5` |
|
|
|
+| `test_pre_nms_top_n (int)` | Number of predicted boxes to keep before NMS operation when testing. If not specified, `keep_top_k` is used. | `None` |
|
|
|
+| `test_post_nms_top_n (int)` | Number of predicted boxes to keep after NMS operation at test time | `1000` |
|
|
|
+
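The following sketch shows one common way that `anchor_sizes` and `aspect_ratios` combine into anchor shapes, one group per feature-map level. It assumes that an aspect ratio means height divided by width and that each anchor keeps an area of `size * size`; the exact convention used by the Faster R-CNN implementation is not stated in this document, and `anchor_shapes` is a hypothetical helper.

```python
import math


def anchor_shapes(anchor_sizes, aspect_ratios):
    """Return, per feature-map level, a list of (width, height) anchor shapes."""
    shapes = []
    for level_sizes in anchor_sizes:
        level = []
        for size in level_sizes:
            for ratio in aspect_ratios:
                # Assumption: ratio = height / width, area stays size * size.
                width = size / math.sqrt(ratio)
                height = size * math.sqrt(ratio)
                level.append((round(width, 1), round(height, 1)))
        shapes.append(level)
    return shapes


shapes = anchor_shapes([[32], [64], [128], [256], [512]], [0.5, 1.0, 2.0])
print(len(shapes), len(shapes[0]))  # levels, anchors per location on a level
print(shapes[0])
```

With the default values from the table, this yields five levels with three anchor shapes each.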
|
|
|
+## `PP-YOLO`
|
|
|
+
|
|
|
+The PP-YOLO implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------------------|--------------------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `80` |
|
|
|
+| `backbone (str)` | PPYOLO's backbone network | `'ResNet50_vd_dcn'` |
|
|
|
+| `anchors (list[list[float]])` | Size of predefined anchor boxes | `None` |
|
|
|
+| `anchor_masks (list[list[int]])` | Masks for predefined anchor boxes | `None` |
|
|
|
+| `use_coord_conv (bool)` | Whether to use coordinate convolution | `True` |
|
|
|
+| `use_iou_aware (bool)` | Whether to use IoU awareness | `True` |
|
|
|
+| `use_spp (bool)` | Whether to use spatial pyramid pooling (SPP) | `True` |
|
|
|
+| `use_drop_block (bool)` | Whether to use DropBlock regularization | `True` |
|
|
|
+| `scale_x_y (float)` | Parameter to scale each predicted box | `1.05` |
|
|
|
+| `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` |
|
|
|
+| `label_smooth (bool)` | Whether to use label smoothing | `False` |
|
|
|
+| `use_iou_loss (bool)` | Whether to use IoU Loss | `True` |
|
|
|
+| `use_matrix_nms (bool)` | Whether to use Matrix NMS | `True` |
|
|
|
+| `nms_score_threshold (float)` | NMS score threshold | `0.01` |
|
|
|
+| `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `-1` |
|
|
|
+| `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100`|
|
|
|
+| `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` |
|
|
|
+
|
|
|
+## `PP-YOLO Tiny`
|
|
|
+
|
|
|
+The PP-YOLO Tiny implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------------------|-------------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `80` |
|
|
|
+| `backbone (str)` | Backbone network model name to use | `'MobileNetV3'` |
|
|
|
+| `anchors (list[list[float]])`    | List of anchor box sizes                                    | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]]` |
|
|
|
+| `anchor_masks (list[list[int]])` | Anchor box mask | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
|
|
|
+| `use_iou_aware (bool)` | Boolean value indicating whether to use IoU-aware loss | `False` |
|
|
|
+| `use_spp (bool)` | Boolean indicating whether to use the SPP module | `True` |
|
|
|
+| `use_drop_block (bool)`          | Boolean value indicating whether to use DropBlock regularization | `True` |
|
|
|
+| `scale_x_y (float)` | Scaling parameter | `1.05` |
|
|
|
+| `ignore_threshold (float)` | Ignore threshold | `0.5` |
|
|
|
+| `label_smooth (bool)` | Boolean indicating whether to use label smoothing | `False` |
|
|
|
+| `use_iou_loss (bool)` | Boolean value indicating whether to use IoU Loss | `True` |
|
|
|
+| `use_matrix_nms (bool)` | Boolean indicating whether to use Matrix NMS | `False` |
|
|
|
+| `nms_score_threshold (float)` | NMS score threshold | `0.005` |
|
|
|
+| `nms_topk (int)` | Number of bounding boxes to keep before NMS operation | `1000` |
|
|
|
+| `nms_keep_topk (int)` | Number of bounding boxes to keep after NMS operation | `100` |
|
|
|
+| `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` |
|
|
|
+
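The four `nms_*` parameters act at different points of post-processing. The sketch below is a plain single-class greedy NMS written for illustration; it is not the PaddleRS implementation and it is not Matrix NMS. The function names and the `(x1, y1, x2, y2)` box format are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, nms_score_threshold=0.005, nms_topk=1000,
               nms_keep_topk=100, nms_iou_threshold=0.45):
    """Return indices of the boxes that survive NMS, best score first."""
    # 1. Drop candidates whose score is below the score threshold.
    order = [i for i, s in enumerate(scores) if s >= nms_score_threshold]
    # 2. Sort by score and keep at most nms_topk candidates (-1 keeps all).
    order.sort(key=lambda i: scores[i], reverse=True)
    if nms_topk >= 0:
        order = order[:nms_topk]
    # 3. Keep a box only if it does not overlap a kept box too strongly.
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_iou_threshold for j in kept):
            kept.append(i)
    # 4. Cap the number of final detections (-1 keeps all).
    return kept[:nms_keep_topk] if nms_keep_topk >= 0 else kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.001]
print(greedy_nms(boxes, scores))  # [0, 2]
```

In this example the second box is suppressed because its IoU with the first box (about 0.68) exceeds `nms_iou_threshold`, and the last box is dropped by `nms_score_threshold` before suppression starts.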
|
|
|
+## `PP-YOLOv2`
|
|
|
+
|
|
|
+The PP-YOLOv2 implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------------------| --- | --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `80` |
|
|
|
+| `backbone (str)` | PPYOLO's backbone network | `'ResNet50_vd_dcn'` |
|
|
|
+| `anchors (list[list[float]])` | Sizes of predefined anchor boxes| `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
|
|
|
+| `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
|
|
|
+| `use_iou_aware (bool)` | Whether to use IoU awareness | `True` |
|
|
|
+| `use_spp (bool)` | Whether to use spatial pyramid pooling (SPP) | `True` |
|
|
|
+| `use_drop_block (bool)` | Whether to use DropBlock regularization | `True` |
|
|
|
+| `scale_x_y (float)` | Parameter to scale each predicted box | `1.05` |
|
|
|
+| `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` |
|
|
|
+| `label_smooth (bool)` | Whether to use label smoothing | `False` |
|
|
|
+| `use_iou_loss (bool)` | Whether to use IoU Loss | `True` |
|
|
|
+| `use_matrix_nms (bool)` | Whether to use Matrix NMS | `True` |
|
|
|
+| `nms_score_threshold (float)` | NMS score threshold | `0.01` |
|
|
|
+| `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `-1` |
|
|
|
+| `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100`|
|
|
|
+| `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` |
|
|
|
+
|
|
|
+## `YOLOv3`
|
|
|
+
|
|
|
+The YOLOv3 implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+| --- |-----------------------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `num_classes (int)` | Number of target classes | `80` |
|
|
|
+| `backbone (str)` | Name of the feature extraction network | `'MobileNetV1'` |
|
|
|
+| `anchors (list[list[int]])` | Sizes of all anchor boxes                                                                                                   | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
|
|
|
+| `anchor_masks (list[list[int]])` | Which anchor boxes to use to predict the target box | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` |
|
|
|
+| `ignore_threshold (float)` | IoU threshold between a predicted box and the ground truth box; predicted boxes with an IoU below this threshold are treated as background | `0.7` |
|
|
|
+| `nms_score_threshold (float)` | In non-maximum suppression, score threshold below which boxes will be discarded | `0.01` |
|
|
|
+| `nms_topk (int)` | In non-maximum suppression, the maximum number of highest-scoring boxes to keep. If set to `-1`, all boxes are kept          | `1000` |
|
|
|
+| `nms_keep_topk (int)` | In non-maximum suppression, the maximum number of boxes to keep per image                                                   | `100` |
|
|
|
+| `nms_iou_threshold (float)` | In non-maximum suppression, the IoU threshold; boxes whose IoU exceeds this threshold are discarded                         | `0.45` |
|
|
|
+| `label_smooth (bool)` | Whether to use label smoothing when computing loss | `False` |
|
|
|
+
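The relationship between `anchors` and `anchor_masks` can be sketched as an index lookup: each mask lists the indices of the anchors assigned to one detection head. The helper `anchors_per_head` is hypothetical, and the values are the defaults from the table above.

```python
anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
           [59, 119], [116, 90], [156, 198], [373, 326]]
anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]


def anchors_per_head(anchors, anchor_masks):
    """Group anchor sizes by detection head according to the masks."""
    return [[anchors[i] for i in mask] for mask in anchor_masks]


for head, group in enumerate(anchors_per_head(anchors, anchor_masks)):
    print(head, group)
```

With these defaults the first mask selects the three largest anchors and the last mask selects the three smallest, so custom `anchors` should normally be paired with masks that cover every index exactly once.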
|
|
|
+## `BiSeNet V2`
|
|
|
+
|
|
|
+The BiSeNet V2 implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------| --- |---------------|
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `{}` |
|
|
|
+| `align_corners (bool)` | Whether to use the corner alignment method | `False` |
|
|
|
+
|
|
|
+## `DeepLab V3+`
|
|
|
+
|
|
|
+The DeepLab V3+ implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|----------------------------|--------------------------------------------------------------------------------| --- |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `backbone (str)`           | Backbone network type of neural network                                        | `'ResNet50_vd'` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `output_stride (int)` | Downsampling ratio of the output feature map relative to the input feature map | `8` |
|
|
|
+| `backbone_indices (tuple)` | Indices of the backbone stages whose outputs are used                          | `(0, 3)` |
|
|
|
+| `aspp_ratios (tuple)`      | Dilation rates of the dilated convolutions in the ASPP module                  | `(1, 12, 24, 36)` |
|
|
|
+| `aspp_out_channels (int)` | Number of ASPP module output channels | `256` |
|
|
|
+| `align_corners (bool)` | Whether to use the corner alignment method | `False` |
|
|
|
+
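As a worked example for `output_stride`, the sketch below estimates the spatial size of the backbone output for a given input size. The helper `feature_map_size` is hypothetical, and rounding up for sizes that are not divisible by the stride is an assumption; the actual rounding depends on the padding used by the backbone.

```python
import math


def feature_map_size(height, width, output_stride=8):
    """Approximate spatial size of the feature map after downsampling."""
    # Assumption: sizes that are not divisible by the stride are rounded up.
    return math.ceil(height / output_stride), math.ceil(width / output_stride)


print(feature_map_size(512, 512))      # (64, 64)
print(feature_map_size(512, 512, 16))  # (32, 32)
```

A smaller `output_stride` keeps a larger feature map, which preserves more spatial detail at the cost of more computation.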
|
|
|
+
|
|
|
+## `FactSeg`
|
|
|
+
|
|
|
+The FactSeg implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see A. Ma, J. Wang, Y. Zhong and Z. Zheng, "FactSeg: Foreground Activation-Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery," in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-16, 2022, Art no. 5606216.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|------------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+## `FarSeg`
|
|
|
+
|
|
|
+The FarSeg implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+For the original paper, see Z. Zheng, Y. Zhong, J. Wang, et al., "Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 4096-4105.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|-----------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+
|
|
|
+## `Fast-SCNN`
|
|
|
+
|
|
|
+The Fast-SCNN implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|------------------------------------------------|----------------------|
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `align_corners (bool)` | Whether to use the corner alignment method | `False` |
|
|
|
+
|
|
|
+
|
|
|
+## `HRNet`
|
|
|
+
|
|
|
+The HRNet implementation based on PaddlePaddle.
|
|
|
+
|
|
|
+| Parameter Name | Description | Default Value |
|
|
|
+|-------------------------|------------------------------------------------------------------------------------------------------------------| --- |
|
|
|
+| `in_channels (int)` | Number of channels of the input image | `3` |
|
|
|
+| `num_classes (int)` | Number of target classes | `2` |
|
|
|
+| `width (int)` | Initial number of channels for the network | `48` |
|
|
|
+| `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` |
|
|
|
+| `losses (list)` | List of loss functions | `None` |
|
|
|
+| `align_corners (bool)` | Whether to use the corner alignment method | `False` |
|