[简体中文](model_cons_param_cn.md) | English # PaddleRS Model Construction Parameters This document describes the construction parameters of each PaddleRS model trainer, including their parameter names, parameter types, parameter descriptions, and default values. ## `BIT` The BIT implementation based on PaddlePaddle. The original article refers to H. Chen, et al., "Remote Sensing Image Change Detection With Transformers "(https://arxiv.org/abs/2103.00208). This implementation adopts pretrained encoders, as opposed to the original work where weights are randomly initialized. | Parameter Name | Description | Default Value | |-----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------| | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `att_type (str)` | Spatial attention type values are `'CBAM'` and `'BAM'` | `'CBAM'` | | `ds_factor (int)` | Downsampling factor | `1` | | `backbone (str)` | ResNet architecture to use as backbone. Currently only `'resnet18'` and `'resnet34'` are supported | `'resnet18'` | | `n_stages (int)` | Number of ResNet stages used in the backbone, should be a value in `{3, 4, 5}` | `4` | | `use_tokenizer (bool)` | Whether to use tokenizer | `True` | | `token_len (int)` | Length of input token | `4` | | `pool_mode (str)` | Gets the pooling strategy for input tokens when `'use_tokenizer'` is set to False. `'max'` means global max pooling, `'avg'` means global average pooling | `'max'` | | `pool_size (int)` | When `'use_tokenizer'` is set to False, the height and width of the pooled feature map | `2` | | `enc_with_pos (bool)` | Whether to add learned positional embeddings to the encoder's input feature sequence | `True` | | `enc_depth (int)` | Number of attention blocks used in encoder | `1` | | `enc_head_dim (int)` | Embedding dimension of each encoder head | `64` | | `dec_depth (int)` | Number of attention blocks used in decoder | `8` | | `dec_head_dim (int)` | Embedding dimension for each decoder head | `8` | ## `CDNet` The CDNet implementation based on PaddlePaddle. The original article refers to Pablo F. Alcantarilla, et al., "Street-View Change Detection with Deconvolut ional Networks"(https://link.springer.com/article/10.1007/s10514-018-9734-5). | Parameter Name | Description | Default Value | |-------------------------|----------------------------------------------------------------------------------------------------| ---------- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `6` | ## `ChangeFormer` The ChangeFormer implementation based on PaddlePaddle. The original article refers to Wele Gedara Chaminda Bandara, Vishal M. Patel, “A TRANSFORMER-BASED SIAMESE NETWORK FOR CHANGE DETECTION”(https://arxiv.org/pdf/2201.01293.pdf). | Parameter Name | Description | Default Value | |--------------------------------|-----------------------------------------------------------------------------|--------------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False` | | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `decoder_softmax (bool)` | Whether to use softmax as the last layer activation function of the decoder | `False` | | `embed_dim (int)` | Hidden layer dimension of the Transformer encoder | `256` | ## `ChangeStar` The ChangeStar implementation with a FarSeg encoder based on PaddlePaddle. The original article refers to Z. Zheng, et al., "Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery"(https://arxiv.org/abs/2108.07002). | Parameter Name | Description | Default Value | |-------------------------|---------------------------------------------------------------------|-------------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False` | | `losses (list)` | List of loss functions | `None` | | `mid_channels (int)` | Number of channels in the middle layer of UNet | `256` | | `inner_channels (int)` | Number of channels inside the attention module | `16` | | `num_convs (int)` | Number of convolutional layers in UNet encoder and decoder | `4` | | `scale_factor (float)` | Upsampling factor to scale the size of the output segmentation mask | `4.0` | ## `DSAMNet` The DSAMNet implementation based on PaddlePaddle. The original article refers to Q. Shi, et al., "A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection"(https://ieeexplore.ieee.org/document/9467555). | Parameter Name | Description | Default Value | |-----------------------|---------------------------------------------------------------------------------------------------------------------------------|-------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `ca_ratio (int)` | Channel compression ratio in channel attention module | `8` | | `sa_kernel (int)` | Kernel size in the spatial attention module | `7` | ## `DSIFN` The DSIFN implementation based on PaddlePaddle. The original article refers to C. Zhang, et al., "A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images"(https://www.sciencedirect.com/science/article/pii/S0924271620301532). | Parameter Name | Description | Default Value | |-----------------------|----------------------------------------------------------------------------------------------------|-------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | | `use_dropout (bool)` | Whether to use dropout | `False`| ## `FCEarlyFusion` The FC-EF implementation based on PaddlePaddle. The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462)`. | Parameter Name | Description | Default Value | |-------------------------|---------------------------------------|-------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `6` | | `use_dropout (bool)` | Whether to use dropout | `False`| ## `FCSiamConc` The FC-Siam-conc implementation based on PaddlePaddle. The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462). | Parameter Name | Description | Default Value | |-------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `use_dropout (bool)` | Whether to use dropout | `False`| ## `FCSiamDiff` The FC-Siam-diff implementation based on PaddlePaddle. The original article refers to Rodrigo Caye Daudt, et al. "Fully convolutional siamese networks for change detection"(https://arxiv.org/abs/1810.08462). | Parameter Name | Description | Default Value | |-------------------------|--------------------------------------------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function |`False` | | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | int | `3` | | `use_dropout (bool)` | Whether to use dropout | `False` | ## `FCCDN` The FCCDN implementation based on PaddlePaddle. The original article refers to Pan Chen, et al., "FCCDN: Feature Constraint Network for VHR Image Change Detection"(https://arxiv.org/pdf/2105.10860.pdf). | Parameter Name | Description | Default Value | |--------------------------|---------------------------------------|-------| | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | ## `P2V` The P2V-CD implementation based on PaddlePaddle. The original article refers to M. Lin, et al. "Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images"(https://ieeexplore.ieee.org/document/9975266). | Parameter Name | Description | Default Value | |-------------------------|---------------------------------------|-------| | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False`| | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `video_len (int)` | Number of input video frames | `8` | ## `SNUNet` The SNUNet implementation based on PaddlePaddle. The original article refers to S. Fang, et al., "SNUNet-CD: A Densely Connected Siamese Network for Change Detection of VHR Images" (https://ieeexplore.ieee.org/document/9355573). | arg_name | Description | default | |------------------------|-------------------------------------------------|------| | `in_channels (int)` | Number of channels of the input image | | | `num_classes (int)` | Number of target classes | | | `width (int)` | Output channels of the first convolutional layer | 32 | ## `STANet` The STANet implementation based on PaddlePaddle. The original article refers to H. Chen and Z. Shi, "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection"(https://www.mdpi.com/2072-4292/12/10/1662). | Parameter Name | Description | Default Value | |-------------------------|------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss | `False` | | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `width (int)` | Number of channels in the neural network | `32` | ## `CondenseNetV2` The CondenseNetV2 implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------|---------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `arch (str)` | Architecture of the model, which can be `'A'`, `'B'`, or `'C'` | `'A'` | ## `HRNet` The HRNet implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------|------------------------------------| --- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | ## `MobileNetV3` The MobileNetV3 implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------| --- | --- | | `num_classes (int)` | Number of target classes| `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | ## `ResNet50_vd` The ResNet50-vd implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------| --- | --- | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | ## `DRN` The DRN implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------| | `losses (list)` | List of loss functions | `None` | | `sr_factor (int)` | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` | | `min_max (None \| tuple[float, float])` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` | | `scales (tuple[int])` | Scaling factor | `(2, 4)` | | `n_blocks (int)` | Number of residual blocks | `30` | | `n_feats (int)` | Number of features in the residual block | `16` | | `n_colors (int)` | Number of image channels | `3` | | `rgb_range (float)` | Range of image pixel values | `1.0` | | `negval (float)` | Negative value in nonlinear mapping | `0.2` | | `Supplementary Description of `lq_loss_weight` parameter (float)` | Weight of the primal regression loss | `0.1` | | `dual_loss_weight (float)` | Weight of the dual regression loss | `0.1` | ## `ESRGAN` The ESRGAN implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- | | `losses (list)` | List of loss functions | `None` | | `sr_factor (int)` | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` | | `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` | | `use_gan (bool)` | Whether to use GAN (Generative Adversarial Network) during training. If yes, GAN will be used | `True` | | `in_channels (int)` | Number of channels of the input image | `3` | | `out_channels (int)` | Number of channels of the output image | `3` | | `nf (int)` | Number of filters in the first convolutional layer of the model | `64` | | `nb (int)` | Number of residual blocks in the model | `23` | ## `LESRCNN` The LESRCNN implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- | | `losses (list)` | List of loss functions | `None` | | `sr_factor (int)` | Scaling factor for super-resolution. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `4` | | `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` | | `multi_scale (bool)` | Whether to train on multiple scales. If yes, multiple scales are used during training | `False` | | `group (int)` | Number of groups used in convolution operations. | `1` | ## `NAFNet` The NAFNet implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| --- | | `losses (list)` | List of loss functions | `None` | | `sr_factor (int)` | Scaling factor for image restoration. NAFNet is not suitable for image super-resolution tasks and does not change the size of the image. Please set the `sr factor` to `None` | `None` | | `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` | | `use_tlsc (bool)` | Whether to use tlsc (test-time local statistics converter) during testing. If yes, tlsc will be used | `False` | | `in_channels (int)` | Number of channels of the input image | `3` | | `width (int)` | Number of channels of NAFBlock | `32` | | `middle_blk_num (int)` | Number of NAFBlocks in middle block | `1` | | `enc_blk_nums (list[int])` | Number of NAFBlocks in different layers of the encoder | `None` | | `dec_blk_nums (list[int])` | Number of NAFBlocks in different layers of the decoder | `None` | ## `SwinIR` The SwinIR implementation based on PaddlePaddle. | 参数名 | 描述 | 默认值 | |----------------------|-----------------------------------------------------------------------------------------| --- | | `losses (list)` | List of loss functions | `None` | | `sr_factor (int)` | Scaling factor for image restoration. The output image size will be the original image size multiplied by this factor. For example, if the original image is `H` x `W`, the output image will be `sr_factor * H` x `sr_factor * W` | `1` | | `min_max (tuple)` | Minimum and maximum pixel values of the input image. If not specified, the data type's default minimum and maximum values are used | `None` | | `in_channels (int)` | Number of channels of the input image | `3` | | `img_size (int)` | Input image size | `128` | | `window_size (int)` | Window size | `8` | | `depths (list[int])` | Depth of each Swin Transformer layer | `[6, 6, 6, 6, 6, 6]` | | `num_heads (list[int])` | Number of attention heads in different layers | `[6, 6, 6, 6]` | | `embed_dim (int)` | Patch embedding dimension | `96` | | `window_size (int)` | Ratio of MLP hidden dim to embedding dim | `4` | ## `FasterRCNN` The Faster R-CNN implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------------|------------------------------------------------------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'ResNet50'` | | `with_fpn (bool)` | Whether to use Feature Pyramid Network (FPN) | `True` | | `with_dcn (bool)` | Whether to use Deformable Convolutional Networks (DCN) | `False` | | `aspect_ratios (list)` | List of aspect ratios of candidate boxes | `[0.5, 1.0, 2.0]` | | `anchor_sizes (list)` | list of sizes of candidate boxes expressed as base sizes on each feature map | `[[32], [64], [128], [256], [512]]` | | `keep_top_k (int)` | Number of predicted boxes to keep before the non-maximum suppression (NMS) operation | `100` | | `nms_threshold (float)` | NMS threshold to use | `0.5` | | `score_threshold (float)` | Score threshold for filtering predicted boxes | `0.05` | | `fpn_num_channels (int)` | Number of channels for each pyramid layer in the FPN network | `256` | | `rpn_batch_size_per_im (int)` | Ratio of positive and negative samples per image in the RPN network | `256` | | `rpn_fg_fraction (float)` | Fraction of foreground samples in RPN network | `0.5` | | `test_pre_nms_top_n (int)` | Number of predicted boxes to keep before NMS operation when testing. If not specified, `keep_top_k` is used. | `None` | | `test_post_nms_top_n (int)` | Number of predicted boxes to keep after NMS operation at test time | `1000` | ## `FCOSR` The FCOSR implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | | --- |-----------------------------------------------------------------------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'MobileNetV1'` | | `anchors (list[list[int]])` | Sizes of predefined anchor boxes | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45 ], [59, 119], [116, 90], [156, 198], [373, 326]]` | | `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` | | `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` | | `nms_score_threshold (float)` | NMS score threshold | `0.01` | | `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `1000` | | `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100` | | `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` | | `label_smooth (bool)` | Whether to use label smoothing when computing losses ## `PPYOLO` The PP-YOLO implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------------------|--------------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'ResNet50_vd_dcn'` | | `anchors (list[list[float]])` | Sizes of predefined anchor boxes | `None` | | `anchor_masks (list[list[int]])` | Masks for predefined anchor boxes | `None` | | `use_coord_conv (bool)` | Whether to use coordinate convolution | `True` | | `use_iou_aware (bool)` | Whether to use IoU awareness | `True` | | `use_spp (bool)` | Whether to use spatial pyramid pooling (SPP) | `True` | | `use_drop_block (bool)` | Whether to use DropBlock | `True` | | `scale_x_y (float)` | Parameter to scale each predicted box | `1.05` | | `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` | | `label_smooth (bool)` | Whether to use label smoothing | `False` | | `use_iou_loss (bool)` | Whether to use IoU loss | `True` | | `use_matrix_nms (bool)` | Whether to use Matrix NMS | `True` | | `nms_score_threshold (float)` | NMS score threshold | `0.01` | | `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `-1` | | `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100`| | `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` | ## `PPYOLOTiny` The PP-YOLO Tiny implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------------------|-------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'MobileNetV3'` | | `anchors (list[list[float]])` | Sizes of predefined anchor boxes | `[[10, 15], [24, 36], [72, 42], [35, 87], [102, 96] , [60, 170], [220, 125], [128, 222], [264, 266]]` | | `anchor_masks (list[list[int]])` | Masks for predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` | | `use_iou_aware (bool)` | Whether to use IoU awareness | `False` | | `use_spp (bool)` | Whether to use spatial pyramid pooling (SPP) | `True` | | `use_drop_block (bool)` | Whether to use the DropBlock | `True` | | `scale_x_y (float)` | Parameter to scale each predicted box | `1.05` | | `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.5` | | `label_smooth (bool)` | Whether to use label smoothing | `False` | | `use_iou_loss (bool)` | Whether to use IoU loss | `True` | | `use_matrix_nms (bool)` | Whether to use Matrix NMS | `False` | | `nms_score_threshold (float)` | NMS score threshold | `0.005` | | `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `1000` | | `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100` | | `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` | ## `PPYOLOv2` The PP-YOLOv2 implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------------------| --- | --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'ResNet50_vd_dcn'` | | `anchors (list[list[float]])` | Sizes of predefined anchor boxes| `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` | | `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` | | `use_iou_aware (bool)` | Whether to use IoU awareness | `True` | | `use_spp (bool)` | Whether to use spatial pyramid pooling (SPP) | `True` | | `use_drop_block (bool)` | Whether to use DropBlock | `True` | | `scale_x_y (float)` | Parameter to scale each predicted box | `1.05` | | `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` | | `label_smooth (bool)` | Whether to use label smoothing | `False` | | `use_iou_loss (bool)` | Whether to use IoU loss | `True` | | `use_matrix_nms (bool)` | Whether to use Matrix NMS | `True` | | `nms_score_threshold (float)` | NMS score threshold | `0.01` | | `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `-1` | | `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100`| | `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` | ## `YOLOv3` The YOLOv3 implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | | --- |-----------------------------------------------------------------------------------------------------------------------------| --- | | `num_classes (int)` | Number of target classes | `80` | | `backbone (str)` | Backbone network to use | `'MobileNetV1'` | | `anchors (list[list[int]])` | Sizes of predefined anchor boxes | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45 ], [59, 119], [116, 90], [156, 198], [373, 326]]` | | `anchor_masks (list[list[int]])` | Masks of predefined anchor boxes | `[[6, 7, 8], [3, 4, 5], [0, 1, 2]]` | | `ignore_threshold (float)` | IoU threshold used to assign predicted boxes to ground truth boxes | `0.7` | | `nms_score_threshold (float)` | NMS score threshold | `0.01` | | `nms_topk (int)` | Maximum number of detections to keep before performing NMS | `1000` | | `nms_keep_topk (int)` | Maximum number of prediction boxes to keep after NMS | `100` | | `nms_iou_threshold (float)` | NMS IoU threshold | `0.45` | | `label_smooth (bool)` | Whether to use label smoothing when computing losses | `False` | ## `BiSeNetV2` The BiSeNet V2 implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------| --- |---------------| | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `{}` | | `align_corners (bool)` | Whether to use the corner alignment method | `False` | ## `DeepLabV3P` The DeepLab V3+ implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |----------------------------|--------------------------------------------------------------------------------| --- | | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `backbone (str)` | Backbone network type of neural network | `ResNet50_vd` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `output_stride (int)` | Downsampling ratio of the output feature map relative to the input feature map | `8` | | `backbone_indices (tuple)` | Indices of different stages of the backbone network for use | `(0, 3)` | | `aspp_ratios (tuple)` | Dilation ratio of dilated convolution | `(1, 12, 24, 36)` | | `aspp_out_channels (int)` | Number of ASPP module output channels | `256` | | `align_corners (bool)` | Whether to use the corner alignment method | `False` | ## `FactSeg` The FactSeg implementation based on PaddlePaddle. The original article refers to A. Ma, J. Wang, Y. Zhong and Z. Zheng, "FactSeg: Foreground Activation -Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery,"in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-16, 2022, Art no. 5606216. | Parameter Name | Description | Default Value | |-------------------------|------------------------------------------------------------------------------------------------------------------| --- | | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | ## `FarSeg` The FarSeg implementation based on PaddlePaddle. The original article refers to Zheng Z, Zhong Y, Wang J, et al. Foreground-aware relation network for geospatial object segmentation in high spatial resolution remote sensing imagery[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 4096-4105. | Parameter Name | Description | Default Value | |-------------------------|-----------------------------------------------------------------------------------------------------------------| --- | | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | ## `FastSCNN` The Fast-SCNN implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------|------------------------------------------------|----------------------| | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `align_corners (bool)` | Whether to use the corner alignment method | `False` | ## `HRNet` The HRNet implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------|------------------------------------------------------------------------------------------------------------------| --- | | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `width (int)` | Initial number of feature channels for the network | `48` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `align_corners (bool)` | Whether to use the corner alignment method | `False` | ## `UNet` The UNet implementation based on PaddlePaddle. | Parameter Name | Description | Default Value | |-------------------------|------------------------------------------------------------------------------------------------------------------| --- | | `in_channels (int)` | Number of channels of the input image | `3` | | `num_classes (int)` | Number of target classes | `2` | | `use_deconv (int)` | Whether to use deconvolution for upsampling | `48` | | `use_mixed_loss (bool)` | Whether to use mixed loss function | `False` | | `losses (list)` | List of loss functions | `None` | | `align_corners (bool)` | Whether to use the corner alignment method | `False` |