
【Hackathon + No.151】Add Dockerfiles and Building Extraction Example (#147)

Yizhou Chen 2 years ago
parent
commit
389330b395

+ 1 - 1
.gitignore

@@ -139,4 +139,4 @@ dmypy.json
 /tutorials/train/**/output/
 /log
 
-/playground/
+/playground/

+ 21 - 2
Dockerfile

@@ -24,5 +24,24 @@ WORKDIR /usr/src
 RUN pip install git+https://github.com/lucasb-eyer/pydensecrf.git \
 	&& rm -rf /usr/src/pydensecrf
 
-# 6. finish
-WORKDIR /opt/PaddleRS
+# 6. (optional) install eiseg
+ARG EISEG
+RUN if [ "$EISEG" = "ON" ] ; then \
+	pip install --upgrade pip \
+	&& pip install eiseg rasterio -i https://mirror.baidu.com/pypi/simple \
+	&& pip uninstall -y opencv-python-headless \
+	&& pip install opencv-python==4.2.0.34 -i https://mirror.baidu.com/pypi/simple \
+	&& apt-get update \
+	&& apt-get install -y \
+	libgl1-mesa-glx libxcb-xinerama0 libxkbcommon-x11-0 \
+	libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-randr0 \
+	libxcb-render-util0 libxcb-shape0 libxcb-xfixes0 \
+	x11-xserver-utils x11-apps locales \
+	&& locale-gen zh_CN \
+	&& locale-gen zh_CN.utf8 \
+	&& apt-get install -y ttf-wqy-microhei ttf-wqy-zenhei xfonts-wqy ; \
+	fi
+ENV DISPLAY=host.docker.internal:0
+
+# 7. finish
+WORKDIR /opt/PaddleRS

+ 47 - 0
docs/docker_cn.md

@@ -0,0 +1,47 @@
+# PaddleRS Image Build and Use
+
+## 1. Image Build
+
+First, clone the repository:
+
+```shell
+git clone https://github.com/PaddlePaddle/PaddleRS
+```
+
+- Build the CPU image; PaddlePaddle 2.4.1 is used by default:
+
+```shell
+docker build -t paddlers:latest -f Dockerfile .
+```
+
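+- (Optional) Build the GPU image. The GPU version is recommended if you plan to train with PaddleRS. Make sure your Docker version is 19.03 or later. For other environments, choose a `PPTAG` from [https://hub.docker.com/r/paddlepaddle/paddle/tags](https://hub.docker.com/r/paddlepaddle/paddle/tags):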
+
+```shell
+docker build -t paddlers:latest -f Dockerfile . --build-arg PPTAG=2.4.1-gpu-cuda11.7-cudnn8.4-trt8.4
+```
+
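+- (Optional) To use the interactive segmentation annotation provided by [EISeg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/EISeg), set `EISEG="ON"`. Only the EISeg extension for remote sensing annotation is installed by default: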
+
+```shell
+docker build -t paddlers:latest -f Dockerfile . --build-arg EISEG="ON"
+```
+
+## 2. Image Use
+
+- List the built images and note the `<imageID>` of the image you want to start:
+
+```shell
+docker images
+```
+
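+- To use PaddleRS (including EISeg), start a container from the image and mount the local folder that holds your model parameters into it. To use the GPU with Docker 19.03 or later, add the parameters shown in square brackets: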
+
+```shell
+docker run -it -v <local_folder_absolute_path:container_folder_absolute_path> [--gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all] <imageID>
+```
+
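+- (Optional) To use EISeg, install and start an X11 server on the local machine to display the Qt GUI. On Windows you can use [VcXsrv](https://sourceforge.net/projects/vcxsrv/); on Linux you can use [Xserver](https://blog.csdn.net/a806689294/article/details/111462627). Once the X server is running, start EISeg: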
+
+```shell
+eiseg
+```

+ 47 - 0
docs/docker_en.md

@@ -0,0 +1,47 @@
+# PaddleRS Image Build and Use
+
+## 1. Image Build
+
+First, clone the repository:
+
+```shell
+git clone https://github.com/PaddlePaddle/PaddleRS
+```
+
+- Build the CPU image; PaddlePaddle 2.4.1 is used by default:
+
+```shell
+docker build -t paddlers:latest -f Dockerfile .
+```
+
+- (Optional) Build the GPU image. The GPU version is recommended if you plan to train with PaddleRS. Make sure your Docker version is 19.03 or later. For other environments, choose a `PPTAG` from [https://hub.docker.com/r/paddlepaddle/paddle/tags](https://hub.docker.com/r/paddlepaddle/paddle/tags):
+
+```shell
+docker build -t paddlers:latest -f Dockerfile . --build-arg PPTAG=2.4.1-gpu-cuda11.7-cudnn8.4-trt8.4
+```
+
+- (Optional) To use the interactive segmentation annotation provided by [EISeg](https://github.com/PaddlePaddle/PaddleSeg/tree/develop/EISeg), set `EISEG="ON"`. Only the EISeg extension for remote sensing annotation is installed by default:
+
+```shell
+docker build -t paddlers:latest -f Dockerfile . --build-arg EISEG="ON"
+```
+
+## 2. Image Use
+
+- List the built images and note the `<imageID>` of the image you want to start:
+
+```shell
+docker images
+```
+
+- To use PaddleRS (including EISeg), start a container from the image and mount the local folder that holds your model parameters into it. To use the GPU with Docker 19.03 or later, add the parameters shown in square brackets:
+
+```shell
+docker run -it -v <local_folder_absolute_path:container_folder_absolute_path> [--gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all] <imageID>
+```
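The square brackets in the command above mark optional arguments and are not typed literally. As a minimal sketch of how the two variants differ, the following Python helper assembles the argument list with and without the GPU flags. The function name `build_run_command` and its parameters are hypothetical and are not part of PaddleRS or Docker; only the flags themselves come from the command above.

```python
from typing import List


def build_run_command(image_id: str,
                      host_dir: str,
                      container_dir: str,
                      use_gpu: bool = False) -> List[str]:
    """Return the argv for `docker run` as described in the docs above.

    Hypothetical helper for illustration only.
    """
    # Base command: interactive terminal plus one bind mount.
    cmd = ["docker", "run", "-it", "-v", f"{host_dir}:{container_dir}"]
    if use_gpu:
        # These are the bracketed, optional GPU arguments.
        cmd += [
            "--gpus", "all",
            "-e", "NVIDIA_DRIVER_CAPABILITIES=compute,utility",
            "-e", "NVIDIA_VISIBLE_DEVICES=all",
        ]
    # The image ID always comes last.
    cmd.append(image_id)
    return cmd
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting issues when the host path contains spaces.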
+
+- (Optional) To use EISeg, install and start an X11 server on the local machine to display the Qt GUI. On Windows you can use [VcXsrv](https://sourceforge.net/projects/vcxsrv/); on Linux you can use [Xserver](https://blog.csdn.net/a806689294/article/details/111462627). Once the X server is running, start EISeg:
+
+```shell
+eiseg
+```
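The Dockerfile in this commit sets `DISPLAY` to `host.docker.internal:0`, so the Qt GUI is sent to an X server on the host over TCP. A rough way to check from inside the container that such a server is listening before launching `eiseg` is sketched below. It relies on the X11 convention that display `N` listens on TCP port `6000 + N`; the function names are hypothetical, and a successful connection does not guarantee that the server's access control will accept the client.

```python
import socket
from typing import Tuple


def parse_display(display: str) -> Tuple[str, int]:
    """Split a DISPLAY value such as 'host:0' or 'host:0.0' into (host, number)."""
    host, _, rest = display.rpartition(":")
    # Drop the optional screen suffix after the dot.
    number = int(rest.split(".")[0])
    return host, number


def x_server_reachable(display: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the X server port succeeds."""
    host, number = parse_display(display)
    if not host:
        # An empty host means a local Unix-socket display; no TCP check applies.
        return False
    try:
        with socket.create_connection((host, 6000 + number), timeout=timeout):
            return True
    except OSError:
        return False
```

Inside the container, `x_server_reachable(os.environ["DISPLAY"])` (after `import os`) gives a quick yes or no before starting the GUI.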

+ 1 - 24
docs/quick_start_cn.md

@@ -62,30 +62,7 @@ python setup.py install
 ```
 
 
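-In addition to the installation steps above, PaddleRS can also be installed with Docker. The steps are as follows:
-
-1. Pull the image from dockerhub:
-
-```shell
-docker pull paddlepaddle/paddlers:1.0.0  # not yet available
-```
-
-Alternatively, you can build the image from scratch. You can select among the PaddlePaddle base images by changing `PPTAG` in the `Dockerfile`.
-
-```shell
-git clone https://github.com/PaddlePaddle/PaddleRS
-cd PaddleRS
-docker build -t <imageName> .  # Uses the CPU version of PaddlePaddle 2.4.1 by default
-# docker build -t <imageName> . --build-arg PPTAG=2.4.1-gpu-cuda10.2-cudnn7.6-trt7.0  # Build an environment with a GPU version of PaddlePaddle
-# For other tags, see: https://hub.docker.com/r/paddlepaddle/paddle/tags
-```
-
-2. Start a container
-
-```shell
-docker images  # View the ID of the image
-docker run -it <imageID>
-```
+In addition to the installation steps above, PaddleRS can also be installed with Docker; see the [documentation](./docker_cn.md) for details.
 
 ## Model Training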
 

+ 1 - 24
docs/quick_start_en.md

@@ -55,30 +55,7 @@ cd paddlers/models/ppdet/ext_op
 python setup.py install
 ```
 
-We also provide a docker image for installation:
-
-1. Pull from dockerhub:
-
-```shell
-docker pull paddlepaddle/paddlers:1.0.0
-```
-
-Optionally, you can build the image from scratch. You can change the base images for different PaddlePaddle versions by setting `PPTAG` in `Dockerfile`.
-
-```shell
-git clone https://github.com/PaddlePaddle/PaddleRS
-cd PaddleRS
-docker build -t <imageName> .  # Default is PaddlePaddle-2.4.1-CPU
-# docker build -t <imageName> . --build-arg PPTAG=2.4.1-gpu-cuda10.2-cudnn7.6-trt7.0  # Use a GPU version of PaddlePaddle
-# For more tags, please refer to: https://hub.docker.com/r/paddlepaddle/paddle/tags
-```
-
-2. Start a container
-
-```shell
-docker images  # View the ID of the image
-docker run -it <imageID>
-```
+PaddleRS can also be installed with Docker; see the [Docker documentation](./docker_en.md).
 
 ## Model Training
 

+ 27 - 0
examples/building_extraction/Dockerfile

@@ -0,0 +1,27 @@
+# 0. from base paddlers image
+FROM paddlers:latest
+
+# 1. install mysql and nodejs
+RUN apt-get update \
+	&& apt-get install -y mysql-server mysql-client libmysqlclient-dev \ 
+		git curl \
+	&& curl -sL https://deb.nodesource.com/setup_16.x | bash - \
+	&& apt-get install -y nodejs
+
+# 2. clone geoview
+WORKDIR /opt
+RUN git clone https://github.com/PaddleCV-SIG/GeoView.git \
+	&& mv PaddleRS GeoView
+ENV PYTHONPATH=/opt/GeoView/PaddleRS
+
+# 3. install backend requirements 
+WORKDIR /opt/GeoView/backend
+RUN pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple \
+	&& mv .flaskenv_template .flaskenv
+
+# 4. install frontend requirements 
+WORKDIR /opt/GeoView/frontend
+RUN npm install
+
+# 5. finish
+WORKDIR /opt/GeoView

+ 185 - 0
examples/building_extraction/README.md

@@ -0,0 +1,185 @@
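+# End-to-End Building Extraction with PaddleRS in a Docker Environment
+
+PaddleRS provides training and inference for remote sensing imagery. Combined with the annotation tools of EISeg and the deployment and visualization features of GeoView, it covers a complete remote sensing semantic segmentation workflow. This guide uses these tools in a Docker environment to walk through building extraction from satellite imagery, from annotation through training to deployment.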
+
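+## 0. Preparation
+
+- Build and run the image; for details, see the PaddleRS Docker [documentation](../../docs/docker_cn.md). The absolute path of the container folder used here is `/usr/qingdao`, which contains a tif image of Qingdao.
+- To use the intelligent remote sensing image interpretation provided by [GeoView](https://github.com/PaddleCV-SIG/GeoView/tree/develop), build the image defined in this folder on top of the PaddleRS image. If you gave the PaddleRS base image a custom name `<imageName>`, edit the `Dockerfile` in this folder and change `FROM paddlers:latest` to `FROM <imageName>`, then build the image: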
+
+```shell
+docker build -t geoview:latest -f Dockerfile .
+```
+
+## 1. Data Annotation
+
+- Install paddlers:
+
+```shell
+python setup.py install
+```
+
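+- First, split the image. EISeg can read a large image directly and annotate and save it block by block, but to control the number of annotations, you can split it beforehand with the tool provided by PaddleRS: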
+
+```shell
+cd tools/
+python split.py --image_path /usr/qingdao/qingdao.tif --block_size 512 --save_dir /usr/qingdao/dataset/
+```
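To estimate in advance how many blocks a given `--block_size` produces, and therefore how many tiles there will be to annotate, a back-of-the-envelope calculation is enough. The sketch below assumes that partial blocks at the right and bottom edges are kept, which should be verified against `tools/split.py`; the helper `count_blocks` is hypothetical and not part of PaddleRS.

```python
import math


def count_blocks(height: int, width: int, block_size: int = 512) -> int:
    """Number of tiles when an image is cut into block_size x block_size blocks.

    Assumes edge blocks that are only partially covered are still written out.
    """
    rows = math.ceil(height / block_size)
    cols = math.ceil(width / block_size)
    return rows * cols
```

For instance, under this assumption a 10000 x 8000 pixel image split into 512-pixel blocks gives 20 rows and 16 columns of tiles.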
+
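+- The split is finished once the progress bar completes. Then [download](https://paddleseg.bj.bcebos.com/eiseg/0.4/static_hrnet18_ocr48_rsbuilding_instance.zip) the parameters of the interactive building segmentation model into the shared folder and open [VcXsrv](https://sourceforge.net/projects/vcxsrv/) (the host system here is Windows 10) to prepare for annotation with EISeg. For details, see the EISeg section of the PaddleRS Docker [documentation](../../docs/docker_cn.md).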
+
+![eiseg](https://user-images.githubusercontent.com/71769312/222040539-34a369f3-6da8-4047-a3a5-ebf9b831d175.png)
+
+- After the model and data are loaded, annotation looks as follows. For how to use EISeg, see [here](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/EISeg/docs/image.md).
+
+![eiseg_labeling](https://user-images.githubusercontent.com/71769312/222041481-da2398e4-b312-418f-9cf7-e2c22badfe8a.png)
+
+- The annotation results look as follows.
+
+![annotation](https://user-images.githubusercontent.com/71769312/222097042-6f65048e-c20b-4650-a33a-516bb4bb7963.png)
+
+## 2. Model Training
+
+- After annotation, you can train by following the PaddleRS [training documentation](../../tutorials/train/README.md). EISeg saves the annotated data in the following layout:
+
+```
+dataset
+  ├- label
+  |    └- A.tif
+  └- A.tif
+```
+
+- The data therefore needs to be moved into the following layout:
+
+```
+dataset
+  ├- label
+  |    └- A.tif
+  └- image
+       └- A.tif
+```
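The move can be scripted rather than done by hand. The following is a minimal sketch, assuming the images sit directly in the dataset directory as shown in the first tree; `move_images` and its parameters are hypothetical names. Note that the tree above uses the directory names `image` and `label`, while the list-generation script that follows reads from `images` and `labels`, so pass whichever name your later steps expect.

```python
import shutil
from pathlib import Path
from typing import List


def move_images(dataset_dir, image_dirname: str = "image",
                suffix: str = ".tif") -> List[str]:
    """Move top-level image files of dataset_dir into a subdirectory.

    Files already inside subdirectories (for example the label folder)
    are left untouched. Returns the names of the moved files.
    """
    dataset = Path(dataset_dir)
    target = dataset / image_dirname
    target.mkdir(exist_ok=True)
    moved = []
    # A non-recursive glob only matches files directly under dataset_dir.
    for path in sorted(dataset.glob(f"*{suffix}")):
        if path.is_file():
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

In this guide the call would be `move_images("/usr/qingdao/dataset")`, optionally with `image_dirname="images"`.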
+
+- Next, generate the corresponding file lists. You can create the following script in `dataset` and run it:
+
+```python
+import os
+import os.path as osp
+import random
+
+if __name__ == "__main__":
+    data_dir = "/usr/qingdao/dataset"
+    img_dir = osp.join(data_dir, "images")
+    lab_dir = osp.join(data_dir, "labels")
+    img_names = os.listdir(img_dir)
+    random.seed(888)  # fixed random seed for a reproducible split
+    random.shuffle(img_names)  # shuffle the data
+    with open("/usr/qingdao/dataset/train.txt", "w") as tf:
+        with open("/usr/qingdao/dataset/eval.txt", "w") as ef:
+            for idx, img_name in enumerate(img_names):
+                img_path = osp.join("images", img_name)
+                lab_path = osp.join("labels", img_name.replace(".tif", "_mask.tif"))
+                if idx % 10 == 0:  # split ratio: every 10th sample goes to the eval list
+                    ef.write(img_path + " " + lab_path + "\n")
+                else:
+                    tf.write(img_path + " " + lab_path + "\n")
+```
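Because the script derives label paths from image names, a mismatch in directory names or in the `_mask` suffix only shows up later as a training error. A small sanity check over the generated lists can catch this early. The sketch below is an assumption-labeled helper (`find_missing` is a hypothetical name) that expects the same format the script writes: one image path and one label path per line, separated by a space and relative to the dataset directory.

```python
import os.path as osp
from typing import List


def find_missing(data_dir: str, list_path: str) -> List[str]:
    """Return every relative path in a file list that does not exist on disk."""
    missing = []
    with open(list_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                # Skip blank or malformed lines.
                continue
            for rel_path in parts:
                if not osp.isfile(osp.join(data_dir, rel_path)):
                    missing.append(rel_path)
    return missing
```

An empty result for both `train.txt` and `eval.txt` means every listed image and label was found.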
+
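+- This yields a standard dataset structure. Taking FarSeg as an example, you can train with the [FarSeg training script](../../tutorials/train/segmentation/farseg.py). Go to `../tutorials/train/segmentation`, change the dataset paths in `farseg.py`, and comment out the dataset download and the selection of the first three bands: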
+
+```python
+# Directory where the dataset is stored
+DATA_DIR = '/usr/qingdao/dataset/'
+# Path to the training `file_list`
+TRAIN_FILE_LIST_PATH = '/usr/qingdao/dataset/train.txt'
+# Path to the validation `file_list`
+EVAL_FILE_LIST_PATH = '/usr/qingdao/dataset/eval.txt'
+# Path to the class information file of the dataset
+LABEL_LIST_PATH = '/usr/qingdao/dataset/labels.txt'
+# Experiment directory where the output model weights and results are saved
+EXP_DIR = '/usr/qingdao/output/farseg/'
+
+# Download and decompress the multispectral land-parcel classification dataset
+# pdrs.utils.download_and_decompress(
+#     'https://paddlers.bj.bcebos.com/datasets/rsseg.zip', path='./data/')
+
+# T.SelectBand([1, 2, 3]),
+```
+
+- `labels.txt` can be created manually with the following content:
+
+```
+background
+building
+```
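If the dataset is prepared by script anyway, the class list can be written in the same step. This is only a sketch with a hypothetical helper name, `write_label_list`; it writes one class name per line in the order given, so `background` stays on the first line as in the listing above.

```python
from pathlib import Path
from typing import Sequence


def write_label_list(path, class_names: Sequence[str] = ("background", "building")) -> None:
    """Write one class name per line, preserving the given order."""
    Path(path).write_text("\n".join(class_names) + "\n", encoding="utf-8")
```

For this guide the call would be `write_label_list("/usr/qingdao/dataset/labels.txt")`.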
+
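+- You can then adjust the hyperparameters further down in the script. Once done, save and exit, then start training with the following command: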
+
+```shell
+python farseg.py
+```
+
+![train](https://github.com/geoyee/img-bed/assets/71769312/03920e88-97cf-40d5-b468-29fa1c4da57d)
+
+## 3. Visualization
+
+- Open a new terminal and start the image to run the backend, mounting the trained model into the container:
+
+```shell
+docker run --name <containerName> -p 5008:5008 -p 3000:3000 -it -v <local_folder_absolute_path:container_folder_absolute_path> [--gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all] <imageID>
+```
+
+- Start MySQL:
+
+```shell
+service mysql start
+mysql -u root
+```
+
+- Create a MySQL user and grant it privileges:
+
+```shell
+CREATE USER 'paddle_rs'@'localhost' IDENTIFIED BY '123456';
+GRANT ALL PRIVILEGES ON *.* TO 'paddle_rs'@'localhost';
+FLUSH PRIVILEGES;
+quit;
+```
+
+- Go to the backend directory and edit `.flaskenv` to match your setup:
+
+```shell
+cd backend
+vim .flaskenv
+```
+
+- Set the Baidu Maps Access Key, which can be requested on the [Baidu Maps Open Platform](http://lbsyun.baidu.com/apiconsole/key?application=key):
+
+```shell
+vim ../config.yaml
+```
+
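+- Prepare the model as described in the GeoView [model preparation documentation](https://github.com/geoyee/GeoView/blob/develop/docs/dev.md), exporting it as a deployment model with the following commands: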
+
+```shell
+cd /opt/GeoView/
+mkdir -p backend/model/semantic_segmentation
+cd /opt/GeoView/PaddleRS/
+python deploy/export/export_model.py --model_dir=/usr/qingdao/output/farseg/best_model/ --save_dir=/opt/GeoView/backend/model/semantic_segmentation/farseg/
+```
+
+- Start the backend:
+
+```shell
+python app.py
+```
+
+- Open another terminal and start the frontend in the container named `<containerName>` above:
+
+```shell
+docker exec -it <containerName> bash -c "cd frontend && npm run serve"
+```
+
+- Open `http://localhost:3000/`. Since the trained model was placed under `backend/model/semantic_segmentation` as required, it now appears among the selectable models for `地物分类` (land cover classification).
+
+![geoview](https://github.com/geoyee/img-bed/assets/71769312/7228c87c-5d2a-4e4a-bd98-b76a6a791b68)
+
+- Upload an image and start processing to get the visualized result.