.. _community_vector_mosaic_delegate:

Vector Mosaic Datastore Delegate Requirements
=============================================

The Vector Mosaic Datastore Delegate is a datastore that contains references to the vector granule datastores, a bounding Polygon or MultiPolygon geometry delineating the index area, and optionally other attributes that can be queried in order to return vector granules.

The delegate datastore can be in any format that GeoServer supports, but there are two required fields:

* There must be a geometry field representing the index spatial area in either Polygon or MultiPolygon form. There are no requirements on the name of this field.
* There must be a field called ``params``, in text format, that contains one of the following:

  * The name of a store already configured in GeoServer (useful when handling a small number of granule stores, as it avoids re-creating the store at every read). The string is treated as a potential store name if it contains neither an equal sign (which would make it a candidate for the properties format) nor a colon or path separators (which would make it a candidate for a URI/URL).
  * A URI/URL pointing at granule resources such as shapefiles, GeoPackage, FlatGeobuf, etc. (for simplicity).
  * A configuration string in .properties format (see `Java Properties file <https://en.wikipedia.org/wiki/.properties>`_ for more details about the format).

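
The rules above for telling the three ``params`` forms apart can be sketched in a few lines of shell. This is only an illustration of the description, not the actual GeoServer implementation: the function name ``classify_params``, the sample values, and the order in which the checks are applied (equal sign first) are all assumptions.

```shell
# Hypothetical helper: guess which form a "params" value takes.
# Assumption: the equal-sign check runs first, so "key=value" strings
# count as properties format even if they also contain a colon or slash.
classify_params() {
  case "$1" in
    *=*)           echo "properties" ;;   # candidate for .properties format
    *:*|*/*|*\\*)  echo "uri" ;;          # colon or path separator: URI/URL
    *)             echo "store-name" ;;   # otherwise: name of a configured store
  esac
}

classify_params 'places_store'                      # prints: store-name
classify_params 'file:///data/tiger/tl_place.shp'   # prints: uri
classify_params 'url=file:///data/places.shp'       # prints: properties
```
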
In addition, the following fields are optional:

* ``type`` indicates the type name to be used when querying the granule store. This is useful when the target store can contain multiple feature types. If it is not present, it is recommended to target a store with a single feature type (e.g., Shapefile, FlatGeobuf).
* ``filter`` is an (E)CQL filter that can be used to cherry-pick the features to be read from the delegate store. This is useful when the delegate store contains a large number of features, and only a subset of them is of interest for the given set of index record attributes.

Any other field beyond the two required ones can serve as a queryable/filterable attribute, and will be used to narrow down the number of potential granule vectors that are searched by a query. The non-required parameters will be combined with the vector granule parameters to create the output feature type.

An example of a delegate in property datastore format can be found `here <https://github.com/geotools/geotools/blob/main/modules/unsupported/vector-mosaic/src/test/resources/org.geotools.vectormosaic.data/mosaic_delegate.properties>`_. The ``name`` field can be used to filter the granules, while ``params`` contains the location of the granule file and ``geom`` its footprint. The ``type`` field can be used for filtering, but also indicates the name of the feature type to be used when querying the granule store (in this case, it happens to match the name of the target shapefile).

Creating an Index with ogrtindex
================================

The `ogrtindex <https://gdal.org/programs/ogrtindex.html>`_ command-line tool from the GDAL library can be used to collect all data sets in a directory and create an index table for them. The location format it writes differs slightly from the one GeoServer expects, as it uses a ``location,tableIndex`` format, so a quick SQL statement needs to be run to make it match.

Here is an example that generates a delegate shapefile from a directory of shapefiles. The third step below uses the ``ogr2ogr`` command-line tool to trim the comma and number that ``ogrtindex`` appends to the end of the granule reference, and to turn the file location into a valid URL.

#. Switch to the directory containing the shapefiles.
#. ``ogrtindex -write_absolute_path -tileindex "params" delegate_raw.shp *.shp``
#. ``ogr2ogr delegate.shp delegate_raw.shp -dialect SQLite -sql "SELECT Geometry,'file://'||SUBSTR(params,1,LENGTH(params)-2) AS params from delegate_raw"``

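
To make the effect of the SQL expression concrete, the same string manipulation can be reproduced on a single value in plain shell. The file name below is a hypothetical example, not one taken from the data set. Like ``SUBSTR(params,1,LENGTH(params)-2)``, the sketch assumes the suffix written by ``ogrtindex`` is exactly two characters long (a comma and a single digit).

```shell
# Hypothetical value as written by ogrtindex: "<location>,<layer index>"
raw='/data/tiger/tl_2021_01_place.shp,0'

# Drop the last two characters (",0"), mirroring
# SUBSTR(params,1,LENGTH(params)-2) in the SQL statement
trimmed="${raw%??}"

# Prepend the scheme so the absolute path becomes a file URL
params="file://${trimmed}"

echo "$params"   # prints: file:///data/tiger/tl_2021_01_place.shp
```
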
The ``delegate.shp`` shapefile can then be published as a store in GeoServer (there is no need to publish the layer), and the mosaic store can then be created, referencing it.

For example, suppose one downloads the `TIGER shapefile <https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html>`_ for the ``PLACE`` theme,
which provides a shapefile with urban areas for each of the US states:

.. figure:: images/places-files.png
   :align: center

Scripts exist that help with the bulk download of the files for a given theme and year, e.g.
`get-tiger <https://github.com/fitnr/get-tiger>`_.

``ogrtindex`` and ``ogr2ogr`` can be used to generate an index shapefile, which is
then configured in GeoServer and serves as the base for the mosaic store:

.. figure:: images/places-stores.png
   :align: center

   *The store containing the delegate/index table, and the mosaic store*

.. figure:: images/places-mosaic-config.png
   :align: center

   *The mosaic store refers to the delegate store by name*

The ``connectionParameterKey`` is ``url``, because the Shapefile datastore looks for
a parameter named ``url`` holding the location of the shapefile to open. The preferred SPI is
set to the Shapefile store to speed up the lookup of the granule store (it can be omitted,
at the cost of a small performance drop).

The mosaic layer can then be published in GeoServer, rendering all the required shapefiles
in a single map:

.. figure:: images/places-mosaic.png
   :align: center

Creating FlatGeobuf Granules with ogr2ogr and ogrtindex
=======================================================

`FlatGeobuf <https://flatgeobuf.org>`_ files are an excellent option for cloud storage of granule data, due to their built-in support for R-Tree indices and the use of `HTTP Range requests <https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests>`_ to limit the amount of data streamed over the network.

``ogr2ogr`` can be used to convert a directory of shapefiles into a directory of indexed FlatGeobuf files using a simple bash script like the one below:

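.. code-block:: bash

   #!/bin/bash
   # Expand to nothing (rather than the literal pattern) when no shapefile matches
   shopt -s nullglob

   # Convert every shapefile in the source directory to an indexed FlatGeobuf
   for f in /data/tiger/*.shp
   do
     # Same base name with the .fgb extension, written to the current directory
     fgbfilename="$(basename "$f" .shp).fgb"
     ogr2ogr -f FlatGeobuf "$fgbfilename" "$f" -nlt PROMOTE_TO_MULTI -lco SPATIAL_INDEX=YES
   done
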
Here is an example that generates a delegate shapefile from the directory of FlatGeobuf files. The third step below uses the ``ogr2ogr`` command-line tool to trim the comma and number that ``ogrtindex`` appends to the end of the granule reference, and to turn the file location into a valid URL. Note the omission of the ``-write_absolute_path`` option: instead, the AWS S3 bucket URL is prepended to the generated file name.

#. Switch to the directory containing the FlatGeobuf files.
#. ``ogrtindex -tileindex "params" delegate_raw.shp *.fgb``
#. ``ogr2ogr delegate.shp delegate_raw.shp -dialect SQLite -sql "SELECT Geometry,'https://mybucketname.s3.amazonaws.com/'||SUBSTR(params,1,LENGTH(params)-2) AS params from delegate_raw"``
#. Upload the FlatGeobuf granule files to the S3 bucket referenced in the previous step (and confirm that the bucket contents are publicly accessible).

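
The indexing steps above could be wrapped in a small script so that the base URL is passed in as a parameter instead of being typed into the SQL by hand. The following is a minimal sketch under a few assumptions: the function names ``delegate_sql`` and ``build_delegate`` are hypothetical, GDAL's ``ogrtindex`` and ``ogr2ogr`` are expected to be on the ``PATH``, and the upload step is left out because it depends on the tooling used for the bucket.

```shell
# Hypothetical helper: build the SQL that trims the trailing ",N" suffix
# and prepends a base URL (which should end with a slash).
delegate_sql() {
  printf "SELECT Geometry,'%s'||SUBSTR(params,1,LENGTH(params)-2) AS params FROM delegate_raw" "$1"
}

# Hypothetical helper: run the two GDAL steps in the current directory.
# Assumes the directory contains the .fgb granules and that the
# delegate_raw.shp and delegate.shp outputs do not exist yet.
build_delegate() {
  base_url="$1"
  # Index every FlatGeobuf file, storing its relative location in "params"
  ogrtindex -tileindex params delegate_raw.shp ./*.fgb
  # Rewrite "params" into a full URL under the given base
  ogr2ogr delegate.shp delegate_raw.shp -dialect SQLite -sql "$(delegate_sql "$base_url")"
}
```

Running ``build_delegate 'https://mybucketname.s3.amazonaws.com/'`` from the directory holding the FlatGeobuf files would then be expected to produce the same ``delegate.shp`` as the manual steps.
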
At this point you can publish the ``delegate.shp`` shapefile as a store in GeoServer, as described in the previous example, or you can load it into PostGIS before publication (see `Smart Data Loader <../smart-data-loader/data-store.html>`_ for a tool for creating the PostGIS store). A PostGIS delegate is especially useful for mosaics that might change over time, due to its support for concurrent edits, high-rate loading, and transactions. Note that because the granule references in the index are HTTPS URLs, the index itself can be hosted anywhere that your GeoServer installation can access.