Pygrunn: cloud native geospatial formats for field boundaries - Ivor Bosloper

Tags: pygrunn, python

(One of my summaries of the 2025 pygrunn conference in Groningen, NL).

Cloud native geospatial file formats:

  • Geospatial data: you have raster data (= images) and vector data. And point data.

  • Raster data: geotiff, png, jpg. Vector: (shapefiles), gpkg, geoparquet. Points: gpkg, geoparquet.

Cloud native? Let’s look at geotiff for instance. It is just the old .tiff format, so a raster of pixels, plus geospatial metadata like extent, projection, etc. There is a cloud native variant, the cloud optimized geotiff:

  • You have tiles, so the big image is subdivided into tiles for easier/cheaper/faster loading.

  • There are also multiple versions of the image at various “zoom levels”.

  • The metadata is always at a fixed place in the file, right at the front or at the back.
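To make the tiling idea a bit more concrete, here is a minimal sketch of the arithmetic a reader could use to work out which tiles cover a requested pixel window. The function name and the 512 pixel tile size are my own assumptions for illustration, not something from the talk or from a specific library.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_col, tile_row) indices of every tile touching a pixel window.

    Hypothetical helper: assumes square tiles of `tile_size` pixels that
    start at the top-left corner of the image.
    """
    first_col = col_off // tile_size
    first_row = row_off // tile_size
    # The last pixel inside the window is at offset + size - 1.
    last_col = (col_off + width - 1) // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A 300x300 pixel window somewhere in a big image only needs two tiles.
wanted = tiles_for_window(col_off=1000, row_off=600, width=300, height=300)
print(wanted)  # [(1, 1), (2, 1)]
```

Only those tiles need to be downloaded, instead of the whole image.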

Such a cloud optimized format means that it is optimized for remote geospatial access patterns. The way it happens is with “HTTP range requests”. After reading the metadata for the file, the client knows which parts of the big file it needs and requests only those byte ranges from the server.
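As an illustration of what such a range request looks like, here is a small sketch using only the Python standard library. The helper names are made up for this example, and the offset and length would in practice come from the file's metadata; real tools handle all of this for you.

```python
import urllib.request


def range_header(offset, length):
    """Build the HTTP Range header for `length` bytes starting at `offset`."""
    # Byte ranges are inclusive on both ends, hence the -1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_range(url, offset, length):
    """Fetch one slice of a remote file and return the raw bytes."""
    request = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honoured the range.
        if response.status != 206:
            raise RuntimeError("server ignored the Range header")
        return response.read()


# Asking for the first 16 KiB of a file:
print(range_header(0, 16384))  # {'Range': 'bytes=0-16383'}
```

Calling `fetch_range()` with the URL of a file on a server that supports range requests would then download just that slice.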

He wanted to do the same for vector data. An approach is GeoParquet. Parquet is, simplified, a bit like a “csv format”: tabular data. For speed reasons it is subdivided in blocks. In the geospatial version, the blocks have an extent, so a reader can skip the blocks that fall outside the area it is interested in. (An extent is the min/max boundary around the data, btw).
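A rough sketch of how block extents help: given the extent of every block, you only have to read the blocks whose extent overlaps your area of interest. The extents below are invented numbers and the function names are hypothetical; a real reader would take the extents from the file's metadata.

```python
def intersects(a, b):
    """True when two (min_x, min_y, max_x, max_y) extents overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def blocks_to_read(block_extents, query):
    """Return the indices of the blocks whose extent overlaps the query extent."""
    return [
        index
        for index, extent in enumerate(block_extents)
        if intersects(extent, query)
    ]


# Made-up extents (lon/lat) for three blocks in one file.
block_extents = [
    (3.0, 50.5, 5.0, 52.0),
    (5.0, 52.0, 7.5, 53.7),
    (9.0, 47.0, 11.0, 49.0),
]
area_of_interest = (6.4, 53.1, 6.8, 53.3)

print(blocks_to_read(block_extents, area_of_interest))  # [1]
```

Only block 1 has to be fetched here; the other two can be skipped entirely.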

Before cloud native geospatial formats, you really needed to have a special server program to host them, like geoserver. Geoserver is nice, but it is also a huge java program with loads of options. (And most people forget to properly secure it…)

What you can do now is just store your cloud-native geospatial file online, for instance in s3. As long as the storage supports http range requests, you’re set. The big advantage is that there are good specifications and lots of implementations.

He’s now working on FIBOA: FIeld BOundaries for Agriculture. An open source and open data project. There are many open data portals with agricultural field boundaries. But all of them have different formats. FIBOA wants to unify all that. See https://github.com/fiboa/specification

For converting the current local data to their format, they used lots of python and (geo)pandas. They’re trying to generalize the python+geopandas+extract+export process, as it seems handy for lots of other use cases: https://github.com/vecorel/

https://reinout.vanrees.org/images/2025/pygrunn-4.jpeg

Photo explanation: picture from our Harz (DE) holiday in 2023

 
Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
