An example of the kind of project he works on: http://tracker.geops.de/ . He also builds live maps of the train’s location for the on-board displays in Swiss high speed trains.
A problem for this kind of work is that there are lots of different data sets, so creating a single unified network map is hard: you have to connect the different formats, and the data models also differ. Is a double-track railway one railway or two tracks?
They use a nodes+graphs model for the railways. Whenever possible they use open data, preferably local official geographical data; when that is not available, they use openstreetmap.
Technically, they use postgres/postgis. All the calculation also happens in the database. For importing, they used existing tools. An important post-processing task was extracting the actual railways from the individual track info in openstreetmap. They had to remove the side tracks and then average the remaining tracks to get the railway. For the average, they placed a buffer around the available tracks and merged the overlapping buffers into one shape.
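Their real pipeline does this with PostGIS buffers; as a toy illustration of the underlying idea (collapsing several parallel tracks into one railway centerline), here is a minimal vertex-wise averaging sketch. The data and function name are made up:

```python
# A toy illustration of averaging two parallel tracks into one railway
# line. The real pipeline uses PostGIS buffers; this sketch just averages
# corresponding vertices of two parallel polylines, which gives the same
# intuition: one centerline derived from multiple tracks.
def average_tracks(track_a, track_b):
    """Return the vertex-wise midpoint line of two parallel tracks."""
    assert len(track_a) == len(track_b), "toy version: equal vertex counts"
    return [((xa + xb) / 2, (ya + yb) / 2)
            for (xa, ya), (xb, yb) in zip(track_a, track_b)]

up_track = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0)]
down_track = [(0.0, 2.0), (10.0, 2.0), (20.0, 7.0)]
print(average_tracks(up_track, down_track))
# [(0.0, 1.0), (10.0, 1.0), (20.0, 6.0)]
```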
A problem is the differences you can have per country. They detect the main tracks by looking at railway numbers. But in France, for instance, these numbers are often missing...
The buffer size is also a problem. Mountain railways with a small radius that wind slowly up the mountain side can be accidentally merged together if the buffer size is too big. Similarly with overpasses near stations.
In the end, you have to close the holes left over by the buffering and merging steps.
The last task was to tie together the various data sets at the data set borders (normally the borders of countries).
So: openstreetmap data works well and it is universally available throughout the world, but there are big problems with completeness/quality in many countries. The railway company’s own data is often of better quality, but you need to do extra work to import it.
Photo explanation: for this railway-related talk, I find this image of my own model railway (‘Eifelburgenbahn’, under construction) quite appropriate.
The number of daily users has almost doubled in recent years.
Vandalism? Deleting an entire city block? You must distinguish between accidental changes and real vandalism. It is easy to try out an editor and change something in the real database.
An example: Pokémon. People thought that the Pokémon app uses openstreetmap data to locate footpaths and parks. So suddenly people started adding parks and footpaths around their house. Luckily they often added “park for pokemon” in the comments.
He invested 15-45 minutes a day for a month to try to detect vandalism, for instance by reviewing the edits of users on lists of new mappers. The best solution if you see something that’s not right: ask the mapper in a friendly way. Stay relaxed. As a second step, notify local mappers. As a last-ditch measure: revert the edit.
What he saw:
Regarding the answers he got back: out of 300 comments, he got only 70 answers. 20 of those were from the original author of the changeset. All those 20 answers were friendly! He did have to wait about 24 hours on average before getting an answer, so don’t expect an email right away.
He showed lots of nice statistics. For those, you’ll probably have to look at his blog. Start at http://neis-one.org/2017/01/reviewing-osm-contribution-1/, you’ll find links to the tools he used there. For instance http://resultmaps.neis-one.org/osm-suspicious .
He has some ideas for future tools. Push instead of pull notifications. Coordination of reviews. etc.
Osmium started out as a C++ library. There are now python (PyOsmium) and nodejs bindings.
He added a command line tool: osmium-tool. It works on osx, linux and windows. There are ubuntu/debian packages, too.
(He demonstrated it on the command line: that is not something I can summarize in text :-) It looked well-done and useful!)
Free satellite images can be very useful. But you must take care. Giving away satellite images can happen for a number of political and economic reasons. His definition of free is:
There are two basic types of satellites. The first: geostationary weather satellites. These always take pictures of the same area. A big advantage is that they can take images with high frequency, as they’re always above the same area. The USA’s satellites provide quite a lot of free image data; Japan’s fewer, and with limitations; the EU’s none.
The second type is the earth observation satellite. These fly over the north and south poles in a sun-synchronized orbit. One “round” takes about 1.5 hours. This way they cover most or all of the earth every day.
VIIRS. Data available at https://worldview.earthdata.nasa.gov
MODIS. Nasa. The first really open satellite data. With a 15 year archive. This is the most often used one. (Also available at above url).
Landsat. US Geological Survey. The most well-known one. This one has the greatest amount of data. It was not always free data. And the polar areas and the seas are missing.
They tried for 20 years to make it into a commercial venture, which didn’t work out. So it came back into governmental hands. Since 2008 all data is open.
Almost live images: https://www.mapbox.com/bites/00145
EU satellites? Traditionally there is no open data. But the landsat open data success prompted a re-thinking. So now there is the COPERNICUS program.
Sentinel tries to copy the landsat success. It now produces the most complete data set. But: it focuses on Europe, Africa and Greenland. There is also no data from before October 2016.
Data is available at https://scihub.copernicus.eu/
See also my summary of the previous talk about free satellite images!
Christian works for the German aerospace center. The “copernicus” program consists of three parts:
The space component. The “sentinel” satellites. Various types/goals. High resolution images, medium resolution images, altimetry (height measurements), atmospheric chemistry, etc. Not all of them have been launched yet.
In-situ component. Local on-the-ground measurements for validation of the satellite images.
Services component. Data access, calculation. This includes making products available that are directly usable, without you needing the knowledge to create them from the raw data.
There is also an “emergency management service”. In case of a big accident or natural disaster, data is made available immediately.
Land monitoring service: specific maps like land surface temperature, vegetation index, soil water index, etc. There is also extra (detailed) data for Europe.
A starting point for searching data is http://copernicus.work-with.us/, the official info is at https://scihub.copernicus.eu/dhus/ . But: this central portal is often overloaded. The idea is that the individual countries take care of making local hubs. Partially, the data is also available via google’s earth engine.
In Germany, https://code-de.org/ (“COpernicus Data and Exploitation platform”) is now online. There is also a listing of open source tools you can use.
He works for the Bundesamt fuer Strahlenschutz, basically the government agency that was started after Chernobyl to protect against and to measure radioactivity. The software system they use/build is called IMIS.
IMIS consists of three parts:
They have a simple map at odlinfo.bfs.de.
The current core of the system is proprietary. They are dependent on one single firm. The system is heavily customized for their usage.
They need a new system because geographical analysis keeps getting more important and because there are new requirements coming out of the government. The current program cannot handle that.
What they want is a new system that is as simple as possible; that uses standards for geographical exchange; they don’t want to be dependent on a single firm anymore. So:
They use open source companies to help them, including training their employees and getting those employees used to modern software development (jenkins, docker, etc.).
If you use an open source strategy, what do you need to do to make it fair?
(Personal note: I didn’t expect to hear ‘buildout’ at this open source GIS conference. I’ve helped quite a bit with that particular piece of python software :-) )
They combine openstreetmap data with ALKIS data. ALKIS is the German cadastre database. Both are imported into postgres/postgis.
In postgres, it is handy to use separate namespaces (postgres calls them “schemas”, but that is a bit of a confusing name).
When doing a project for 38 “Kreise” (districts) in the Ruhr area, they imported each of the 38 datasets into a separate namespace/schema to keep them apart.
In the end, it is handiest to end up with one table per map layer. You can select data from multiple sets by doing an sql UNION. You’ll probably need a lot of sql tricks to get it all working.
Instead of working with these complex queries, it is better to create a view with the complex query so that your map server has just a simple table/view to talk to.
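The schema-per-dataset plus UNION-view pattern can be sketched with stdlib sqlite3 instead of postgres (separate tables stand in for postgres schemas here; all table and column names are made up for the example):

```python
# A sketch of the "one view per map layer" idea: each district's data
# lives in its own table (standing in for a postgres schema), and a view
# UNIONs them together so the map server only has to talk to one simple
# relation. Names are made up for this example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kreis_a_roads (name TEXT, length_m REAL);
    CREATE TABLE kreis_b_roads (name TEXT, length_m REAL);
    INSERT INTO kreis_a_roads VALUES ('A1', 1200.0);
    INSERT INTO kreis_b_roads VALUES ('B7', 800.0);

    -- The map server only needs to know about this one view:
    CREATE VIEW roads AS
        SELECT name, length_m, 'kreis_a' AS source FROM kreis_a_roads
        UNION ALL
        SELECT name, length_m, 'kreis_b' AS source FROM kreis_b_roads;
""")
print(sorted(conn.execute("SELECT name, source FROM roads")))
# [('A1', 'kreis_a'), ('B7', 'kreis_b')]
```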
Optimization: first measure what you have to optimize. log_min_duration_statement is a good postgres setting: long-running queries show up in your logfile. EXPLAIN ANALYZE tells you what postgres does behind the scenes.
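Concretely, that measurement step could look like this (the setting and EXPLAIN ANALYZE are standard postgres; the 500ms threshold and the table name are made up for the example):

```sql
-- Log every statement slower than 500 ms (example threshold):
ALTER SYSTEM SET log_min_duration_statement = '500ms';

-- Then, for a query that shows up in the log, ask postgres for its
-- actual plan and timings (table name made up):
EXPLAIN ANALYZE
SELECT * FROM roads
 WHERE geom && ST_MakeEnvelope(6.0, 50.0, 7.0, 51.0, 4326);
```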
Watch out with the spatial queries that mapnik and mapserver wrap around your own query: they add a bounding box query on the geometry. But if the geometry is a calculated value (a centroid, for instance), the bbox query cannot use an index. The solution is to add !BBOX! in your query: this tells mapnik to do the bbox query there, inside your original query, which allows postgres to use the index anyway.
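A sketch of what that looks like as a mapnik datasource subquery (table and column names are made up; mapnik spells the token lowercase, !bbox!):

```sql
-- Mapnik replaces !bbox! with the current tile's bounding box. Because
-- the token sits inside the subquery, the filter runs against the
-- indexed geometry column *before* the centroid is calculated:
(SELECT ST_Centroid(geom) AS geom, name
   FROM buildings
  WHERE geom && !bbox!) AS centroids
```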
What also helps: simplifying your data. Leaving out unneeded information with database filters or geometrical simplifications (straightening lines a bit when that level of detail is not visible).
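The "straightening lines a bit" idea is classic line simplification (what PostGIS’s ST_Simplify does, via the Ramer-Douglas-Peucker algorithm). A minimal pure-Python sketch, with made-up sample data:

```python
# A toy Ramer-Douglas-Peucker implementation: points that deviate less
# than a tolerance from the overall line are dropped, straightening the
# line where the detail would not be visible anyway.
import math

def _point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    if a == b:
        return math.dist(p, a)
    (x0, y0), (x1, y1), (x2, y2) = p, a, b
    num = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1))
    return num / math.dist(a, b)

def simplify(points, tolerance):
    """Recursively drop points closer than `tolerance` to the baseline."""
    if len(points) < 3:
        return points
    # Find the point furthest from the line between the two endpoints.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_line_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Everything is close enough to the straight line: keep only ends.
        return [points[0], points[-1]]
    # Otherwise split at the furthest point and simplify both halves.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(simplify(line, 1.0))
# [(0, 0), (2, -0.1), (3, 5), (7, 9)]
```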
His sheets (in German) with more details are here: https://talks.omniscale.de/2017/fossgis/postgis/
For rendering openstreetmap tiles, there are basically two approaches: bitmaps or vector data.
Bitmaps are the classic. Normally you start with something like OSM PBF or geotiff. Then you render it with, for instance, mapnik. Mapnik is configured with xml, which you don’t really want to write by hand. So there are now CartoCSS and MapCSS: much friendlier formats. Those can then be converted to mapnik’s xml with tools like tilemill, kosmtik, magnacarto or komap.
So... with data sources and styles, a renderer (mapnik) can create the bitmap tiles. Then you need to host it. TileStache+nginx, apache+mod_tile, etc.
A problem: if you render the same source with multiple styles, you have to do the same work multiple times and that takes lots of calculation time. A solution is “fat vector tiles”: you use mapnik to create a vector tile with lots of feature information out of postgis. Afterwards, you can use mapnik again to combine the fat vector tile with styling to (much more cheaply) render the actual bitmap tile.
You can go further. Vector tiles. The rendering of the pixels is moved to the client (the browser). You could generate fat vector tiles beforehand and store them in mbtiles. The server then combines those fat vector tiles with styling to send vector tiles to the browser.
With engineering firms from the Aachen region they created qkan. Qkan is:
It has been designed for the needs of the engineers that have to work with the data. You first import the data from the local sewer database. Qkan converts the data to what it needs. Then you can do simulations in a separate package. The results of the simulation will be visualized by Qkan in qgis. Afterwards you probably have to make some corrections to the data and give corrections back to the original database. Often you have to go look at the actual sewers to make sure the database is correct. Output is often a map with the sewer system.
Some functionality: importing sewer data (in various formats). Simulating water levels. Drawing graphs of the water levels in a sewer. Support for database-level checks (“an end node cannot occur halfway along a sewer”).
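A toy sketch of what such a database-level consistency check amounts to (the data layout and names are made up; qkan does this at the database level, not in application code like this):

```python
# A toy version of a check like "an end node cannot occur halfway a
# sewer": a node marked as an end node must appear in exactly one pipe
# segment. Pipes are (from_node, to_node) pairs; names are made up.
def check_end_nodes(pipes, end_nodes):
    """Return end nodes that wrongly occur in more than one pipe."""
    usage = {}
    for from_node, to_node in pipes:
        for node in (from_node, to_node):
            usage[node] = usage.get(node, 0) + 1
    return sorted(node for node in end_nodes if usage.get(node, 0) > 1)

pipes = [("n1", "n2"), ("n2", "n3"), ("n3", "n4")]
print(check_end_nodes(pipes, end_nodes={"n1", "n3"}))
# ['n3']  -- n3 is marked as an end node but sits between two pipes
```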
They took care to make the database schema simple. The source sewer database is always very complex because it has to hold lots of metadata. The engineer that has to work with it needs a much simpler schema in order to be productive. Qkan does this.
They used qgis, spatialite, postgis, python and qt (for forms). An important note: they used as much postgis functionality as possible instead of the geographical functions from qgis; the reason is that postgis (and even spatialite) is often much quicker.
With qgis, python and the “qt designer”, you can make lots of handy forms. But you can always go back to the database that’s underneath it.
The code is at https://github.com/hoettges
GIS workplace? He means how you work day-to-day with your computer. A “fat” client with local files? Perhaps reaching out to web services? A problem is that this is quite expensive to support and maintain. It also doesn’t scale well.
A GIS workplace is quite complex when compared to many other use cases. Lots of different data, lots of tools, lots of data sources, lots of necessary connections.
You could look at a full-featured “web gis”, so more-or-less a desktop GIS application on a server. It is still expensive to build. Normally, it is less feature-rich. And scaling is still a problem, as you need to build a web server cluster or something similar.
What looks promising: look at the cloud.
A closer look at AWS appstream 2.0, available since late 2016. You only need a modern html5 browser. It starts up quickly. You pay per use. And you can configure it just as you wish, including data that you pre-configure for your users.
You can make it even easier when you use devops tools like Ansible for continuous delivery.
My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.