2023-09-21
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
Artificial intelligence is all the rage nowadays. Mostly due to chatgpt being introduced a year ago.
According to Eelko, we are now living in the “alchemy era” of AI: we know it works, but we don’t know why/how it works. And we don’t know what it is capable of. “It is science, it is alchemy. The science behind AI is yet to come.”
What we previously thought AI would do: robotic house chores, transportation, creativity. “Transformers” are the newest revolution. LLMs are a subset of transformers. Transformers are now used for languages, images, video, 3d modelling, etc.
On the one hand you’re currently apparently absolutely professionally required to use AI while coding. Chatgpt, github copilot, codewhisperer, etc. They aid developers in terrific ways. You program 10x as quickly. (Personal note: a 10x improvement was mentioned in Fred Brooks’ no silver bullet…)
On the other hand, LLMs can’t necessarily be trusted. Hallucinations. They lack understanding of multi-layered concepts. And sometimes they generate code based on generated code… So don’t trust it per se. You can use it for simple questions.
LLMs are based on text. They are aimed at telling a story. If you ask it to solve an equation, it will give a well-worded wrong answer.
Eelko encourages everyone to use tools like github copilot: they’re great. It is good at explaining code in regular English or in flow diagrams. But always triple check the code if you generate something.
Now on to kubernetes. Chatgpt understands the main kubernetes concepts. Something to look at:
k8sgpt: scanning and diagnosing your cluster’s errors.
D2iQ: AI chatbot.
OpenLLM’s containerised LLM support.
They did an experiment with an AI bot that tried to fix a faulty cluster via chatgpt. No success, as the regular method of fixing a problem was to delete the resource. Or even the entire deployment: something that isn’t there doesn’t produce errors, so “success”!
Now, what does AI have in store for us? We don’t really know. Sometimes, with a size increase, an LLM suddenly can do things it couldn’t do before. They sometimes really surprise us. He thinks new LLM models will be smaller, faster and cheaper. There’s no real threat, AI at the moment is just an aid. Complex systems will stay out of reach for now.
He thinks education will change a lot. Critical thinking and continuous learning will be a major skill. As well as mental and emotional health. Continuous learning, as what you’ve learned will be out of date once you’ve learned it.
Personal notes
On that last point, I don’t think it will be like that. The laws of physics won’t change from under your feet. The structure of the human body will be the same after you’ve finished your education. Napoleon will still have been emperor.
If I can trust my daughter who studies AI, we know exactly how chatgpt works theoretically. Scientifically it is “just” a “large language model” which is a known technology within AI. Chatgpt is well-marketed, though. Oh, and it is probably producing prodigious amounts of CO2. And it is not open and it is based on years of inherently biased content.
Yeah right, I’m probably an old fossil, destined for the dustbin of history…
Anyway, that’s better than thinking we humans will be overtaken by AIs in six years time. “In a few years AIs will be superior and we will be like ants to them. They won’t exterminate us, though, we’re useful to them.” Yeah right… I hope that last comment (in response to a question) was intended as a joke. It probably was and I’m probably getting a bit too grumpy.
There’s an AI conference in Groningen this november: https://aigrunn.org/ . “The AI tech event for software professionals”. I normally go to the pygrunn python conference by the same organisers: always good and well-organised and relaxed. I’m going to this one, too. I have to learn more about AI :-)
2023-09-21
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
Pipelines in Jenkins. Later pipelines in a local gitlab installation. Then pipelines in github. Then the security officer found out about pipelines running on github, probably, in the USA. So they had to move back. Rewriting, rewriting, rewriting.
He thought “keep it simple: let’s do it ourselves!” What did they really need?
Git as input
Pipelines
Some output
For the pipelines, they chose tekton. Argocd was also possible, btw.
Kubernetes is very good at orchestration. You can get enterprise solutions, but we’re all moving to as-basic-as-possible to prevent lock-in. But with git we go in the opposite direction: nobody uses plain git, everybody uses github or gitlab and clicks all the enterprise functions. Lock-in! Why do we do that?!?
When you have a kubernetes-based pipeline that you made yourself, you can run it locally. Way nicer than doing commit after commit to test out your pipeline on github/gitlab.
At this point I figured out he was mostly talking about deploy pipelines. I initially thought it was also about software tests, as that’s what I was running in jenkins and now on github… Once he said to not store the pipeline config in the same repository as the code I caught on. Sorry, that’s what you get when a developer goes to a kubernetes conference.
powerflex is a USA company with solar power installations, electricity storage solutions, EV chargers, etc. Equipment that’s often in remote locations with horrible network connectivity.
Their initial setup was with Rancher k3os, a minimal linux for easily installing K3S. But they ran into problems like lack of custom driver support and no remote reboots. In the end community and company support for k3os was also lacking. Development has stopped in the meantime.
So… they moved to Talos linux (from sidero) as the os/kubernetes combination.
Small and fast.
Hardened for remote locations. Very secure. Immutable. You cannot even write to disk. The OS effectively runs from a RAM disk. No shell/ssh access. No regular GNU utils. No nothing. Entire classes of attacks are impossible.
Simple to manage. Configuration is done through one single yaml file.
Same image everywhere. It is independent of the hardware, so you can really have multi-cloud that’s similar. Edge locations combined with the regular cloud? All the same.
Installation is simple for the most part. To help with the last 20%, they made Sidero omni:
Single management plane.
Fewer deep IT skills required. A fresh node can register itself and Omni can take care of the rest.
Enterprise grade authentication. Any regular oidc provider will do, like github, google, etc. With omni, all interaction is protected this way.
Highly available out of the box. Omni itself is of course HA. Edge locations should be treated as “cattle, not pets”, that’s what they really subscribe to. Just fire up a new machine or ship a new box.
Firewall friendly. You do need a connection to omni, but that’s it. Local image storage is fine, for instance. They have installations in hospitals without any egress (apart from the omni connection).
They’re working on version 2:
More flexible deployment options.
Customizable builds.
More hardware support.
Reduced hardware load.
Back to the “powerflex” example. They migrated 450+ clusters and are moving on to 1000. Field technicians can provision boxes in a self-service manner. They’re working on having a warehouse with pre-imaged boxes, ready for shipping out and installing at a moment’s notice.
He showed a demo of omni/talos. In Omni, there’s the option to download a slightly customized version of the regular talos image: the only difference is that the container automatically safely “phones home” to your omni account and registers itself.
In response to a question about upgrading/deprecation: they support the current kubernetes release and the three previous versions. That way, most people can stay up-to-date with talos without immediately needing to upgrade kubernetes. Talos itself aims to be as minimal as possible: you should not have to care about the OS. Kubernetes is where your worries should be: the OS should be as invisible and worry-free as possible.
A clarification by myself: I originally understood talos to be like k3os, an easy small linux OS to run k3s. But talos is more an integration of the two. Talos is the minimal OS plus the main kubernetes components.
2023-09-20
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
A nice talk by Erwin de Keijzer. Full title: do you need to take general relativity into consideration when measuring your elevator?.
Erwin has an elevator inside his, otherwise quite normal, house. An old lady used to live there and she needed an elevator to get around the house. The elevator is real slow, but handy for moving heavy stuff around.
He wanted to do some automation around it… Some rules/goals:
Know where the elevator is
Add music when the elevator is moving
Play announcements when the elevator stops at a floor
Don’t break the elevator
Don’t mess with the internal electronics.
His first attempt was to use a cheap ultrasonic sensor. The lift shaft was too high, though. And it sometimes picked up the support beams instead of the lift. He now uses the sensor for his standing desk.
Attempt 2: “TF Luna LiDAR”. A raspberry pi is connected to it and uses NATS and some Go code to write the measurements to prometheus.
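(Personal note: roughly how I picture such a setup, as a python sketch instead of his Go + NATS code. The TF Luna frame layout, the serial device name and the metric name are my own assumptions.)
import serial  # pyserial
from prometheus_client import Gauge, start_http_server

DISTANCE = Gauge("elevator_distance_cm", "Distance from the LiDAR to the elevator cabin")

def read_distance(port):
    """Read one 9-byte TF Luna frame and return the distance in cm (or None)."""
    if port.read(1) != b"\x59" or port.read(1) != b"\x59":
        return None  # not at a frame boundary yet
    rest = port.read(7)
    if len(rest) != 7:
        return None
    return rest[0] + (rest[1] << 8)  # distance bytes, little-endian

def main():
    start_http_server(8000)  # /metrics endpoint for prometheus to scrape
    with serial.Serial("/dev/serial0", 115200, timeout=1) as port:
        while True:
            distance = read_distance(port)
            if distance is not None:
                DISTANCE.set(distance)

if __name__ == "__main__":
    main()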
With the position handled, he moved to alerting. Stuck between floors (you have to keep pushing the button to move), moving too fast, moving too slowly, elevator out of bounds.
When investigating the data, he noticed that the location of the lift at rest shifts a little bit throughout the day. He added temperature measurement: perhaps it had to do with the temperature? Not really. So he started brainstorming. Perhaps the way the earth moves? General relativity? It could be anything.
So he imported all the data into a jupyter notebook and started scatterplotting and histogramming… In the end it was a small inaccuracy in the elevator itself when stopping at a floor. Though, if you squint at the graphs in the right way… a little temperature dependency seems to be present, but just a little.
He used kubernetes. The raspberry pi runs k3s and is tainted to just run the pi stuff. Another server in his home runs the rest.
He showed a video of his elevator in action :-) Music when the elevator is moving. Note: the volume is decreased the higher you are in the elevator shaft as the speaker is at the top of the shaft…
Funny project!
2023-09-20
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
He is involved in the FinOps foundation. FinOps is about cooperation between the OPS side and the business side of companies. Cloud is critical to every business, but it is very different from regular IT procurement: it is decentralized, for instance. Small purchases can be made in all parts of the company. And the costs are variable instead of the old fixed costs of a regular data center.
They have a finops “framework”, based around personas:
Finops practitioner
Executive
Business/product owner
Finance/procurement
Engineering/operations
And around different phases:
Inform
Optimize
Operate
(I couldn’t really follow the style of presentation very well… so the summary is lacking a bit. I have a few hopefully useful snippets below).
The core seems to be that it is hard to attribute costs. An idea he mentioned is to label/tag everything in your kubernetes cluster. That gives you a start in attributing costs.
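(Personal note: a sketch of what you can already do yourself once everything is labeled, using the official kubernetes python client. The “team” label key is my own made-up convention; real tooling like OpenCost does far more than summing CPU requests.)
from collections import defaultdict
from kubernetes import client, config

def cpu_to_millicores(value):
    # "250m" -> 250, "0.5" -> 500, "2" -> 2000
    return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)

def main():
    config.load_kube_config()
    requested = defaultdict(int)
    for pod in client.CoreV1Api().list_pod_for_all_namespaces().items:
        team = (pod.metadata.labels or {}).get("team", "unlabelled")
        for container in pod.spec.containers:
            resources = container.resources
            cpu = (resources.requests or {}).get("cpu") if resources else None
            if cpu:
                requested[team] += cpu_to_millicores(cpu)
    for team, millicores in sorted(requested.items()):
        print(f"{team}: {millicores}m CPU requested")

if __name__ == "__main__":
    main()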
The cncf lists projects for FinOps, some of which are open source. An example is OpenCost. There’s also “FinOps open cost and usage specification” (FOCUS).
Somewhere during his presentation he mentioned DOK, data on kubernetes, when he mentioned running postgres inside a container. That’s something I’m going to check out.
FinOps is useful, but there’s more: GreenOps. The same reporting tools can be used for looking at your “cloud carbon footprint”. Important, as we only have one production environment: the earth.
Developer efficiency is really important. Portworx is a company that wants to help there. “Why can’t developers have self-service”? Running a database in your cluster? Fine! We want it.
Database as a service, storage automation, backup and disaster recovery. “Enterprise storage platform”.
You can use portworx pre-packaged in many commercial clouds, but also in your own clusters.
Watch out with experienced people. Especially those with 20 years IT experience and 0 years kubernetes experience. Don’t give them admin access to something they don’t understand. He gave an example.
kubectl diff ignores extra fields in your environment. Argocd relies on kubectl diff. So argocd thinks everything is up to date. But there might be extra manual fields in your production config, like a manual resource limit or an extra label…
Kubernetes does what you tell it to. For instance, pruning your cluster removes everything that’s not in your config. Nice! But when you accidentally remove part of your config from your repository, just as quickly kubernetes cleans up your cluster… Watch out.
Restarting the internal image repository (person A) at the same time as updating the nginx containers from that same internal image repository (person B). Only… the image repository itself was behind that nginx, which needed an image from that same repository… Circular dependency.
So:
Don’t trust experience.
Don’t put dependencies of your cluster inside your cluster.
How to bring cloud native concepts to the edge. For that Suse focuses on six things:
Onboarding. Getting remote machines to join up in your cluster.
Cluster management. Upgrading.
Observability.
Security.
Workload management.
OS management. Keeping this up to date.
There are difficulties. Industrial IoT can be legacy environments. Lengthy lifecycles. Old hardware. Lack of standardisation. Something running an old Windows XP is pretty common to find. Traditionally often slow to adopt change.
Two main targets:
Getting the market to adopt cloud native techniques.
Interfacing with both legacy and modern IoT devices.
Having to use containers: yes, that’s clear. No need to discuss that. Some standardisation on how to run/manage it: yes please. What could really help: a pluggable mechanism for discovery of IoT devices in k8s. Plus integration and automation for apps to use those discovered devices.
He introduced Akri. A “resource interface” for connecting existing devices to kubernetes. USB stuff on a windows host, for instance.
Akri runs on a node and handles discovery. Upon discovery, a k8s service is started to expose the data. Applications can then consume the data via said service.
He showed a quick live demo. Which actually worked, even after accidentally dumping the hdmi connector in his glass of water :-)
2023-09-20
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
He wanted to automate his house. But he didn’t want to spend a lot of money. So: lots of second-hand computers, temperature monitors, pumps. The most expensive part was 1000 Euro (new price 3000, so that’s OK). Two cheap broken second-hand lawn mower robots could be combined into one.
Let’s start simple! He had an old analog electricity meter, so he tried to hook up a sensor that looked at the analog meter, but that didn’t really work. In 2020 he got a modern smart meter, so he could get cracking. Such a meter has a “P1” port for reading the measurement. He first ran it on a raspberry pi, but in the meantime he’s got it running in K3S.
A problem was that the container needed full host access to read out the serial port. The solution: a separate “serial2network” proxy that makes the serial data available over the network. Handy!
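(Personal note: the “serial2network” trick as I understood it, in a python sketch. The host, port and the OBIS code for current power draw are my assumptions.)
import re
import socket

P1_PROXY = ("ser2net.local", 2001)  # hypothetical address of the serial proxy
POWER_RE = re.compile(r"1-0:1\.7\.0\((?P<kw>[\d.]+)\*kW\)")  # current power draw line

def follow_power():
    with socket.create_connection(P1_PROXY) as conn, conn.makefile("r", errors="replace") as lines:
        for line in lines:
            match = POWER_RE.search(line)
            if match:
                yield float(match.group("kw"))

if __name__ == "__main__":
    for kilowatts in follow_power():
        print(f"current usage: {kilowatts:.3f} kW")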
Next up: garden lighting. First a simple sensor + remotely operated switch. Battery life was an issue, as was water resistance and sensor degradation… His wife didn’t like it as it wasn’t very consistent. So they went back to a regular mechanical on/off switch…
But, he tried it again. Raspberry pi that grabs sunset times from the internet and switches on the lights 15 minutes before sunset. Nice. A further improvement was to put it into kubernetes. A cronjob for the sunset functionality. And the option to have an interface for manual adjustments.
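(Personal note: a minimal python sketch of how such a sunset job could work. The sunrise-sunset.org API and the switch_lights() helper are my assumptions, not necessarily what he uses.)
from datetime import datetime, timedelta, timezone
import requests

LAT, LON = 52.09, 5.12  # roughly Utrecht; adjust for your own garden

def sunset_utc():
    response = requests.get(
        "https://api.sunrise-sunset.org/json",
        params={"lat": LAT, "lng": LON, "formatted": 0},
        timeout=10,
    )
    response.raise_for_status()
    return datetime.fromisoformat(response.json()["results"]["sunset"])

def switch_lights(on):
    print("lights on" if on else "lights off")  # placeholder: call your smart switch here

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    switch_lights(now >= sunset_utc() - timedelta(minutes=15))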
Next up: monitoring the robot lawn mower. At the start, the mower worked just fine. But after they got a dog, it started to run into problems (a hole being dug in the garden, dog toys in front of the mower, etc). So he had to start monitoring.
Solution: a sensor in the robot mower’s “house” to see if he’s out in the garden or not. Mowing happens in specific time windows, so when he’s not home in time, a warning gets sent to his personal slack channel. And of course the data is stored in some metrics software.
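(Personal note: the warning part could be as simple as this python sketch. The webhook URL, the deadline and the mower_is_home() check are placeholders.)
import datetime
import requests

SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # hypothetical webhook

def mower_is_home():
    return False  # placeholder: read the sensor in the mower's "house" here

def warn(message):
    requests.post(SLACK_WEBHOOK, json={"text": message}, timeout=10)

if __name__ == "__main__":
    deadline = datetime.time(hour=17)  # assumed end of the mowing window
    if datetime.datetime.now().time() > deadline and not mower_is_home():
        warn("Lawn mower is not back in its house yet, go check the garden!")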
Next up: warming the water for the swimming pool. A pump pumps water into solar heaters on the roof and then into the pool. Initially, a regular timer made sure the pump was only on when it was sunny.
Let’s upgrade it. So: temperature sensor in the pool. Temperature sensor on the roof. Pump motor controller. And a raspberry pi to steer it. Data storage in some SQL database and a web interface. Later he changed the relay for the pump motor to a frequency steering system so that he could run the motor at half capacity: much cheaper.
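(Personal note: the control logic as I understood it, as a python sketch with made-up sensor and pump helpers.)
import time

MINIMUM_GAIN = 2.0  # only pump when the roof is at least this many degrees warmer

def read_temperature(sensor):
    return {"pool": 22.0, "roof": 31.5}[sensor]  # placeholder: read the real sensors here

def set_pump_speed(fraction):
    print(f"pump at {fraction:.0%}")  # placeholder: drive the frequency controller here

def control_loop():
    while True:
        if read_temperature("roof") - read_temperature("pool") > MINIMUM_GAIN:
            set_pump_speed(0.5)  # half speed: much cheaper than full speed
        else:
            set_pump_speed(0.0)
        time.sleep(60)

if __name__ == "__main__":
    control_loop()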
All this was moved to kubernetes, too. Influxdb instead of the previous sql database. Grafana dashboards instead of the home-made web interface. Most of the software is written in python.
He set it all up as microservices. One dockerfile + some code for the lawn mower, one for the pool, etc. For his taste it was too much effort for his simple needs. Now it is just one dockerfile with all the code and separate config files to configure it. He does continuous deployment via argocd :-)
At the end he showed Falco, an open source system for detecting threats and anomalies in containers, kubernetes and cloud services. Is someone running a console inside a container where that isn’t normally happening? It sends a signal.
Not something he really needed for his house automation software, probably, but he installed it as he works for the company that builds Falco. Network connections that aren’t supposed to happen, for instance.
Vandebron is a sustainable energy company. In 2014, only 5% of the energy produced in the Netherlands was green. Their mission is to help get it to 100%. At the moment it is 40%. It is moving forward!
There are quite some challenges:
There is a move from central to decentral: instead of a few big power plants, you have lots of windmills. And solar panels on individual roofs.
Dumb to smart. The grid needs to be more flexible.
Fossils to green means electrifying, which means more grid usage.
On demand to storage. A gas power plant can easily handle extra demand, but solar cannot. So you need more storage. Storage is really a problem at the moment.
Balancing the energy grid using IoT is something they focus on. Sometimes this means that they have to shut off wind turbines to prevent overcapacity and grid instability, for instance.
Something like “the energy grid” is “pretty important”. Critical infrastructure. So security is real important.
The starting point was an overpowered i7 intel NUC. Not industrial grade, though. Second ethernet port via usb. Ubuntu+k3s. No real update/upgrade policy. They wanted something better. Kubernetes was required, though.
They started looking at Talos linux “the kubernetes operating system”. And at “wireguard” tunnels towards the devices to prevent man-in-the-middle attacks. With a wind turbine out at sea, there’s quite some in-the-middle!
Wind turbines at sea: you need hardware that is real sturdy. For that, they worked with onlogic, a maker of such hardware. Industrial computers, rugged computers, panel PCs and edge servers. 75% of what they do is the industrial/rugged stuff.
“Industrial” means environments where humans normally work. Factories, 0-30 degrees temperature, not a lot of shocks. For wind turbines you need the rugged stuff: salt water, vibrations, storms…
What they came up with is a much more technically robust and stable version of the raspberry pi. Created in cooperation with raspberry. Works from -20 to +65 degrees celsius. Energy efficient. Shielded to withstand signal interference and to prevent interference to the outside world (important for sensitive measuring equipment and medical devices). Certified for many markets and industries.
2023-09-20
(One of my summaries of the 2023 Dutch edgecase k8s conference in Utrecht, NL).
He’s one of the founders of Fullstaq (the conference organisers). They started with aws/azure/etc. Working for everything from
Around 2019, hybrid cloud became popular. Part in the regular cloud, part in your own datacenter. For that you need it to be reasonably similar on both sides. Now, edge computing is on the rise. Internet of things, CDN edge locations, all sorts of things.
A big help for them was K3S. But the core that makes it all possible: gitops (argocd, flux, etc).
2023: kubernetes was released in 2014. So nine years. What’s coming up in the near future? We’re maturing and lots of cool things are happening. But… for almost all companies, that maturity isn’t actually true. Some cloud-only companies that recently started might use kubernetes to the full, but most are slowly transitioning. Common problems are complexity and skill set.
Complexity: kubernetes is only part of the solution. 10%. You also have observability, security, advanced networking, ci/cd, data/storage, multi-cluster/multi-cloud… Only after you have all those layers are you really production-ready, and only then can you start migrating your software.
“Complexity” means “operational overhead”, which offsets most of the benefits of kubernetes for many companies. He showed a picture of a regular kubernetes observability stack: 10 components or so. Compare that to the regular monitoring in regular companies: just nagios (or zabbix in our case)… That’s quite a step!
Going from real servers to virtual machines was pretty OK to do. Not a big step. Moving from virtual machines to container deployments (“docker compose on a VM”) is also a small step. But jumping to a kubernetes deployment is a huge step. Lots of things are different. They need help.
Help? A solution is baby steps. How much of the full “production ready” stack do you really need to show value? You don’t need a service mesh from the start… Neither complex networking.
You could use a SaaS instead of doing observability yourself. You can use ci/cd via github. Storage can be done in the public cloud (or ask your local sysadmin). K8s management can also be done in the public cloud.
Less complexity means less time and less money.
Skill gap. Big problem. What can help are workshops. If you want kubernetes to “land” in your company, you need to get more people involved and knowledgeable.
Chick-fil-A is a chicken sandwich restaurant chain, something I didn’t know. They don’t have locations in the Netherlands, that’s why. They’ve run kubernetes for quite a while at 3000 locations.
Unrelated personal comment… He showed some examples of the restaurant chain, mostly in the USA/Canada. Lots of drive-thru stuff. Two lanes. Experiments with four lanes. Restaurants in the middle of a parking lot. What I saw was an awful car-centric environment. I’m used to city centers where you can walk. Our office is in the center of Utrecht: walking and cycling. There it was a lost little restaurant surrounded by cars in a sea of asphalt. And still he mentioned “we want a personal connection with the customer”. Ouch. Horrible.
Anyway, back to kubernetes :-) What are some of the kinds of data? IoT like kitchen equipment, temperature sensors. The POS terminals. Payments. Monitoring data (he mentioned Lidar, which is point cloud radar, so I’m wondering what they’re monitoring with that: probably length of car queues or so).
Lots of forecasting is happening based on that data. Nice. Car queue length is fed back into instructions for the kitchen. Or whether someone extra needs to take orders outside.
They looked at lots of kubernetes edge solutions. AWS greengrass+outposts, Nomad, etc. They all looked pretty heavy for their use case. The solution was K3S. A couple of intel NUC machines suffice. A standard partition scheme on top of it, plus ubuntu 18.04, plus K3S and then the applications on top of it. “Partition scheme” in this case means that the NUCs are always wiped and freshly installed. (He also mentioned “overlayFS”, which apparently helped them with updates, but I didn’t get that fully).
The apps on the edge K3S are things like local authentication, message broker, postgres+mongo, observability with prometheus/vector, vessel for gitops.
K8s on the edge: they also run it in the cloud, mostly to manage the 3000 edge locations. The edge locations don’t know about each other, they only have to talk with the central system.
Deployment. Their organisation is split into separate teams. An app team using python and datadog. Another app team with java and amazon cloud watch. An infra team for gitops (with “fleet” as the orchestrator and “vessel” on the edge nodes).
Every edge location has its own git repository (gitlab). Rolling out changes incrementally over locations is easier that way. It is also possible to roll out changes everywhere at the same time. Having one repo might have been theoretically nicer, but their approach is more practical for their reality.
Persistence strategy. Their approach is to offer best-effort only. No guarantees at the edge. Most of the data used at the edge is what is needed right at that moment, so a few minutes of data that is lost isn’t really bad. The setup is pretty resilient though. So you can use mongo and postgres in your app just fine. App developers are encouraged to sync their app state to the cloud at regular intervals.
Monitoring: at the edge, they use Vector as a place to send the logs and metrics to. Vector then sends on the errors to the cloud: the rest is filtered out and stays local. This is also handy for some IoT things like the fridge, which you don’t want to send to cloud directly for security reasons.
In the cloud, everything also goes to a Vector instance, which then distributes it to the actual location (datadog, cloudwatch, etc.)
Some principles:
Constraints breed creativity. Helps to keep it simple. For example, there is no server room in a restaurant: you need to have simple small servers. Few people to do the work, so the hardware solution had to be simple as they didn’t have the capacity to troubleshoot. Network had to be simple, too.
Just enough kubernetes. Kubernetes sounds like “cute”, but watch out. A cute small baby bear ends up being a dangerous large animal. Stay lightweight. K3s. Aim at highly recoverable instead of highly available. They embrace the “kube movement”: the open source ecosystem. People want to work with the open source stuff, so it is easier to find people.
Cattle, not pets. Zero-touch provisioning: plug-and-play install. “Wipe” pattern: the capability to remotely wipe nodes back to their initial state. Throw-away pattern: if a device is broken, leave it out of the cluster and ship a replacement. Re-hydrate pattern: encourage teams to send critical data out to the cloud when they can and be able to rehydrate if needed, just like with a new iphone.
On the edge, mirror the cloud paradigms as much as you can. Use containers. The “cattle not pets” paradigm.
2023-09-14
(One of my summaries of the 2023 Dutch foss4g.nl conference in Middelburg). (This was the closing keynote, btw).
Full title: investigating war crimes, animal trafficking and more with open source geospatial data. He works with Bellingcat. An open source investigation journalism initiative.
“Open source” isn’t meant in the software way, but it means journalism sources. And then open sources like map info, databases, online photos. You also have gray areas like leaked materials.
Bellingcat’s first well-known investigation was the MH17 airplane disaster, shot down over Ukraine. In the days after the disaster, social media images and posts started to show up. Geolocatable images of the BUK missile system. Dashcam footage.
Investigating individual photos can take a lot of work. Matching scuff marks on the pavement combined with the number of telephone poles in dashcam images… They managed to piece together the timeline and route of the BUK system. The info was even used in the court case in Den Haag.
There are lots of sources of geographical information. Maps, satellite data. Images. Reflections in windows. Vessel/airplane tracking data. Most/all commercial vessels travel with location transmitters enabled. They are currently investigating Russian grain ships that illegally transport stolen Ukrainian grain by switching off their transmitter for parts of the route. But with satellite images combined with recent webcam footage from ship enthusiasts in the Bosporus strait…
Some other sources:
PeakVisor, a website for recognizing mountain peaks. You can also use that on photos you want to geolocate…
Satellite (forest) fire data. Villages being burned also show up. And fires due to artillery fire.
Combining public government data with geospatial info about incomes to figure out if the government policy helps poor people more than rich people or vice versa.
Openstreetmap’s data. You can look for “railroad”, “telephone pole”, “street” and “one story house” in Ohio and get some 100 locations to check out visually.
Sentinel-1 satellites survey the earth with radar. In the same frequency used by Patriot missile defence batteries. The radar interference is visible, so you can locate the batteries that way.
The sentinel-2 satellite data allows you to look at (amongst others) forest data. Also in Ukraine. So you can follow the damage done to the forests by the war.
So… geodata has lots of uses. Some of them might surprise you. It might have more uses than you initially think of. It can also help Bellingcat. Bridging the journalism side of things and the geospatial tech knowledge and IT knowledge is what his group at Bellingcat tries to accomplish.
The first in-person hackathon: 14-17 November 2023 in Amsterdam. You can also look at https://github.com/bellingcat for their existing software.
2023-09-14
(One of my summaries of the 2023 Dutch foss4g.nl conference in Middelburg).
Bart is the full-time maintainer of MapLibre since early 2023.
Maplibre is a map-rendering toolkit. Actually, there are two rendering toolkits: one for javascript/web and one for native (android/iphone). Native is the one he maintains.
It renders vector data. The output is also vector tiles. Normally, a map server is used for the tiles, but you can also store tiles locally for offline usage. A server for vector tiles doesn’t need to be a big machine: the layers he demoed were hosted on a raspberrypi in his basement.
Vector tiles need styles. Those are defined in json. This gives you lots of flexibility. Night mode, different renderings for biking or walking, etc.
Rendering is done on the client. This needs a GPU. On the web, it uses webgl. On the client it is OpenGL. Only… apple wants you to use their own “Metal” language. They are currently implementing support for that (he showed a quick demo).
They are funded by sponsors. Several bigger companies (like Meta) sponsor it as it is way cheaper to do it collectively with other companies.
(Part of a series of talks about automatic measurement tools in the province of Zeeland. This one is about collecting all the data from the sensors.)
Originally the data would all be sent via LORA, but even there, there are multiple standards. But some is sent via GPRS. Or https. Or legacy ftp. Some older systems need pulling instead of pushing the data themselves.
So… what about a generic software solution for receiving, transforming and publishing sensor data. This is what they wanted:
No vendor lock-in.
Open source components.
Scalable and highly available. Near real-time.
A generic data model (“quite a challenge”…).
Publication based on open standards. They also use the standard “openapi” method of describing their APIs for easy interoperability.
No archival function. Data stays in the system for two months or so, afterwards it is the responsibility of the client to take care of the long-term storage.
The scalable part of the system is handled with docker containers (easy implementation, you can package a complete stack, devops stuff integration). Originally they developed with NodeJS. Because that was single-threaded they tried out “Go”. But Go isn’t as well-known as NodeJS or Python.
Hosting via kubernetes. Easy scaling. Pay-as-you-use. They use the managed azure kubernetes service, but without using any azure-specific functionality so that they can move if needed.
Internally the workers are organised in “pipelines”. Individual steps connected into one whole.
For testing out simple scripts they made a “generic python worker” that you can start with a short python script as input. Handy for testing without needing to do a complete new deployment.
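(Personal note: purely my own guess at what a short script for such a “generic python worker” could look like: one transform function from a raw sensor message to their generic data model. All field names are made up.)
from datetime import datetime, timezone

def transform(message):
    """Convert one raw sensor message into the (hypothetical) generic observation format."""
    return {
        "sensor_id": message["device"],
        "observed_at": datetime.now(timezone.utc).isoformat(),
        "parameter": "water_level",
        "value": float(message["raw_value"]) / 100.0,  # e.g. cm -> m
        "unit": "m",
    }

if __name__ == "__main__":
    # Quick local test without needing a new deployment.
    print(transform({"device": "zld-042", "raw_value": "187"}))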
Citizen participation: he helped with earthquake information for the gas-related earthquakes in the north of the Netherlands. Originally, the data wasn’t really available from the official monitoring instance. A PDF with a historical overview and an API with the 30 latest quakes. A first quick website with an overview became quite popular as it was the only real source of information. It was also used by the province!
It started out on a small server in his home. Then he moved to a VPS. Then the website was mentioned in a big national newspaper and the server was brought down by the traffic.
Later the website was improved. Hosting was done relatively simply with ubuntu, postgres, cron, highcharts, geoserver, jspdf, etc. Gasbevingen portaal.
New functionality is address-based generation of all the relevant data for your own house. Handy for the damage claims that have to happen now. He notices that the lawyers of the oil companies also use the same data from his website now :-)
What changed in the last ten years? The KNMI (the official source of info) is sharing much more information than previously. Though it is aimed at researchers instead of the citizens.
Citizen participation like this can be very attractive where the trust in the government is lower. Don’t make it too complex: we’re nerds and it is easy to go overboard.
There is an ever increasing demand for mobile data and 5G. At the same time, there is an ever increasing resistance against actual new cell towers… As a provider, you can adjust your existing equipment. Using 5G, using more frequencies, etc. But eventually you run against hard limits and need new ones.
Wazir made several analyses to determine the expected extra demand combined with the available supply. For demand, they looked at population density, traffic data, railway station usage, etc.
For supply they started with https://antenneregister.nl : 187k antennas! But individual antennas should be grouped into “sites”. All antennas on one building’s roof is one site. So postgis’s ST_Within was used on buildings. And antennas close to one another are probably all on the same physical cell tower.
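(Personal note: a rough illustration of that grouping step with psycopg2; table and column names are made up and the real analysis surely did more.)
import psycopg2

QUERY = """
    SELECT buildings.id, count(antennas.id) AS antenna_count
    FROM buildings
    JOIN antennas ON ST_Within(antennas.geom, buildings.geom)
    GROUP BY buildings.id
    ORDER BY antenna_count DESC;
"""

with psycopg2.connect("dbname=antenneregister") as connection:
    with connection.cursor() as cursor:
        cursor.execute(QUERY)
        for building_id, antenna_count in cursor.fetchall():
            print(building_id, antenna_count)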
The result was an estimated 300-700 extra sites. But…. only 16-36 are from actual capacity problems. The rest are for (mandatory) improving coverage and planned speed improvements.
I gave this talk myself. There will be separate detailed blog posts later on :-)
2023-09-14
(One of my summaries of the 2023 Dutch foss4g.nl conference in Middelburg).
Frits has been working for a long time as a community manager.
Having a unique question helps in creating a community. You have an answer: it helps if the question is unique. Watch out with the words you’re using, though, it is easy to use jargon that few people understand.
What is a community? A group of people with a common goal or purpose. Though what people think it is can differ. If you look at a glass from the top, it looks like a circle. From the side the same glass looks rectangular. It is still the same glass. People can look for very different things in the same community.
Membership of a community might be free (money). But definitely not free in terms of your time. So pick your memberships wisely. And keep this in mind when trying to start a community yourself: don’t start big, but small. See what works, see what doesn’t work.
Incremental changes and improvements can help you grow your community. But don’t ignore the positive effect of those changes on your current members: changes and improvements can help keep them engaged. When a community doesn’t change, it can get stale for your old-time members.
You can look at hobby sport teams for community examples. You have regular trainings, you have regular matches. You also have chores that need doing (like refereeing or manning the cafeteria). And you need a board to steer and manage it all. They have nothing to do with the actual sport “content”, but mostly with the more organisational side of things.
A thing that changes in society as a whole and also in communities: less hierarchy, more network. We’re still used to hierarchies, they’re all around us. Networks have advantages of their own. When you cooperate, you no longer need the proverbial “sheep with five legs” in your organisation.
So: less “command and control” and more “connect and collaborate”.
Some tips for building communities:
Just do it.
Focus on what unites you.
Make sure you have a unique question.
Have a diverse team.
Look outside your community to gain more context.
Grow slowly.
Continue!
Time is the new money.
Organise the management.
“Shift happens”: adapt or stop.
Open source in the public sector is mostly mission-driven cooperation. Cooperation is the key. Smaller municipalities have to provide the same services as big municipalities but often don’t have the capacity: so cooperation.
Open source: you start with “an” open source project. Hopefully something useful gets built and put on github. The initial customer loves it and other municipalities take notice and are interested. That is where it often goes wrong: adapting open source in a different organisation is hard and often doesn’t work. The scaling doesn’t work.
Having a good idea and an initial open source project and some funding for initial product development: that’s in the project’s initial “happy days”. Next up is the project’s “death valley” where scaling-up, introducing it in other organisations and funding ongoing maintenance costs is hard. Only when you’re through that “death valley” do you get to a happy long-term vital project.
For scaling, you need support and management. Community management, software maintenance. You need a community of users.
He works for the combined Dutch municipalities (“VNG”) and tries to get this community aspect working. Often money is involved: maintenance has to be guaranteed by multiple market parties, for instance. You have to involve the market. The VNG can help provide a legal framework for cooperation.
The requirements for sustainable open source cooperation:
As government, you need to provide commercial-style/entrepreneur-style leadership, which is not inherently natural.
Sustainable financing. You need financing up front, but more importantly for the long-term maintenance.
Scaling is important for a sustainable, long-lived, trustworthy project.
Strategic cooperation. You can only scale an initiative when you have good, solid cooperation with multiple partners.
The cooperation needs to be managed and organised. A bit like a paid maintainer, paid collectively by an apartment building’s tenants. You need a “mission leader”.
“Profit for purpose”, a mission-driven cooperation model.
There was a question “moving to open source means a potentially difficult migration, how do you handle that?” Yes, there’s a migration. But if you now have commercial software, you probably need to contract for new software in a public tender in a few years anyway with the risk, also, of migration. With open source software, you can stick to “your” open source software, so you never need to migrate again!
“Just do it”. Be entrepreneurial about it.
They work for one of the Dutch electrical energy transport companies (“alliander”). They use open source for managing congestion on the network. As a company, they aim at using open source GIS to diversify strategically.
They noticed they needed to make a mindset shift. They needed to think in a different way. Visiting events like foss4gnl and the world-wide foss4g conference helps.
For GIS, interoperability is important. They have multiple web clients, several custom phone apps for the field workers, etcetera. Ideally with single sign on to limit the amount of times you have to log in. “Cloud first” is a management term floating around, which was also good for looking at the IT landscape anew.
Now they have arcgis/esri. Migration has to be done incrementally. They started with geoserver, deployed in AWS in a kubernetes cluster. The data is stored in a postgis database, which they’re slowly starting to fill based on the existing esri data.
Limiting the amount of logins: they want to do that with an “API gateway”. A single point where all the API’s seem to live, which transfers the requests to the various backend servers. Only: this totally doesn’t work with ESRI. One of the reasons they want to move.
Question: “by running on AWS aren’t you swapping ESRI for Amazon, also a big USA commercial firm?” No, by using containerisation and kubernetes, they can theoretically move everything over to another provider like azure.
Tip: if you need to provide access to certain restricted data, you can now easily simply start a new geoserver instance with its own access. You don’t use a single big instance anymore. Paradigm shift.
At his university, there’s research on house building projects. There’s the public Dutch database of all buildings (“BAG”). Buildings can have statuses like “being built” and “being planned” and “in use” and so on, but those statuses aren’t always reliable.
Postgis’ ST_ClusterDBSCAN helps them to cluster buildings per year per category and for cluster size. Clustering is super fast this way. The geometries can then be combined.
Afterwards they can combine nearby clusters of new houses over the years: simply look which new clusters are next to the clusters of the year before. This way, you can follow bigger housing projects throughout the years. (And do analysis on them, of course, he showed some nice examples).
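(Personal note: roughly what such a query looks like, via psycopg2. Table/column names, the 100 meter distance and the minimum cluster size are my assumptions.)
import psycopg2

QUERY = """
    SELECT
        id,
        build_year,
        -- 100 = max distance in the projection's units, 5 = minimum cluster size
        ST_ClusterDBSCAN(geom, 100, 5) OVER (PARTITION BY build_year) AS cluster_id
    FROM buildings
    WHERE status = 'being built';
"""

with psycopg2.connect("dbname=bag") as connection:
    with connection.cursor() as cursor:
        cursor.execute(QUERY)
        for building_id, build_year, cluster_id in cursor:
            print(build_year, cluster_id, building_id)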
2023-08-28
I use docker-compose quite a lot. Most of the python/django stuff we deploy is done with docker-compose (one of the two big ones is in kubernetes already). A while back I moved several “geoservers” to docker-compose. Geoserver is a web mapping server written with java/tomcat. Normally pretty stable, but you can get it to crash or to become unresponsive.
So that’s something for which docker’s health check comes in handy. You can configure it in docker-compose itself, but I put it in our geoserver’s custom Dockerfile as I was making some other modifications anyway:
FROM docker.osgeo.org/geoserver:2.23.1
... some unrelated customizations ...
HEALTHCHECK --interval=20s --timeout=10s --retries=3 --start-period=150s \
CMD curl --fail --max-time 3 http://localhost:8080/geoserver/web/ || exit 1
A simple “curl” command to see if the geoserver still displays its start page. With a generous --start-period as geoserver needs quite some time to start up.
Docker-compose allows for healthchecks, and displays Up (healthy) in the “state” column when you call docker-compose ps. But docker-compose doesn’t actually restart failed services. For that, you need docker-autoheal as an extra service. At the core, it consists of a single shell script that asks docker if there are containers matching the filter health=unhealthy and optionally autoheal=true. If found, they get restarted.
I have a mix of services (geoserver, pgbouncer, nginx) with only the geoservers having a health check. So I configured autoheal like this in my docker-compose.yml:
autoheal:
  image: willfarrell/autoheal:1.1.0
  tty: true
  restart: unless-stopped
  environment:
    - AUTOHEAL_CONTAINER_LABEL=autoheal
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
And the services with healthcheck got the autoheal label:
geoserver:
  image: ...
  labels:
    autoheal: true # <= there's an error here
Autoheal didn’t seem to be working for me. No logs. Well, the geoservers that might need to be autohealed rarely failed, which is good news, but it made it harder to see whether autoheal was working.
Last week I made some changes that improved the speed for several geoserver maps. But it also made geoserver as a whole unstable. So I had an Up (unhealthy) container. But autoheal didn’t restart it. And there was nothing in autoheal’s log output.
It turned out that autoheal: true was the problem. true needs to be quoted: autoheal: "true", as autoheal searches for the lowercase value. Just true gets translated to a capitalized True (probably a representation of the boolean value) by docker compose, which autoheal doesn’t search for.
After quoting the value, autoheal properly restarted misbehaving geoservers when they went belly-up:
geoserver:
  image: ...
  labels:
    autoheal: "true" # <= quoted value works
That took some time to figure out… Especially as there was totally no output from the autoheal docker. A short message upon startup (echo "autoheal is running") would personally have helped me to be sure the logging was actually working. I spent quite some time googling and figuring out whether there was actually something wrong with my logging. That’s why the tty: true is in there, for instance.
I hope this blog entry has the right words to help someone else plagued with the same problem :-) A quick note in the README, warning about the quotes, is probably a better solution. I’ve submitted an issue for it.
A win for open source, btw: I could read the source code for the autoheal shell script. That helped me figure out what was going wrong.