A high availability Django setup on the cheap - Roland van Laar

Tags: python, pun, django

(One of the talks at the 22 June 2016 Amsterdam Python meetup)

Roland built an educational website that needed to be highly available on a tight budget. He demoed the website: a site for the teacher on his own laptop and a separate page on the digiboard for the class. The teacher steers the digiboard from his laptop (or an iPad or phone).

As it is used in classrooms, it needs to be really really available. As a teacher, you don’t want to have to change your lesson plan at 8:30. The customer had three goals:

  • Not expensive.

  • Always up.

  • Reliable.

He had some technical goals of his own:

  • Buildable.

  • Functional.

  • Maintainable.

Always up? Django? You have the following challenges, apart from having a bunch of webservers.

  • Media files. Files uploaded on one server need to be visible on others.

  • Websockets.

  • Database.

  • Sessions.

The setup he chose:

  • Front end (HTML, JavaScript, images): Cloudflare as CDN (content delivery network). The front end is a single-page jQuery app. It chooses a random API host for ajax requests.

    It changes API hosts when the API is not responding. But… when is an API not responding? Some schools have really bad internet, so 10 seconds for a request might be “normal”.

    Don’t make a “ping pong” application that retries all the time. Try every server and then fail.

  • Some Django API servers. The actual django project was easy. Simple models, a bit of djangorestframework. As an extra he used some new postgres features.

  • Two SQL servers in BDR (bi-directional replication) mode. “Postgres async multi master”. It is awesome! It just works! Even sessions are replicated faultlessly.

    Things to watch out for: create a separate replication user on both ends. Also watch out for sequences (auto-increment fields). For Django it was easy to get working by configuring the database with “USING BDR” for such IDs. Creating such objects then takes a little bit longer. Alternatively, you can use UUIDs.

    Backups: oopsie. When postgres goes down, you normally restart it and it rebuilds itself. But in a BDR setup, the sequences don’t work right then. The standard tools don’t work, so he had to write a custom script.

    Another drawback: to update your tables, you need a lock on all database nodes. This means you have downtime. No problem, he’ll just do it early in the morning on a weekend.

  • He uses csync2 for syncing uploaded files between the hosts. Simply a cronjob on all servers. This is good enough as the updates only really happen in the summer; during the school year nothing changes.

  • Websockets. He uses Tornado plus javascript code for reconnecting websockets. The teacher initially connects his laptop to the digiboard via a short 6-digit number. Internally, a UUID is generated. The UUID is stored in local storage, so reloading the page or restarting a laptop Just Works.
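
The pairing idea from the websockets item (a short 6-digit number for the first connection, a UUID internally) could look roughly like the sketch below. This is not the code from the talk. The class name `PairingRegistry`, its methods, the in-memory dict and the single-use behaviour of the code are all assumptions made for illustration.

```python
import secrets
import uuid


class PairingRegistry:
    """Hypothetical in-memory mapping from short pairing codes to UUIDs."""

    def __init__(self):
        # code (6-digit string) -> session UUID (string)
        self._pending = {}

    def create_pairing(self):
        """Generate a UUID plus a short code that points to it."""
        session_id = str(uuid.uuid4())
        while True:
            # Six digits, zero-padded, so "004217" is a valid code.
            code = f"{secrets.randbelow(10**6):06d}"
            if code not in self._pending:
                break
        self._pending[code] = session_id
        return code, session_id

    def redeem(self, code):
        """Swap a code for its UUID. Assumption: a code works only once."""
        return self._pending.pop(code, None)
```

One side calls `create_pairing()` and shows the code, the other side types it in and gets the UUID back from `redeem()`. After that, only the UUID matters, which is why keeping it in local storage survives a page reload.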

The One Time We Were Down: they switched email providers once because their original one got much more expensive. But the new provider wasn’t as good: suddenly calls took more than 10 seconds and clients started to fail. It wasn’t that critical, as it happened after school hours when only one teacher wanted to reset his password. So it was easy to fix.
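
The failover rule of the front end (pick a random API host, try every server once, then fail) and the 10-second threshold can be sketched in a few lines. The real client is a jQuery app in the browser. The version below is Python only to keep the example short, and the names `fetch_with_failover`, `AllHostsFailed` and `http_fetch` are made up for illustration, not taken from the talk.

```python
import random
import urllib.request
from typing import Callable, Optional, Sequence, Tuple


class AllHostsFailed(Exception):
    """Every API host was tried once and none of them answered."""


def http_fetch(url: str, timeout: float) -> bytes:
    """Plain HTTP GET; raises an OSError subclass on timeout or network error."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(
    hosts: Sequence[str],
    path: str,
    fetch: Callable[[str, float], bytes] = http_fetch,
    timeout: float = 10.0,  # generous: slow school networks are "normal"
    rng: Optional[random.Random] = None,
) -> Tuple[str, bytes]:
    """Try each host once, in random order. No endless "ping pong" retries."""
    candidates = list(hosts)
    (rng or random.Random()).shuffle(candidates)
    errors = {}
    for host in candidates:
        try:
            return host, fetch(host + path, timeout)
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            errors[host] = exc
    raise AllHostsFailed(errors)
```

The timeout is a parameter on purpose: as the email provider story shows, anything the API waits on that takes longer than the client's limit makes a healthy server look dead.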

 

Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
