(One of my summaries of a talk at the 2018 European Djangocon.)
Will is a software developer with a law degree. Now that we have the GDPR, his law degree is suddenly very relevant. GDPR takes effect on 25 May 2018.
What is the GDPR? It is a law that regulates the use of personal data.
You’ll probably have had lots of emails from companies telling you that they’ll be good with your data and asking whether they’re still allowed to use it.
He encourages you to read the actual regulation. The first part is quite readable. The actual articles are quite detailed, but only the first 34 are relevant for us. He thinks we have a professional duty to be on top of this. We have to know about it.
As programmers, we’re on the front line. We might be the ones that can best advise the company on how to comply. We ought to know the details. If you help your company, you’re valuable to your company, so…
He has three categories in his talk: terms, rights, tasks.
Data protection by design and by default. Learn to do it properly. If you work with django, follow recommended django practices and feel that you’re behaving yourself, you’re probably OK.
Important here is “data minimisation”. Don’t pass full user objects along to other systems. Not even the user ID. Generate a UUID or so.
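A minimal sketch of that idea in plain Python, not something shown in the talk. The `User` class, the in-memory `_external_refs` mapping and the `shipping_payload` function are all hypothetical stand-ins; in a django project the mapping would more likely be a model field or a separate table.

```python
import uuid
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical stand-in for a django user model.
    id: int
    name: str
    email: str
    country: str


# Hypothetical mapping from internal user id to a random external reference.
# Kept in memory here; a real project would store it in the database.
_external_refs = {}


def external_ref(user):
    """Return a random but stable reference for this user, created on first use."""
    if user.id not in _external_refs:
        _external_refs[user.id] = uuid.uuid4()
    return _external_refs[user.id]


def shipping_payload(user):
    """Build the data sent to another system: only the fields that system needs."""
    return {"customer_ref": str(external_ref(user)), "country": user.country}


user = User(id=42, name="Alice Example", email="alice@example.org", country="NL")
payload = shipping_payload(user)
print(payload)
```

The other system gets a reference it can use to talk about this customer, but it never sees the name, the email address or the internal ID. Only the system holding the mapping can turn the reference back into a user.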
Separate personal data completely. “Pseudonymization”.
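One possible way to do such a split, sketched with only the standard library. It uses a keyed hash (HMAC-SHA256) to derive the pseudonym, which is my own choice of technique and not necessarily what the talk had in mind. `SECRET_KEY`, `pseudonym` and `split_record` are hypothetical names, and the assumption is that the key is stored somewhere other than the pseudonymized data.

```python
import hashlib
import hmac

# Assumption: the key lives apart from the pseudonymized data, for instance
# in a secrets manager. This literal is only here to make the example run.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonym(identifier):
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def split_record(record, personal_fields):
    """Split one record into (personal part, pseudonymized rest)."""
    key = pseudonym(record["email"])
    personal = {field: record[field] for field in personal_fields}
    personal["pseudonym"] = key
    rest = {f: v for f, v in record.items() if f not in personal_fields}
    rest["pseudonym"] = key
    return personal, rest


record = {"email": "alice@example.org", "name": "Alice Example", "diagnosis": "flu"}
personal, rest = split_record(record, {"email", "name"})
print(rest)
```

The two parts can then be stored separately and only joined via the pseudonym when someone really needs the name. Note that data pseudonymized like this still counts as personal data as long as someone holding the key or the personal part can link it back.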
For a medical database, does your database support staff need to see a person’s name? No. Only the doctor needs to know that. Then you might be better off encrypting the name.
Erasure. Can you split the backups? A separate one for personal data and one for the rest? That might make zapping personal data easier.
No discrimination. You can no longer vary prices based on the area where people live. If you have algorithms that make decisions, watch out for biases.
Note: gender and age are not included here! So special prices for older or younger people are fine. But, again, watch out for indirect discrimination. There are other laws that you have to take into account.
(See my summary of the great talk on biases)
Your algorithms will get better because of it.
Explain machine learning. If you make an automatic decision, you might have to explain it. If it is an unclear pile of a neural net, it might be hard to explain…
Anonymization. True anonymization is rare. And hard. The question you have to ask is “is reidentification reasonably likely?”. And as a programmer, you’re probably the only person that can answer it.
Again, anonymization is hard. You’ll probably have to get outside expert help.
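To get a first feel for the reidentification question, you could count how many rows share the same combination of “harmless” fields. This is a rough k-anonymity style check that I'm adding as an illustration, not something from the talk. The function name `smallest_group` and the example rows are made up, and a good score here does not prove that a dataset is anonymous.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    values for all the given quasi-identifier fields."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Made-up rows with the names already removed.
rows = [
    {"postcode": "3500", "birth_year": 1980, "diagnosis": "flu"},
    {"postcode": "3500", "birth_year": 1980, "diagnosis": "asthma"},
    {"postcode": "3511", "birth_year": 1975, "diagnosis": "flu"},
]

k = smallest_group(rows, ["postcode", "birth_year"])
print(k)  # 1: the third row is unique on postcode plus birth year
```

A result of 1 means at least one person can be singled out by postcode and birth year alone, even though the name is gone. Anyone who knows those two facts about that person learns the diagnosis.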
Breach notification. If there is a breach, you have to report it. Otherwise you are liable. Even putting too many people in an email’s CC field could be a breach…
What could django do?
The current situation isn’t clear yet. In a few years it probably will be.
My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.