If you are working with software development as a developer, manager or tester, then you will be impacted by the General Data Protection Regulation (GDPR) – the new EU laws regarding data privacy. In many ways, the regulation is likely to have as big of an impact as the Y2K problem. But this time it’s because of a good cause! And you cannot ignore it, as the fines for doing so can be crippling. But for most people who find themselves face-to-face with GDPR it’s quite intimidating.
I hope this article can do something about that.
First, let’s say a few words about how to approach the regulation itself and then let’s look at some actions you should consider. I find the text of the regulation to be surprisingly well-written, but I wish I had a reading guide when I started out.
The actual text is officially published on http://ec.europa.eu. The text starts with 173 “recitals”. These are background considerations for the regulation. Even though they are well thought out and well written, they are not very much to the point. They are also often quite heavy to read, even if it’s for good reason (a perfect example is Recital 38 which talks about the data protection concerns of children). After the recitals are the actual 99 articles of the law, which are much easier to read. These are divided into chapters and many of the later sections are about the structure for enforcement, not for the actual regulation as it impacts most organizations.
Instead of reading the PDF, I recommend looking at https://gdpr-info.eu, which has organized the regulation in an easy-to-use structure. If you are responsible for an IT system, you need to understand at a minimum articles 1 through 50 and especially articles 5 through 35. Start by reading these.
Let’s look at some of the ways you may be surprised.
Surprise 1: Consent and test data
In order to collect and use personal data, you need to have permission to do so (article 6). For most people, this either means that you are required by law to use the information (for example medical information in a public health context) or that you have obtained consent from the data subject. This has actually been the case for a while, but some things will be more explicitly required: First, you must make it clear what your user is consenting to and second, you must make it optional (and non-default!) to give consent in cases where it’s not needed to provide a given service.
One very common scenario is for organizations to use production data from their customers for testing purposes. You can forget about that in the future. You could ask your customers for consent to use their data for testing purposes, but you must make this consent optional and non-default. So that means that at best, you need to find good routines to extract only data for consenting customers. Good luck! You could anonymize the data, but you are liable if there is a risk of reidentification.
Instead, I recommend that you invest in other testing strategies. In particular, most testing organizations can improve a lot by creating synthetic data. By investing in synthetic data, you can also improve your ability to stress the system with large and unusual values. Another testing method is partial production, where you gradually let more users onto a new version of the system. Perfecting this will also improve your delivery cycle a lot.
Now you have an excuse to invest in better testing.
Surprise 2: Functions required to support the rights of the data subject – data portability
When you store personal data, you now have to plan for functionality where the subject of this data can exercise their rights regarding the data. Much of this has already been the case, but the existing rights have been strengthened. This is described in articles 12 through 23. Just add them to your product backlog as system functionality or manual processes that must implemented. Basically, your customers have the right to see what data you are storing about them, including who has accessed the data. They should be able to correct errors in the data and to ask for data to be deleted.
A new and very interesting right is the right to data portability (article 20). Your customers have the right to get their data from you and take it to your competitors in a portable format (“structured, commonly used and machine readable”). As I understand it – if your customers can get an exported JSON, XML or CSV-format with everything concerning them, you’re pretty much set. If you wonder: PDF is not good enough.
Data portability is one of the most exciting parts of the GDPR and seems to be motivated by a desire to promote innovation. Just imagine what you can do with it! Imagine an app that can import your purchase history from all major store chains and help you analyze your own buying habits. It’s long been a dream, but the data was locked up. Until now!
Or you can use it to keep your competitors honest! Tell your customers that if they give you consent and upload their data from your competitors, you give the discounts and prizes. You have to be prepared for the case that your customers withdraw their consent again, but you are still allowed to keep aggregated data. And if you’re competitors are laid-back about GDPR – well, you can really put their feet to the fire!
Surprise 3: Keep data safe during transfer and storage – everywhere!
The final important consideration I want to talk about that comes out of GDPR is keeping data safe and under control (article 32 and also article 25, which mandates Privacy by Design). You are required by law to protect personal data in transit and rest at all locations. You are also required to report to the local authorities any breach of data that you detect.
What you need to do now is to map out every place you transfer personal data. What about temporary storage areas? What about backups? What about third parties that receive data? And what about logs?
Most organizations have less access protection of application logs than any other data. And often you log without considering the contents of the logging. I used to practice an approach of logging the full payload of all incoming and outgoing communication messages. That may no longer be a good idea.
The easiest way to protect data is to make sure you never collect it, and failing that, making sure that you don’t store it unnecessarily. You need to trace every piece of personal data through your systems and find out who can access it. If you can collect less, that will make your job easier.
There are more aspects of the GDPR that I have not discussed. In particular, the role of the Data Protection Officer is critical and has some surprising nuances.
But the goal of this article was to give you a place to start where there’s a good chance that you have something you need to do and perhaps something that you can benefit from as well. Here are three suggestions: 1. Verify that you’re not using production data for testing and develop new testing techniques if you do, 2. Add development tasks to support the rights of the data subject – a good place to start is Data portability, 3. Analyze the flow of personal data through the system, especially looking for less secured storage locations like logs and temporary files.
It’s not long until the law comes into effect and you need to get ready! The three places I’ve pointed out where you need to start touch most of the IT system related aspects of GDPR and can be used as a basis for understanding, implementing and benefiting from our improved rights to our own data as citizens and individuals!