Under Recital 26 of the GDPR, anonymous information is ‘information which does not relate to an identified or identifiable natural person’ or ‘personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable’. Anonymous information is outside the scope of the GDPR, so its rules do not apply. This explains why reliable anonymisation techniques are in great demand among companies and organisations.
With that said, the GDPR does not itself explain how to make the data subject ‘no longer identifiable’ or what this concept really entails.
‘Soft law’ on the matter has shifted over the years: the Article 29 Working Party (WP29) took somewhat different approaches to anonymisation in 2007 and 2014, and the European Data Protection Board (EDPB) has shared only occasional ‘obiter dictum’ explanations. National supervisory authorities, in turn, have taken contradictory stances, leaning towards either the 2007 or the 2014 approach.
WP29’s changing approach
In its Opinion 4/2007 ‘On the concept of personal data’, WP29 took a simplified, high-level approach to how personal data might be successfully anonymised. According to the opinion, anonymised data can be created by ‘disguising identities’, provided this is ‘done in a way that no reidentification is possible, e.g. by one-way cryptography’. To find out whether or not identification is possible, ‘all the means likely reasonably to be used either by the controller or by any other person to identify the said person’ should be taken into account. If ‘that possibility does not exist or is negligible, the person should not be considered as “identifiable”, and the information would not be considered as “personal data”’. Thus, the 2007 WP29 opinion left data controllers a significant degree of flexibility when choosing and implementing anonymisation techniques.
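To make the idea of ‘one-way cryptography’ more concrete, the following minimal Python sketch replaces a direct identifier with a keyed one-way hash (HMAC-SHA256). The function name, the sample e-mail address and the way the key is generated are assumptions made purely for illustration; the opinion itself does not prescribe any particular algorithm.

```python
import hashlib
import hmac
import secrets


def disguise_identity(identifier: str, secret_key: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256) of a direct identifier.

    Hypothetical helper: the hash cannot be inverted to recover the
    identifier, but anyone holding the key can recompute it.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumed key handling: a random 32-byte key generated for this example.
secret_key = secrets.token_bytes(32)

# Illustrative identifier, not real data.
token = disguise_identity("jane.doe@example.com", secret_key)
```

The same input and key always yield the same token, so records about one person can still be linked to each other. Whether such a measure leaves a ‘negligible’ possibility of identification is the kind of judgement the 2007 opinion left to the controller.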
However, in 2014, WP29 took a much more rigid approach. In its Opinion 05/2014 ‘On Anonymisation Techniques’, WP29 provided the following explanation:
Thus, it is critical to understand that when a data controller does not delete the original (identifiable) data at event-level, and the data controller hands over part of this dataset (for example after removal or masking of identifiable data), the resulting dataset is still personal data. Only if the data controller would aggregate the data to a level where the individual events are no longer identifiable, the resulting dataset can be qualified as anonymous. For example: if an organisation collects data on individual travel movements, the individual travel patterns at event level would still qualify as personal data for any party, as long as the data controller (or any other party) still has access to the original raw data, even if direct identifiers have been removed from the set provided to third parties. But if the data controller would delete the raw data, and only provide aggregate statistics to third parties on a high level, such as ‘on Mondays on trajectory X there are 160% more passengers than on Tuesdays’, that would qualify as anonymous data.
In other words, simply ‘disguising identity’ was no longer enough to render the data anonymous. To take the data out of the GDPR’s scope, an organisation needs to turn it into aggregate statistics and delete the original raw data. As WP29 outlines, ‘[…] removing directly identifying elements in itself is not enough to ensure that identification of the data subject is no longer possible. It will often be necessary to take additional measures to prevent identification […]’. As a general approach, WP29 stresses that for the data to be truly anonymised, ‘that data should be such as not to allow the data subject to be identified via “all” “likely” and “reasonable” means’.
In the 2014 opinion, WP29 also acknowledges that the results of anonymisation might still pose residual risks of re-identification as ‘research, tools and computational power evolve’. With this in mind, ‘anonymisation should not be regarded as a one-off exercise’, and these risks ‘should be reassessed regularly by data controllers’. Where new or residual risks are present, a data controller should ‘assess whether the controls for identified risks suffice and adjust accordingly’. WP29 thus makes it clear that ‘true’ anonymisation implies little to no residual risk of data subjects being re-identified.
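The difference between event-level data and aggregate statistics in the travel example can be sketched in a few lines of Python. The record layout, the sample values and the `min_count` suppression rule are assumptions added for illustration and do not come from the opinion; deleting the raw data is an organisational step that the code does not cover.

```python
from collections import Counter

# Hypothetical event-level records: (traveller id, weekday, trajectory).
# Even with names removed, each row still describes one person's trip.
events = [
    ("t1", "Mon", "X"),
    ("t2", "Mon", "X"),
    ("t3", "Mon", "X"),
    ("t1", "Tue", "X"),
    ("t4", "Mon", "Y"),
]


def aggregate(events, min_count=2):
    """Count trips per (weekday, trajectory) and drop the traveller id.

    Groups smaller than min_count are suppressed; this threshold is an
    assumption of the sketch, not a requirement taken from WP29.
    """
    counts = Counter((weekday, trajectory) for _, weekday, trajectory in events)
    return {group: n for group, n in counts.items() if n >= min_count}


stats = aggregate(events)
```

With the sample data above, only the Monday count for trajectory X survives, since the other two groups each contain a single trip that could point to one individual.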
National supervisory authorities also seem to take different approaches to the goals and nature of anonymisation.
In its ‘Anonymisation, pseudonymisation and privacy enhancing technologies guidance’, Chapter 1 and Chapter 2 (draft), the UK’s ICO outlines that, to achieve ‘effective anonymisation’, ‘you need to reduce the risks of identifying individuals to a sufficiently remote level’ and that the ‘residual risk does not mean that particular [anonymisation] technique is ineffective’. Thus, the ICO leans towards a flexible, inherently risk-based approach to anonymisation.
The Irish supervisory authority (DPC), in its Guidance on Anonymisation and Pseudonymisation, takes a strict approach to retaining the raw data, as WP29 did in 2014: ‘If the data controller retains the raw data, or any key or other information which can be used to reverse the “anonymisation” process and to identify a data subject, identification by the data controller must still be considered possible in most cases’. At the same time, the DPC seems to allow for a certain degree of residual risk: ‘Even where anonymisation is undertaken, it does retain some inherent risk. […] And finally, even where effective anonymisation can be carried out, any release of a dataset may have residual privacy implications, and the expectations of the concerned individuals should be accounted for’.
Finally, France’s privacy watchdog (CNIL) seems to have taken the strictest approach towards anonymisation, saying that ‘the data controller who wishes to anonymize a data set must demonstrate, via an in-depth assessment of the identification risks, that the risk of re-identification with reasonable means is nil’.
AEPD-EDPS joint opinion
Amid these diverse approaches to anonymisation, the paper ‘10 misunderstandings related to anonymisation’, jointly issued by the European Data Protection Supervisor (EDPS) and Spain’s supervisory authority (AEPD), appears to take a balanced approach.
They admit that, ‘except for specific cases where data is highly generalised (e.g. a dataset counting the number of visitors of a website per country in a year), the re-identification risk is never zero’. With this in mind, anonymisation is understood as ‘a process that tries to find the right balance between reducing the reidentification risk and keeping the utility of a dataset for the envisaged purpose(s)’.
For the anonymisation process to be sufficiently robust, it should aim ‘to reduce the re-identification risk below a certain threshold. Such threshold will depend on several factors such as the existing mitigation controls (none in the context of public disclosure), the impact on individuals’ privacy in the event of reidentification, the motives and the capacity of an attacker to re-identify the data’.
Thus, the AEPD and EDPS appear to lean towards the UK’s approach, not requiring anonymisation to be absolute. They admit that it would be wrong simply to label data as ‘anonymous’ or not, as anonymity is not a binary concept. There is always an ‘extent’ of anonymisation and a re-identification risk, which might change and ‘which should be managed and controlled over the time’.
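The idea of keeping re-identification risk ‘below a certain threshold’ can be illustrated with a rough Python sketch. It assumes a simple worst-case metric (one divided by the size of the smallest group of records sharing the same quasi-identifier values) and a made-up threshold of 0.25. Neither the metric, the threshold value, nor the field names come from the AEPD-EDPS paper, which leaves the threshold to depend on context.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk estimate: 1 / size of the smallest group of records
    that share the same values for the given quasi-identifiers.

    Assumed metric for illustration only; real assessments would also
    weigh attacker motives, impact and existing controls.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1 / min(groups.values())


# Hypothetical generalised dataset (age bands and regions, no names).
records = [
    {"age_band": "30-39", "region": "North"},
    {"age_band": "30-39", "region": "North"},
    {"age_band": "30-39", "region": "North"},
    {"age_band": "30-39", "region": "North"},
    {"age_band": "40-49", "region": "South"},
    {"age_band": "40-49", "region": "South"},
]

THRESHOLD = 0.25  # hypothetical value chosen by the controller

risk = max_reidentification_risk(records, ["age_band", "region"])
below_threshold = risk <= THRESHOLD
```

In this sample the smallest group has two records, so the estimated risk is 0.5 and the check fails; a controller following this logic would generalise the data further or add controls, and would repeat the check as circumstances change.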
What did the EDPB say?
In its Guidelines 04/2020 ‘On the use of location data and contact tracing tools in the context of the COVID-19 outbreak’, the EDPB devoted several paragraphs to anonymisation in the context of tracing tools and location data:
Anonymisation refers to the use of a set of techniques in order to remove the ability to link the data with an identified or identifiable natural person against any “reasonable” effort. This “reasonability test” must take into account both objective aspects (time, technical means) and contextual elements that may vary case by case (rarity of a phenomenon including population density, nature and volume of data). If the data fails to pass this test, then it has not been anonymised and therefore remains in the scope of the GDPR.
Translating this wording into the language of the AEPD-EDPS paper, it might be concluded that if the anonymised dataset withstands “reasonable” effort, then the risk of reidentification has probably been ‘reduced below a certain threshold’, thus rendering the data anonymous.
However, unlike the AEPD-EDPS paper, the EDPB does not expressly address the issue of residual risks of reidentification, and this matter remains unclear.
It would be fair to say that the privacy industry is still waiting for the EDPB to state a more comprehensive and considered position that summarises all the approaches and concerns voiced previously.
Until then, data controllers seem to have several ways to approach anonymisation:
(i) Refuse to anonymise data amid legal uncertainty;
(ii) Take into account the opinions of local supervisory authorities and, where possible, adopt robust anonymisation techniques and apply them on a case-by-case basis, arguing that the risks of identifying individuals are reduced to a sufficiently remote level. This approach means that a controller will need to evaluate the risks of re-identification, which, as we know, are never zero and which might change over time. This clearly leads companies to yet another dynamic ‘impact assessment’. As companies are often already overwhelmed with other ‘impact assessments’ (DPIA, Transfer Impact Assessment, Legitimate Interest Assessment, etc.), this may call the feasibility of anonymisation as such into question;
(iii) Explore the functionality of synthetic data. For example, in 2020, France’s privacy watchdog (CNIL) approved an anonymisation technique built on the AI-based method of avatars;
(iv) Enter into a WP29-approved trusted third party (TTP) arrangement ‘in situations where a number of organisations each want to anonymise the personal data they hold for use in a collaborative project’.