As organisations accelerate efforts to deploy AI, access to large volumes of training data has become essential to maintaining a competitive edge. But this raises a fundamental legal question: in the context of the GDPR, how far can businesses go when using datasets that appear de-identified or pseudonymised, particularly when training AI models?
On 4 September 2025, the Court of Justice of the European Union ("CJEU") issued its judgment in European Data Protection Supervisor ("EDPS") v Single Resolution Board ("SRB"), which introduces a potentially significant shift in how this question should be answered. It suggests that, in some cases, certain pseudonymised data may fall outside the scope of the GDPR, depending on the recipient's perspective and on who holds the means to re-identify the individuals concerned.
This represents a departure from earlier assumptions (at least at a European level) that such data is personal whenever any party holds the key, and supports a more contextual, relative (and practical) approach to identifiability. That approach is more in line with the ICO's position in the UK (indeed, the ICO has relied on the CJEU's judgment in its appeal against an Upper Tribunal ruling before the Court of Appeal). The shift also arguably aligns with the EU's broader digital strategy of facilitating responsible data sharing.
This position is likely to benefit organisations training AI models, particularly where they receive large datasets from third parties, as it supports the argument that, in some scenarios, such data may fall outside the scope of the GDPR, provided the organisation lacks the means to re-identify individuals.
The key takeaway from the judgment is that pseudonymised data may not be considered personal data for a particular recipient, depending on that party's ability to re-identify individuals.
Legal backdrop
As a reminder, the GDPR defines personal data as information relating to an identified or identifiable individual.
CJEU case law and subsequent guidance from the European Data Protection Board ("EDPB") have further clarified that data can be personal if any party (even a third party) holds the means to re-identify individuals (and is reasonably likely to use those means).
For data to be anonymised and fall outside the scope of the GDPR, it must be processed in such a way that individuals are no longer identifiable, taking account of all the means 'reasonably likely to be used' by anyone to identify them.
Pseudonymised data falls somewhere in between personal and anonymous data: it does not directly identify someone (so is often confused with anonymous data), but it can still be attributed to an individual using additional information. For example, where an online retailer replaces customer names with customer numbers while keeping a separate list that links each number back to the real customer, the data will still be treated as personal data for the purposes of the GDPR, as it is not fully anonymised.
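To make the distinction concrete, the sketch below (a minimal, hypothetical illustration in Python; the record fields and the key_table name are our own, not drawn from the judgment) shows how keyed pseudonymisation of the retailer example might work: direct identifiers are replaced with random codes, while a separate key table preserves the link back to real customers.

```python
import secrets

# Hypothetical customer records containing a direct identifier (the name)
orders = [
    {"name": "Alice Example", "order": "3 widgets"},
    {"name": "Bob Example", "order": "1 gadget"},
]

key_table = {}        # code -> name: the 'key', kept separately by the retailer
pseudonymised = []    # the dataset that would be shared with a third party

for record in orders:
    code = secrets.token_hex(4)           # random customer number, e.g. 'a1b2c3d4'
    key_table[code] = record["name"]      # retained only by the sender
    pseudonymised.append({"customer_no": code, "order": record["order"]})

# The retailer (holding key_table) can re-identify every customer, so for it
# the data remains personal; a recipient holding only `pseudonymised` cannot.
```

On the CJEU's reasoning, the same pseudonymised dataset may therefore be personal data in the retailer's hands yet, depending on the circumstances, effectively anonymous from the perspective of a recipient with no realistic means of obtaining the key.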
Working out whether data is pseudonymised or fully anonymised is also problematic for organisations, with some experts considering that total anonymisation is hard to achieve.
So how does the EDPS case help us DP practitioners?
Background
- In 2017, the European Central Bank announced that Banco Popular Español was failing.
- SRB (an institution set up as part of European measures to regulate banks following the Eurozone crisis) then issued a resolution decision (before deciding on potential compensation for affected shareholders/creditors).
- As part of its process, the SRB collected comments from affected individuals (including survey-style responses), pseudonymised that data and then shared it with Deloitte.
- SRB tagged each comment with an alphanumeric code (i.e., the 'key' to the data) and removed directly identifiable data prior to sharing the data with Deloitte. In other words, SRB had the ability to re-identify individuals to whom the comments related via the key, whereas Deloitte did not.
- Several individuals lodged complaints with the EDPS, arguing that SRB failed to name Deloitte as a recipient of personal data, contrary to its transparency obligations under Article 15 EUI GDPR (i.e., the version of the GDPR that applies to EU institutions).
- The EDPS concluded that the data shared with Deloitte was pseudonymous (and personal) data, on the basis that SRB could re-identify individuals. Consequently, SRB was found to be in breach of its transparency obligations.
- SRB successfully challenged the EDPS' decision before the General Court ("GC"). The GC annulled the EDPS' decision that the data was pseudonymous (and personal) on several grounds, including that the EDPS:
- failed to properly apply the 'content, purpose and effect' test (drawn from previous case law), which requires an assessment of whether the information, by its content, the purpose for which it is processed, or its effects, relates to an identified or identifiable individual, rather than assuming that data necessarily relates to individuals merely because it concerns their views or opinions; and
- wrongly failed to assess identifiability from the perspective of Deloitte (and, as Deloitte did not hold the key to the data, it followed that Deloitte was not a recipient of personal data).
We provided an initial summary of the GC's judgment shortly after its publication, with further analysis available here.
So, what happened here?
The CJEU disagreed in part with the GC's reasoning, particularly in relation to how the EDPS assessed whether the data in question related to identifiable individuals. Drawing on the Nowak judgment, the Court confirmed that "the particular nature of personal opinions or views [are] as an expression of a person's thinking... necessarily closely linked to that person". As such, the EDPS was entitled to conclude that the comments provided by affected individuals related to those individuals, without needing to apply the 'content, purpose and effect' test.
The CJEU did uphold the GC's finding that the identifiability of data must be assessed from the perspective of the recipient, stating that:
"Pseudonymisation may, depending on the circumstances of the case, effectively prevent persons other than the controller from identifying the data subject, in such a way that, for them, the data subject is not or is no longer identifiable." [para 86]
Therefore, even where the sender of the pseudonymised data (here, the SRB) holds the key to re-identify that data, that same data may be anonymous to the recipient.
That said, the judgment confirmed that transparency obligations under the GDPR still applied. The SRB retained the ability to re-identify individuals and therefore remained a controller of personal data. As such, it was required to name Deloitte as a recipient under its transparency obligations, even though, from Deloitte's own standpoint, Deloitte received only de-identified data.
EU vs UK: a convergence, but not (yet) a merger
This decision moves the European position much closer to the UK position. Previous European case law had established a long-standing orthodoxy, suggesting that pseudonymous data remained personal data so long as anyone, anywhere, held the key (having regard to the means 'reasonably likely to be used').
Indeed, the draft EDPB guidelines on pseudonymisation even state that:
"Pseudonymised data, which could be attributed to a natural person by the use of additional information, is to be considered information on an identifiable natural person, and is therefore personal. This statement also holds true if pseudonymised data and additional information are not in the hands of the same person." (para 22)
This approach has always contrasted with the UK ICO's more pragmatic position; the ICO's guidance on anonymisation indicates that:
"Pseudonymised data is personal data in the hands of someone who holds the additional information"
In other words, identifiability can be assessed relative to each party: data could be personal for one organisation (to the extent that it holds the 'key' enabling re-identification) but anonymous for another (i.e., an organisation that does not hold the key).
At first glance, the EDPS v SRB judgment appears more closely aligned with the UK's relative approach to identifiability, which will be relevant for organisations subject to both the EU and UK GDPR. We therefore expect the EDPB to update its position when it finalises its guidelines. It is, of course, possible that the EDPB may instead double down on its current approach in the finalised guidelines and keep pseudonymised data (subject to limited exceptions flowing from the 'reasonably likely' analysis) squarely within the realm of the personal data definition, which would limit the practical headroom this judgment appears to open.
What this means for processing data and particularly for training AI models
This judgment is likely to be welcomed by DP practitioners and organisations more generally, as it provides clarity and some alignment between the UK and EU positions. It will also be significant for organisations developing AI models using large volumes of training data. That said, AI use cases may complicate the analysis. In particular, the ability of models to memorise, infer, or reproduce training data may affect whether individuals are in fact identifiable in practice, and therefore whether data can properly be treated as anonymous for GDPR purposes.
Going forward, where an organisation holds the means to re-identify individuals, such as a 'key', the data remains personal data from that organisation's perspective, even if the recipient cannot re-identify anyone. This means that GDPR obligations will still apply to the organisation that holds the key, including, in AI contexts, carrying out a data protection impact assessment. However, if an organisation does not hold the key, the data may not be considered personal from its perspective, which will no doubt be welcomed by recipients of such datasets.
Of course, even if data is not considered personal, organisations that receive data which may be regarded as anonymous must still consider their obligations under other regulatory frameworks, such as the EU AI Act and the Data Act.
While the CJEU's judgment introduces a more practical and contextual interpretation of identifiability, the key question is how this will be challenged or clarified in the AI context, given the widespread global reliance on large volumes of de-identified data to train, test, and validate AI models. Whether the EDPB will update its guidance to reflect this shift remains to be seen.
