European regulators’ views on AI models: Anonymity, legitimate interests, and liability under the GDPR
Posted: January 20, 2025
The European Data Protection Board (EDPB) has published an opinion on how the General Data Protection Regulation (GDPR) applies to AI models.
In Opinion 28/2024, the EDPB evaluates when an AI model can be considered anonymous, considers whether “legitimate interests” is a suitable legal basis for various AI-related activities, and explores some scenarios involving the development and use of unlawfully trained AI systems.
When are AI models anonymous?
The GDPR provides a broad definition of “personal data” and sets a high bar for anonymity. The EDPB opinion offers some criteria for determining whether an AI model should be considered anonymous.
AI models trained with personal data cannot be assumed anonymous by default; they may still retain information that relates to identifiable individuals.
The EDPB says that whether a model is anonymous depends on assessing the likelihood of two things:
- The possibility of directly extracting personal data from the model’s parameters or structure.
- The possibility of indirectly inferring personal data through interactions or queries to the model.
There are two main ways in which personal data might be extracted from an AI model, according to the EDPB:
- Vulnerabilities such as membership inference or model inversion attacks, which can expose personal data (a simplified membership inference probe is sketched after this list)
- The unintended regurgitation of training data in the model’s outputs or responses
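To make the first of these attack vectors concrete, here is a minimal, illustrative sketch of a loss-threshold membership inference probe, the kind of test a controller might run when assessing whether a model leaks training data. Everything below is hypothetical: the “model” is a deliberately memorizing nearest-neighbour stand-in, and none of the names or numbers come from the EDPB opinion.

```python
# Illustrative loss-threshold membership inference probe (toy example,
# not from the EDPB opinion). The idea: records the model fits
# suspiciously well (low loss) are flagged as likely training-set members.
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))        # records the model was trained on
non_members = rng.normal(size=(200, 8))  # records it has never seen

def loss(record: np.ndarray) -> float:
    # Black-box query to the toy model, which memorizes its training data
    # (like a 1-nearest-neighbour regressor): its "loss" on a record is
    # the distance to the closest memorized training record.
    return float(np.min(np.linalg.norm(train - record, axis=1)))

# Calibrate a decision threshold on records known to be non-members.
threshold = np.percentile([loss(r) for r in non_members], 1)

def likely_member(record: np.ndarray) -> bool:
    # A loss below anything seen on fresh data suggests memorization.
    return loss(record) < threshold

tp = sum(likely_member(r) for r in train)        # members correctly flagged
fp = sum(likely_member(r) for r in non_members)  # fresh records wrongly flagged
print(f"flagged {tp}/200 training records vs {fp}/200 non-members")
```

The clean separation this toy produces only arises because the stand-in model memorizes its training data outright; on a well-generalized model, member and non-member losses overlap and the probe fails, which is the kind of evidence the EDPB expects anonymity claims to rest on.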
The EDPB says that national regulators should assess any claim that an AI model is anonymous on a case-by-case basis, considering:
- The means reasonably likely to be used by potential attackers to extract or infer data.
- Technological advancements that could impact the feasibility of data extraction.
To support any claim of anonymity, AI model providers must supply evidence, such as audits and risk assessments, demonstrating the steps taken to ensure that the model does not process personal data.
‘Legitimate interests’ and AI models
The EDPB opinion explores whether controllers can rely on “legitimate interests” for various activities throughout the AI lifecycle, including training the AI model and fine-tuning the model post-deployment.
“Legitimate interests” is the most flexible legal basis for processing personal data under the GDPR, as it does not require the controller to obtain opt-in consent or to identify a legal requirement, contractual obligation, or public interest in the processing.
The EDPB does not rule out the possibility of relying on legitimate interests in the context of developing and using AI. However, the board sets a relatively high bar for compliance.
Before relying on legitimate interests, controllers must conduct a “legitimate interests assessment” to determine whether this legal basis is appropriate. The EDPB offers some analysis of how such an assessment might work in the context of AI.
- Step 1: Identify the legitimate interest. The interest must be lawful, clearly defined, and real (not speculative). In the context of AI models, legitimate interests might include improving fraud detection, developing conversational agents, or enhancing threat detection.
- Step 2: The “necessity” test. Assess whether the processing is necessary to achieve the legitimate interest, consider whether less intrusive methods could achieve the same result, and ensure the data processing complies with the data minimization principle. For example, there is no need to use an AI model that involves processing personal data if you can achieve your purposes with an anonymous AI model.
- Step 3: The “balancing test”. Evaluate whether the legitimate interest outweighs the data subject’s rights and freedoms, taking into account potential risks and the nature of the data being processed. A controller might be able to tip the balance in their favor by implementing technical measures to prevent the misuse of an AI system.
Unlawfulness and AI models
Finally, the EDPB sets out three scenarios involving the unlawful development and deployment of AI models.
In the first scenario, personal data that was unlawfully processed during the development or training phase of an AI model is embedded in the model’s parameters and is subsequently used by the same controller during deployment.
In this scenario, a regulator could order the controller to delete the training data and the model itself, as the unlawful development of the AI model could also render its deployment unlawful.
In the second scenario, personal data that was unlawfully processed during the development or training phase of an AI model remains embedded in the model’s parameters, and a different controller acquires and uses the model during deployment.
Even though the second controller did not engage in the initial unlawful processing, it is still responsible for ensuring the model’s compliance with the GDPR. If the controller fails to assess whether the model was developed unlawfully, this in itself could constitute a GDPR violation.
In the third scenario, an AI model is developed using personal data that was unlawfully processed, but the data is later anonymized before the model’s deployment. Subsequent processing of personal data may occur during the deployment phase.
If data used during the training phase is effectively anonymized, the model may no longer be subject to GDPR regarding that data. If the controller can confirm that the model has been appropriately anonymized, they will not violate the GDPR merely by deploying the model.
A rigorous approach to due diligence
The EDPB’s opinion sets a high bar for GDPR compliance—both for those developing AI models and those using them.
Because the GDPR applies only to the processing of personal data, the EDPB makes clear that anonymous AI models are not within the scope of the law. However, determining that an AI model is indeed anonymous will require careful assessment, testing, and due diligence on the part of the controller.
The scenarios presented towards the end of the opinion suggest that this holds even when a controller uses an AI model that was unlawfully trained on personal data, as long as the controller can demonstrate that the model has been effectively anonymized.
Note, however, that other laws apply to the use of AI, including consumer protection laws and the recently passed EU AI Act.