Stockholm University

Research project: Privacy-Preserving Techniques for Large Language Models

Recent breakthroughs in AI have been driven mainly by large language models. While these models can be very useful, they also threaten privacy: they can leak private information from their training data. This project aims to identify these risks and develop privacy-preserving techniques to mitigate them.

A knight in armour in a fantasy landscape.
Image: Thomas Vakili (generated with OpenAI's DALL·E 2).

Large language models (LLMs) have led to impressive breakthroughs in artificial intelligence (AI) and natural language processing (NLP). LLMs consist of enormous numbers of parameters that learn to process human language by training on vast amounts of text.

Multiple studies have shown that LLMs memorize information from their training data, and that this information can later leak. These privacy issues worsen as LLMs grow and consume more training data. The risks are especially dire in domains where data are sensitive, such as the clinical domain. At the same time, these are the domains where AI can have the most beneficial societal impact. This project aims to study the privacy threats that come with LLMs and to investigate privacy-preserving techniques for mitigating these risks. Doing so is crucial if LLMs are to be used in an ethical and legal manner.

This is Thomas Vakili’s PhD project. His main supervisor is Hercules Dalianis, and his co-supervisor is Aron Henriksson.

Project members

Project managers

Thomas Vakili

PhD student

Department of Computer and Systems Sciences

Hercules Dalianis

Professor

Department of Computer and Systems Sciences

Aron Henriksson

Associate professor

Department of Computer and Systems Sciences

Publications