LLMs' unintended memories
A year ago, ChatGPT surprised the world with its extraordinary language generation capabilities. Chatbots have since become one of the fastest-adopted consumer products in history, with investments in generative AI forecast to reach $12B this year.
In this talk, I will first review the fast-evolving literature on the document-level membership inference task for LLMs: the methods proposed to detect, a posteriori, whether a specific piece of text was seen by an LLM during pre-training or fine-tuning and at least partially memorized, the distribution shift concerns, and some of the solutions proposed.
I will then discuss the use of randomized controlled setups both to study LLM memorization and to infer membership using copyright traps. In particular, I will discuss how randomized controlled setups have shed light on the determinants of memorization and how unique, synthetically generated trap sequences can be injected into content to enable membership inference.
I will conclude the talk with some thoughts on the security and privacy challenges ahead when it comes to LLMs.
Bio: Yves-Alexandre de Montjoye is an Associate Professor of Applied Mathematics and Computer Science at Imperial College London, where he leads the Computational Privacy Group. He is currently serving as a Special Adviser on AI and Data Protection to E.C. Justice Commissioner Reynders. His research has been published in journals such as Science, PNAS, and Nature Communications and at conferences such as ICML, IEEE S&P, and ACM CCS, and has enjoyed wide media coverage (e.g. BBC, CNN, New York Times, Wall Street Journal, Harvard Business Review). His work on the shortcomings of anonymization has been widely influential, appearing in reports of the World Economic Forum, FTC, European Commission, and the OECD. Yves-Alexandre worked for the Boston Consulting Group and was a Special Adviser to E.C. Competition Commissioner Vestager, co-authoring the Competition Policy for the Digital Era report. He received his PhD from MIT in 2015 and obtained an M.Sc. from UCLouvain in Applied Mathematics, an M.Sc. (Centralien) from Ecole Centrale Paris, an M.Sc. from KULeuven in Mathematical Engineering, as well as his B.Sc. in engineering from UCLouvain.
Practical information
- General public
- Free
Organizer
- Professor Carmela Troncoso
Contact
- Professor Carmela Troncoso