News - Computational Privacy Group, Imperial College London

News from the Computational Privacy Group at Imperial College London

Best paper award at SaTML 2025

Apr 15, 2025

We were lucky enough to win the Best Paper Award at SaTML 2025 with our Systemization of Knowledge (SoK) on Membership Inference Attacks (MIAs) against LLMs.

Yves-Alexandre de Montjoye presenting at ELSA Workshop

Mar 17, 2025

Yves-Alexandre de Montjoye will be presenting at the ELSA Workshop on Privacy-Preserving Machine Learning taking place on 17-21 March, 2025. The workshop brings together researchers and practitioners to discuss recent developments in privacy-preserving machine learning techniques and foster collaboration between attendees.

Yves-Alexandre de Montjoye at Dagstuhl Seminar

Mar 14, 2025

Yves-Alexandre de Montjoye was invited to give a talk at the Dagstuhl Seminar "PETs and AI: Privacy Washing and the Need for a PETs Evaluation Framework."

Achilles’ Heels: Vulnerable Record Identification in Synthetic Data Publishing cited in the International AI Safety Report 2025

Jan 30, 2025

Our paper "Achilles’ Heels: Vulnerable Record Identification in Synthetic Data Publishing" authored by Matthieu Meeus, Florent Guépin, Ana-Maria Creţu, and Yves-Alexandre de Montjoye has been cited in the International AI Safety Report 2025. The report is a joint effort by the UK Government and the Alan Turing Institute, and is the first comprehensive report on the safety of AI systems.

New paper "A scaling law to model the effectiveness of identification techniques" published in Nature Communications

Jan 9, 2025

Our paper "A scaling law to model the effectiveness of identification techniques" authored by Luc Rocher, Julien M. Hendrickx, and Yves-Alexandre de Montjoye has been accepted for publication in Nature Communications. The study introduces a novel mathematical framework for predicting how identification methods scale with dataset size.

"Correlation inference attacks against machine learning models" cited in an opinion piece by the EDPB

Dec 17, 2024

Our paper "Correlation inference attacks against machine learning models" authored by Ana-Maria Creţu, Florent Guépin, and Yves-Alexandre de Montjoye has been cited in an opinion piece by the European Data Protection Board (EDPB) on certain data protection aspects related to the processing of personal data in the context of AI models.

ChocoLlama: A Flemish AI Model featured in De Tijd

Dec 16, 2024

ChocoLlama, a Flemish AI model developed by Matthieu Meeus and Anthony Rathé, was featured in De Tijd.

Distinguished paper award at ACM CCS '24

Oct 18, 2024

Our paper QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems, co-authored by Bozhidar Stevanoski, Ana-Maria Cretu, and Yves-Alexandre de Montjoye, has received a Distinguished Paper Award at ACM CCS 2024!

CPG at ICML 2024

Jul 22, 2024

Matthieu Meeus and Igor Shilov introduced their paper Copyright Traps for Large Language Models at ICML 2024 in Vienna.

London Data Week session at Imperial

Jul 17, 2024

Nataša Krčo and Igor Shilov led a session about exploring the robustness of modern data privacy systems.

CPG at Data For Policy 2024 Conference

Jul 11, 2024

Dr Yves-Alexandre de Montjoye hosted a session on using technology to detect illegal content and assessing the robustness of modern data privacy mechanisms.

Best paper award at the ACM CODASPY '24 conference

Jun 20, 2024

Our paper, Re-pseudonymization Strategies for Smart Meter Data Are Not Robust to Deep Learning Profiling Attacks, co-authored by Ana-Maria Cretu, Miruna Rusu, and Yves-Alexandre de Montjoye, has received a Best Paper Award at the ACM CODASPY '24 conference!

CPG retreat 2024

Apr 20, 2024

The CPG went on a two-day retreat to the south of England.

MSc student won thesis prize

Dec 5, 2023

MSc student Xiaoxue (Yolanda) Yang has won The Corporate Partnership Programme Individual Project Prize in Computing Science for her master’s thesis, titled 'End-to-End Correlation Inference Attack Against Machine Learning Models'. Congratulations to her and her supervisors, Florent Guepin, Ana-Maria Cretu, and Yves-Alexandre de Montjoye!

CPG at CNIL Privacy Research Day

Jul 14, 2023

The CPG attended the CNIL Privacy Research Day in Paris in June 2023. Ana-Maria Crețu presented her paper on automated privacy attacks (QuerySnout), Shubham Jain presented both papers on perceptual hashing, and Florent Guépin presented his paper on correlation inference attacks.

CPG at ACM CCS 2022

Nov 10, 2022

Ana-Maria Cretu and CPG alumnus Florimond Houssiau (currently a postdoc at The Alan Turing Institute) presented their paper “QuerySnout: Automating the Discovery of Attribute Inference Attacks against Query-Based Systems” at the ACM CCS 2022 conference in Los Angeles.

New paper “Expanding the attack surface: Robust profiling attacks threaten the privacy of sparse behavioral data” published in Science Advances

Aug 19, 2022

In a new paper published in Science Advances, Arnaud J. Tournier and Yves-Alexandre de Montjoye propose an entropy-based profiling attack for location data, which shows that much more auxiliary information than previously believed is available to re-identify individuals in location data. The results show that individuals are correctly identified 79% of the time in a large location dataset of 0.5 million individuals. The proposed attack is robust to state-of-the-art noise addition and learns time-persistent profiles whose accuracy decreases only slowly over time (linearly, by roughly 1% per week).

CPG at USENIX Security 2022

Aug 10, 2022

The CPG attended the USENIX Security Symposium in Boston on 10-12 August 2022. Ana-Maria Cretu and Andrea Gadotti presented their papers on evaluating the robustness of perceptual hashing-based client-side scanning systems and on pool inference attacks against Apple's Count Mean Sketch, respectively.

New USENIX paper: “Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple's Count Mean Sketch in Practice”

Aug 10, 2022

In their new paper, Andrea Gadotti, Florimond Houssiau, Meenatchi Sundaram Muthu Selva Annamalai, and Yves-Alexandre de Montjoye investigate the practical guarantees of Apple’s implementation of local differential privacy in iOS and macOS. They propose a new type of attack, called a pool inference attack, in which an adversary has access to a user’s obfuscated data, defines pools of objects, and exploits the user’s polarized behavior in multiple data collections to infer the user’s preferred pool. The results show that pool inference attacks are a concern for data protected by local differential privacy mechanisms with a large ε, such as Apple’s Count Mean Sketch mechanism, emphasizing the need for additional technical safeguards and for more research on how to apply local differential privacy to multiple collections.

Our work was featured in John Oliver’s Last Week Tonight

Apr 11, 2022

Our work was featured in Last Week Tonight with John Oliver in their episode on data brokers. The paper appears at 11:31 and features a statistic from “Estimating the success of re-identifications in incomplete datasets using generative models”, namely that 99.98% of Americans can be correctly identified in any dataset using 15 demographic attributes.

Yves-Alexandre de Montjoye at Digital Regulation Co-operation Forum (DRCF) E2EE roundtable

Jan 27, 2022

On January 27th, 2022, Yves-Alexandre participated in a roundtable of the Digital Regulation Co-operation Forum (DRCF) on perceptual hashing-based client-side scanning and presented CPG’s work on adversarial detection avoidance attacks.

New paper “Interaction data are identifiable even across long periods of time” published in Nature Communications

Jan 25, 2022

A new Nature Communications paper by Ana-Maria Crețu, Federico Monti, Stefano Marrone, Xiaowen Dong, Michael Bronstein, and Yves-Alexandre de Montjoye reveals that data about people’s interactions can be used to identify individuals in anonymous datasets. The paper shows that the learned profiles are stable and that people’s behavior is still identifiable over a long period of time. The results provide strong evidence that disconnected and even re-pseudonymized interaction data can be linked together, making them personal data under the European Union’s General Data Protection Regulation (GDPR).

New Nature Communications paper: “On the difficulty of achieving Differential Privacy in practice: user-level guarantees in aggregate location data”

Jan 10, 2022

Florimond Houssiau, Luc Rocher, and Yves-Alexandre de Montjoye show in this short paper that the privacy guarantees given by Google for its shared aggregated data from 300M Google Maps users are incorrect.

Contributed talk at the ACM CCS Privacy-Preserving Machine Learning workshop (PPML 2021)

Nov 19, 2021

Our workshop paper titled “Interaction data are identifiable even across long periods of time” (a long version of which will be published soon in Nature Communications) was accepted as a *contributed talk* at the PPML 2021 workshop. A recording of the talk given by lead author Ana-Maria Creţu is available on YouTube.

Ana-Maria Crețu and Shubham Jain at the Conference on Applied Machine Learning in Information Security (CAMLIS)

Nov 5, 2021

Ana-Maria Crețu, Shubham Jain, and Yves-Alexandre de Montjoye’s paper on the robustness of perceptual hashing-based client-side scanning to detection avoidance attacks was selected for an *oral presentation* at the Conference on Applied Machine Learning in Information Security (CAMLIS 2021). Ana-Maria gave the talk on Nov 4, 2021.

New USENIX Security paper: “Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning”

Sep 28, 2021

In their new paper due to appear at USENIX Security 2022, Shubham Jain, Ana-Maria Crețu, and Yves-Alexandre de Montjoye showed perceptual hashing-based client-side scanning mechanisms to be highly vulnerable to detection avoidance attacks. The paper proposes a general black-box attack and demonstrates that >99.9% of images can be successfully modified while preserving the image content.

Shubham Jain at the Hot Topics in Privacy Enhancing Technologies (HotPETs)

Jul 16, 2021

At the HotPETs 2021 workshop at the Privacy Enhancing Technologies Symposium (PETS), Shubham Jain presented his joint work with Ana-Maria Crețu and Yves-Alexandre de Montjoye on the vulnerability of perceptual hashing-based client-side scanning mechanisms to detection avoidance attacks.

CPG paper “Unique in the Crowd” mentioned in MTD to the US District Court for the Central District of California on e-scooter location data

Jun 9, 2021

The CPG paper “Unique in the Crowd: The privacy bounds of human mobility” was mentioned in a motion to dismiss (MTD) to the US District Court for the Central District of California in “Justin Sanchez, et al. v. Los Angeles Department of Transportation, et al.”

“The risk of re-identification remains high even in country-scale location datasets” published in Cell Patterns

Mar 12, 2021

“The risk of re-identification remains high even in country-scale location datasets” by Ali Farzanehfar, Florimond Houssiau, and Yves-Alexandre de Montjoye appeared today in Cell Patterns. The paper measures, mathematically models, and provides a lower bound on the relationship between the size of a dataset and the risk of re-identification as measured by unicity. The results show that the risk of re-identification decreases very slowly with increasing dataset size, contradicting previous claims.
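
For readers unfamiliar with unicity, the sketch below illustrates one common way to estimate it: the share of individuals whose trace is the only one in the dataset containing p points drawn at random from it. This is a minimal illustration and not the paper's code. The function name `unicity`, the trace format, and the toy data are all assumptions made for this example.

```python
import random


def unicity(traces, p, seed=0):
    """Estimate unicity for p points (illustrative sketch, hypothetical API).

    traces maps a user id to a set of hashable points, e.g. (antenna, hour).
    A user counts as unique if no other trace contains all p points
    sampled at random from that user's own trace.
    """
    rng = random.Random(seed)
    # Only users with at least p points can be tested.
    eligible = {u: t for u, t in traces.items() if len(t) >= p}
    if not eligible:
        return 0.0
    unique = 0
    for trace in eligible.values():
        # Sort first so that sampling is reproducible for a given seed.
        sample = set(rng.sample(sorted(trace), p))
        # Count every trace (including the user's own) containing the sample.
        matches = sum(1 for other in traces.values() if sample <= other)
        if matches == 1:
            unique += 1
    return unique / len(eligible)


# Toy data, made up for illustration: each point is (antenna, hour).
toy_traces = {
    "a": {("A1", 8), ("A2", 12), ("A3", 18)},
    "b": {("A1", 8), ("A2", 12), ("A4", 18)},
    "c": {("A5", 9), ("A6", 13), ("A7", 19)},
}
print(unicity(toy_traces, p=3))
```

Running the same function on nested samples of increasing size is one way to observe how unicity changes as a dataset grows.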

Andrea Gadotti in a Computerphile video

Jan 22, 2021

Andrea was invited by Computerphile to present the anonymity problems in location data. The full video is available on YouTube.

Evaluating COVID-19 contact tracing apps? Here are 8 privacy questions we think you should ask.

Apr 2, 2020

While governments are ramping up their efforts to slow down the spread of COVID-19, contact tracing apps are being developed to record interactions and warn users if one of their contacts is later diagnosed positive. These apps could help avoid long-term confinement, but also record fine-grained location or close-proximity data. In this blog post, we propose 8 questions one should ask to understand how protective of privacy an app is.

Can we fight COVID-19 without resorting to mass surveillance?

Mar 21, 2020

Used correctly, mobile phone data could help monitor the effectiveness of lockdown measures and track contacts of people who have tested positive. We've been asked if the data could be collected and used effectively without enabling mass surveillance. This is our response.

WEF 2020 panel: Can AI and Privacy co-exist?

Jan 15, 2020

Yves-Alexandre is organizing a panel in Davos on ‘Europe’s digital leadership: can AI and privacy co-exist?’

New USENIX Security paper: "When the Signal is in the Noise: Exploiting Diffix's Sticky Noise"

Aug 15, 2019

Our noise-exploitation attack against Aircloak's Diffix system was presented by Andrea and Luc at USENIX Security 2019! The paper is available on the USENIX website, together with the slides and the video from the presentation. For more details about this paper, you can read our blog post and an article on TechCrunch.

New Nature Communications paper on the risks of re-identification in incomplete datasets

Jul 23, 2019

In a new paper published in Nature Communications, Luc and Yves-Alexandre show how the incompleteness of datasets does not provide plausible deniability to participants. Contradicting previous claims, they show that sampling does not decrease the risk of re-identification.

Andrea Gadotti at Westminster for Evidence Week

Jun 28, 2019

On 17 June 2019, Andrea Gadotti presented CPG’s research at Westminster as part of the "In conversation with the National Statistician" event. On 26 June, he presented at Evidence Week, organised by Sense About Science.

Ali Farzanehfar at European Commission for session on anonymization

Jun 26, 2019

On 26 June 2019, Ali Farzanehfar was invited to the European Commission in Brussels to present the group's work on data anonymization. The event was part of the European Commission’s Connect Summer School (DG CONNECT) on pressing matters in the modern world.

Andrea Gadotti to present at Germany's Data Ethics Commission in Berlin

May 26, 2019

On 9 May 2019, Andrea Gadotti was invited to the Ministry of the Interior in Berlin to speak at the round table meeting of the Data Ethics Commission. The meeting was live streamed on the Ministry's website and is now available on YouTube (Andrea's presentation starts at min 27:56).

New WWW Demo Paper: "UNVEIL: Capture and Visualise WiFi Data Leakages"

Feb 14, 2019

Our work on capturing and visualising the data leaked by mobile devices' WiFi has been accepted at The Web Conference 2019 (WWW '19).

'Data is a fingerprint': why you aren't as anonymous as you think online

Jul 13, 2018

Olivia Solon writing for the Guardian on the (sharp) limits of data anonymization and ways forward includes quotes from CPG group leader, Yves-Alexandre de Montjoye.

When the signal is in the noise: Exploiting Aircloak's Diffix anonymization mechanism

Apr 24, 2018

We studied Diffix, a system developed and commercialized by Aircloak to anonymise data by adding noise to SQL queries sent by analysts. In a manuscript we just published on arXiv, we show that Diffix is vulnerable to a noise-exploitation attack. In short, our attack uses the noise added by Diffix to infer people’s private information with high accuracy. We share the opinion of Diffix’s creators that it is time to take a fresh look at building practical anonymization systems.

Cambridge Analytica is only the beginning. Should you blame your friends for it?

Mar 29, 2018

Recent revelations from Cambridge Analytica show how vulnerable our privacy is to seemingly innocuous apps installed by our friends. Here we show how our privacy is affected by the people we interact with. Node-based intrusions are becoming one of the main threats to our privacy.

Solving AI's Privacy Problem

Feb 16, 2018

Artificial Intelligence (AI) has the potential to fundamentally change the way we work, live, and interact. There is, however, no general AI out there, and the accuracy of current machine learning models largely depends on the data on which they have been trained. For the coming decades, the development of AI will depend on access to ever larger and richer medical and behavioral datasets. We now have strong evidence that de-identification, the tool we have used historically to find a balance between using the data in aggregate and protecting people’s privacy, does not scale to big-data datasets. The development and deployment of modern privacy-enhancing technologies (PETs), allowing data controllers to make data available in a safe and transparent way, will be key to unlocking the great potential of AI.