News - Computational Privacy Group, Imperial College London

News from the Computational Privacy Group at Imperial College London










New paper “Expanding the attack surface: Robust profiling attacks threaten the privacy of sparse behavioral data” published in Science Advances

Aug 19, 2022

In a new paper published in Science Advances, Arnaud J. Tournier and Yves-Alexandre de Montjoye propose an entropy-based profiling attack for location data, showing that much more auxiliary information than previously believed is available to re-identify individuals. The results show that individuals are correctly identified 79% of the time in a large location dataset of 0.5 million people. The proposed attack is robust to state-of-the-art noise addition and learns time-persistent profiles whose accuracy decreases only slowly over time (linearly, by roughly 1% per week).
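The core idea can be illustrated with a minimal sketch: build a probability distribution over visited locations for each known user, then match an unlabelled trace to the user whose profile gives it the lowest cross-entropy. This is only an illustration of entropy-based profile matching, not the method from the paper; the function names and toy data below are hypothetical.

```python
# Illustrative sketch only: matching a location trace to per-user profiles
# via cross-entropy. This is a simplification, not the method from the paper.
from collections import Counter
import math

def profile(trace, locations, alpha=1.0):
    """Smoothed probability distribution over locations for one user."""
    counts = Counter(trace)
    total = len(trace) + alpha * len(locations)
    return {loc: (counts[loc] + alpha) / total for loc in locations}

def cross_entropy(trace, prof):
    """Average negative log-likelihood of a trace under a profile."""
    return -sum(math.log(prof[loc]) for loc in trace) / len(trace)

# Hypothetical toy data: reference traces (known identity) and a fresh trace.
locations = ["home", "work", "gym", "cafe", "station"]
reference = {
    "user_a": ["home", "work", "work", "home", "cafe"],
    "user_b": ["gym", "station", "gym", "home", "station"],
}
unknown_trace = ["work", "home", "cafe", "work"]

profiles = {u: profile(t, locations) for u, t in reference.items()}
best_match = min(profiles, key=lambda u: cross_entropy(unknown_trace, profiles[u]))
print(best_match)  # expected: user_a
```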



New USENIX paper: “Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple's Count Mean Sketch in Practice”

Aug 10, 2022

In their new paper, Andrea Gadotti, Florimond Houssiau, Meenatchi Sundaram Muthu Selva Annamalai, and Yves-Alexandre de Montjoye investigate the practical guarantees of Apple’s implementation of local differential privacy in iOS and macOS. They propose a new class of attacks, called pool inference attacks, in which an adversary has access to a user’s obfuscated data, defines pools of objects, and exploits the user’s polarized behavior across multiple data collections to infer the user’s preferred pool. The results show that pool inference attacks are a concern for data protected by local differential privacy mechanisms with a large ε, such as Apple’s Count Mean Sketch mechanism, emphasizing the need for additional technical safeguards and for more research on how to apply local differential privacy across multiple collections.
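To illustrate the intuition, the sketch below uses k-ary randomized response as a stand-in for Count Mean Sketch: a user's obfuscated reports from many collections are aggregated per pool, and the pool that accumulates the most reports is inferred as the preferred one. This is a simplified illustration, not Apple's mechanism or the paper's attack; the objects, pools, and ε value are hypothetical.

```python
# Minimal sketch of a pool inference attack against a local DP mechanism.
# k-ary randomized response is used here as a stand-in for Count Mean Sketch;
# the setup (objects, pools, epsilon) is hypothetical and simplified.
import math
import random

objects = list(range(100))             # universe of objects (e.g., emojis)
pools = {"A": set(range(0, 50)), "B": set(range(50, 100))}
epsilon = 4.0                          # large epsilon, as in deployed systems
k = len(objects)
p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)

def randomize(obj):
    """k-ary randomized response: report the true object with probability p_true."""
    if random.random() < p_true:
        return obj
    return random.choice([o for o in objects if o != obj])

# A user with polarized behavior: most true records come from pool A.
true_records = random.choices(list(pools["A"]), k=180) + \
               random.choices(list(pools["B"]), k=20)
reports = [randomize(o) for o in true_records]  # what the server observes

# Attacker: count obfuscated reports falling in each pool and pick the largest.
scores = {name: sum(r in members for r in reports) for name, members in pools.items()}
print(max(scores, key=scores.get))  # expected: "A" most of the time
```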




New paper “Interaction data are identifiable even across long periods of time” published in Nature Communications

Jan 25, 2022

A new Nature Communications paper by Ana-Maria Crețu, Federico Monti, Stefano Marrone, Xiaowen Dong, Michael Bronstein, and Yves-Alexandre de Montjoye reveals that data about people’s interactions can be used to identify individuals in anonymous datasets. The paper shows that the learned profiles are stable and that people’s behavior remains identifiable over long periods of time. The results provide strong evidence that disconnected and even re-pseudonymized interaction data can be linked together, making them personal data under the European Union’s General Data Protection Regulation (GDPR).





New USENIX Security paper: “Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning”

Sep 28, 2021

In their new paper, due to appear at USENIX Security 2022, Shubham Jain, Ana-Maria Crețu, and Yves-Alexandre de Montjoye show that perceptual hashing-based client-side scanning mechanisms are highly vulnerable to detection avoidance attacks. The paper proposes a general black-box attack and demonstrates that >99.9% of images can be successfully modified to evade detection while preserving the image content.
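As a rough illustration of the detection avoidance idea, the sketch below perturbs an image until its hash moves away from a flagged hash. It uses a toy average hash rather than any production perceptual hash and is not the attack from the paper; the image data, threshold, and step size are hypothetical.

```python
# Rough illustration of detection avoidance against a toy perceptual hash.
# The hash below is a basic average hash, not the paper's attack nor any
# production algorithm (e.g. NeuralHash or PhotoDNA); the "image" is a
# random array and all parameters are hypothetical.
import numpy as np

def average_hash(img, grid=8):
    """Toy perceptual hash: threshold block averages at their overall mean.
    Assumes image dimensions are divisible by `grid`."""
    h, w = img.shape
    blocks = img.reshape(grid, h // grid, grid, w // grid).mean(axis=(1, 3))
    return (blocks > blocks.mean()).flatten()

def hamming(a, b):
    return int(np.sum(a != b))

rng = np.random.default_rng(0)
original = rng.uniform(0, 255, size=(64, 64))
flagged_hash = average_hash(original)          # hash the scanner looks for

# Black-box style loop: keep adding small perturbations until the hash of the
# modified image is far enough from the flagged hash to avoid a match.
modified, threshold, step = original.copy(), 5, 2.0
while hamming(average_hash(modified), flagged_hash) <= threshold:
    modified = np.clip(modified + rng.normal(0, step, modified.shape), 0, 255)

print(hamming(average_hash(modified), flagged_hash))  # now above the threshold
print(float(np.abs(modified - original).mean()))      # average per-pixel change
```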




“The risk of re-identification remains high even in country-scale location datasets” published in Cell Patterns

Mar 12, 2021

“The risk of re-identification remains high even in country-scale location datasets” by Ali Farzanehfar, Florimond Houssiau, and Yves-Alexandre de Montjoye appeared today in Cell Patterns. The paper measures, mathematically models, and provides a lower bound for the relationship between the size of a dataset and the risk of re-identification as measured by unicity. The results show that the risk of re-identification decreases very slowly as dataset size increases, contradicting previous claims.
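Unicity can be estimated empirically by sampling a few spatio-temporal points from a user's trace and checking whether those points single the user out in the dataset. The sketch below is a minimal illustration of this estimation on hypothetical toy data, not the paper's model.

```python
# Minimal sketch of empirical unicity estimation: the fraction of users who
# are uniquely pinned down by p random spatio-temporal points from their trace.
# The dataset layout and parameters are hypothetical toy values.
import random

def unicity(traces, p=4, samples=1000):
    users = list(traces)
    unique = 0
    for _ in range(samples):
        user = random.choice(users)
        if len(traces[user]) < p:
            continue
        points = set(random.sample(sorted(traces[user]), p))
        matches = [u for u in users if points <= traces[u]]
        unique += (matches == [user])   # only the sampled user matches all points
    return unique / samples

# traces: user id -> set of (location id, hour-of-week) points
traces = {
    f"user_{i}": {(random.randrange(200), random.randrange(24 * 7))
                  for _ in range(40)}
    for i in range(500)
}
print(unicity(traces, p=4))
```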



Evaluating COVID-19 contact tracing apps? Here are 8 privacy questions we think you should ask.

Apr 2, 2020

While governments are ramping up their efforts to slow down the spread of COVID-19, contact tracing apps are being developed to record interactions and warn users if one of their contacts is later diagnosed positive. These apps could help avoid long-term confinement, but they also record fine-grained location or close-proximity data. In this blog post, we propose 8 questions one should ask to understand how protective of privacy an app is.











When the signal is in the noise: Exploiting Aircloak's Diffix anonymization mechanism

Apr 24, 2018

We studied Diffix, a system developed and commercialized by Aircloak to anonymize data by adding noise to the answers of SQL queries sent by analysts. In a manuscript we just published on arXiv, we show that Diffix is vulnerable to a noise-exploitation attack. In short, our attack uses the very noise added by Diffix to infer people’s private information with high accuracy. We share the opinion of Diffix’s creators that it is time to take a fresh look at building practical anonymization systems.
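The general noise-cancellation intuition can be sketched as follows: if an analyst can issue many query variants whose answers differ only by the target's contribution plus independent noise, averaging the differences cancels the noise. The sketch below uses a deliberately simplified noise model (fresh Gaussian noise on every query), which is not how Diffix works and is not the attack in the manuscript; all names and parameters are hypothetical.

```python
# Simplified illustration of the noise-cancellation idea behind
# noise-exploitation attacks. The noise model below (independent Gaussian
# noise per query) is NOT how Diffix works, and this is not the attack from
# the manuscript; it only shows why averaging many query variants that differ
# only in the target's contribution can defeat per-query noise.
import random

def noisy_count(true_count, sigma=2.0):
    """A query interface that adds fresh noise to every query answer."""
    return true_count + random.gauss(0, sigma)

# Hypothetical data: does the target have the sensitive attribute?
target_has_attribute = True
others_with_attribute = 120

estimates = []
for _ in range(200):  # 200 different but semantically equivalent query pairs
    with_target = noisy_count(others_with_attribute + int(target_has_attribute))
    without_target = noisy_count(others_with_attribute)
    estimates.append(with_target - without_target)

# Each difference equals the target's value plus zero-mean noise; averaging
# cancels the noise and reveals the private attribute.
inferred = sum(estimates) / len(estimates)
print(round(inferred))  # expected: 1 (attribute inferred correctly)
```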



Solving AI's Privacy Problem

Feb 16, 2018

Artificial Intelligence (AI) has the potential to fundamentally change the way we work, live, and interact. There is, however, no general AI out there, and the accuracy of current machine learning models depends largely on the data on which they have been trained. For the coming decades, the development of AI will depend on access to ever larger and richer medical and behavioral datasets. We now have strong evidence that de-identification, the tool we have historically used to balance using data in aggregate with protecting people’s privacy, does not scale to big data. The development and deployment of modern privacy-enhancing technologies (PETs), allowing data controllers to make data available in a safe and transparent way, will be key to unlocking the great potential of AI.