Evani Radiya-Dixit, Sanghyun Hong, Nicholas Carlini and Florian Tramèr
International Conference on Learning Representations (ICLR) 2022
Previously presented at ICML 2021 Workshop on Adversarial Machine Learning (AdvML) (Oral presentation)
Data poisoning has been proposed as a compelling defense against facial recognition models trained on Web-scraped pictures. Users can perturb images they post online, so that models will misclassify future (unperturbed) pictures.
We demonstrate that this strategy provides a false sense of security, as it ignores an inherent asymmetry between the parties: users’ pictures are perturbed once and for all before being published (at which point they are scraped) and must thereafter fool all future models—including models trained adaptively against the users’ past attacks, or models that use new technologies discovered after the attack.
We evaluate two systems for poisoning attacks against large-scale facial recognition, Fawkes (500,000+ downloads) and LowKey. We demonstrate how an "oblivious" model trainer can simply wait for future developments in computer vision to nullify the protection of pictures collected in the past. We further show that an adversary with black-box access to the attack can (i) train a robust model that resists the perturbations of collected pictures and (ii) detect poisoned pictures uploaded online.
We caution that facial recognition poisoning will not admit an "arms race" between attackers and defenders. Once perturbed pictures are scraped, the attack cannot be changed, so any future successful defense irrevocably undermines users’ privacy.
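The perturb-then-publish step at the heart of these attacks can be sketched as a bounded, gradient-guided image modification. The snippet below is a minimal FGSM-style illustration only: Fawkes and LowKey use stronger feature-space optimizers, and the `grad` argument here is a hypothetical stand-in for the gradient of a surrogate face-recognition loss.

```python
import numpy as np

def poison_image(image, grad, epsilon=8 / 255):
    """Illustrative FGSM-style poisoning perturbation.

    image: float array in [0, 1], the user's picture before upload.
    grad:  stand-in for the gradient of a surrogate recognition loss
           w.r.t. the image (the real systems compute this with a
           feature extractor; here it is just an input).
    epsilon: per-pixel perturbation budget.
    """
    perturbed = image + epsilon * np.sign(grad)
    return np.clip(perturbed, 0.0, 1.0)

# Toy example with a random "image" and surrogate gradient.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
grad = rng.standard_normal((32, 32, 3))
poisoned = poison_image(image, grad)
```

Note the asymmetry the paper highlights: once `poisoned` is published and scraped, `epsilon` and the perturbation itself are frozen, while the model trainer remains free to change feature extractors or training procedures later.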
@inproceedings{RHCT22,
  author       = {Radiya-Dixit, Evani and Hong, Sanghyun and Carlini, Nicholas and Tram{\`e}r, Florian},
  title        = {Data Poisoning Won't Save You From Facial Recognition},
  booktitle    = {International Conference on Learning Representations (ICLR)},
  year         = {2022},
  howpublished = {arXiv preprint arXiv:2106.14851},
  url          = {https://arxiv.org/abs/2106.14851}
}