Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- GANs are used for anonymization of human figures
- DeepPrivacy2 is a novel anonymization framework for realistic anonymization of human figures and faces
- A new large and diverse dataset for human figure synthesis is introduced
- A style-based GAN is proposed to produce high quality, diverse and editable anonymizations
- Full-body anonymization framework provides stronger privacy guarantees than previously proposed methods
Paper Content
Introduction
- Collecting and storing images is common
- Collecting privacy-sensitive data without anonymization or consent is troublesome
- Traditional image anonymization distorts data
- Realistic anonymization has been introduced as an alternative
- Current methods focus on face anonymization
- SG-GAN proposes full-body anonymization
- SG-GAN has limited visual quality and insufficient anonymization
- This work extends SG-GAN to address these issues
- Images are annotated with keypoints, pixel-to-vertex correspondences and a segmentation mask
- A novel anonymization framework is proposed
- GAN generates high-quality and diverse identities
- GAN is extended for face anonymization
- DeepPrivacy2 surpasses all previous state-of-the-art realistic anonymization methods
Related work
- Naive image anonymization methods are widely used but degrade image quality
- Early work focused on K-same family of algorithms which provide better privacy and data usability
- Recent work on deep generative models can realistically anonymize data while retaining usability
- Inpainting-based methods provide stronger privacy guarantees than transformative methods
- Prior work focuses on face anonymization, leaving primary and secondary identifiers untouched
- Limited amount of work focusing on full-body anonymization with low-resolution images or visual artifacts
- Recent work on full-body synthesis focuses on limited tasks such as transferring source appearances
- Ma et al. proposed a pose-guided GAN for novel full-body synthesis without a source appearance
The flickr diverse humans dataset
- FDH dataset consists of 1.53M images of human figures from YFCC100M
- Each image contains a single human figure with pixel-wise dense pose estimation, 17 keypoint annotations, and a segmentation mask
- Dataset is automatically filtered and split into 1,524,845 images for training and 30K images for validation
- FDH is much larger and contains a diverse set of individuals from in-the-wild images
The deepprivacy2 anonymization pipeline
- Ensemble detection pipeline used
- GAN-based synthesis method used
Detection
- Main objective of detection module is to detect all individuals in image
- Uses 3 detection networks from different modalities
- Categorizes detections into 3 categories
- Anonymization methods proposed for each category
- Combines instance segmentation and CSE segmentation via IoU thresholding
- Detections tracked with Kalman filtering
- Uses Mask R-CNN, CSE, and DSFD implementations from detectron2
Synthesis method
- DeepPrivacy2 uses three independently trained generators for three different detection categories.
- Each generator frames the anonymization task as an image inpainting task.
- Generator is a U-Net with limited task-specific modeling choices.
- Full-body generator is trained on the FDH dataset at a resolution of 288 × 160.
- Face generator is trained on an updated version of the FDF dataset with a resolution of 256 × 256.
- Generator does not use keypoints for face synthesis.
Recursive stitching
- Final stage of pipeline is pasting anonymized identities into original image
- Detection overlaps can generate visually annoying artifacts
- Stitching approach stitches each individual in ascending order depending on number of pixels covered
- Synthesis method handles overlapping artifacts when generating each individual
- Ordering assumes foreground objects cover larger area, stitched in last
Experimental evaluation
- Proposed anonymization pipeline is validated in terms of synthesis quality, using anonymized data for future development, and anonymization guarantees
- Compared against traditional anonymization techniques and DeepPrivacy
- Compared full-body generator to Surface Guided GANs
- Models trained with Pytorch 1.10 on 4 NVIDIA V100-32GB
- Evaluated image quality using Torch Fidelity
- Generators have 43M parameters each
- Runtime analysis: DeepPrivacy2 architecture is computationally efficient, entire pipeline requires ∼ 2.8 seconds to process an image with 12 persons on an NVIDIA 1080 8GB GPU
Synthesis quality
- Generates high-quality human figures that transition into original image
- Handles a variety of backgrounds, poses, and overlapping objects
- CSE guidance necessary for high-quality anonymization
- Generator overfits early when trained on small COCO-Body dataset
- CSE-guided generator improves over unconditional generator
- Generates higher resolution images than DeepPrivacy
- Generates higher quality faces and handles overlaps better than DeepPrivacy
- Enables controllable anonymization through text prompts
Anonymization evaluation
- Evaluating DeepPrivacy2’s anonymization guarantee by testing how well automatic re-identification tools can identify anonymized individuals
- Pixelation enables re-identification of anonymized individuals, while DeepPrivacy2 significantly improves over traditional methods
- Evaluating DeepPrivacy2 on two established computer vision benchmarks: COCO person keypoint estimation and Cityscapes instance segmentation
- Removing CSE-guidance severely hurts performance, especially when using anonymized data for training
Conclusion
- DeepPrivacy2 is an automatic realistic anonymization framework for human figures and faces
- It is a practical tool for anonymization without degrading the image quality
- Compared to previously proposed anonymization frameworks, DeepPrivacy2 substantially improves image quality and privacy guarantees
- FDH dataset is a large-scale full-body synthesis dataset with a wide variety of identities in different poses and contexts
- Simple style-based GAN improves image quality and diversity of human figure synthesis for in-the-wild images
- Simple style-based GAN generates high-quality human faces that are controllable through user-guided anonymization via text prompts
- Open-source framework is a useful tool for computer vision researchers and other entities requiring anonymization while retaining image quality
- Framework simplifies the collection of privacy-sensitive data while retaining the original image quality
- Potential for misuse (e.g. DeepFakes)
- DeepFake Detection Challenge and model watermarking to mitigate misuse
- Generator samples from a small subset of different identities given the condition
- Input condition is highly descriptive of the shape of the original identity and the context that the identity should fit into
- Set of detectors from different modalities to improve detection in cases where one or more of the detectors fail
- Dense pose description for anonymization, which allows identity recognition through gait
- Struggles in several scenarios (e.g. low resolution images, grayscale images, automatic image quality assessment, CSE vertices per part, Mask R-CNN and CSE IoU, keypoint-to-CSE correspondence)
- FDH dataset consists of 1.5M images of human figures
- FDF256 dataset consists of 242,031 training images and 6533 validation images
- Adam optimizer with batch size of 32 and learning rate of 0.002
- No data augmentation except random horizontal flip
- Confidence threshold of 0.1 for Mask R-CNN, 0.5 for the face detector and 0.3 for the CSE detector
- 45% of detections are detected with CSE, 52% are detected by Mask R-CNN, and the remaining by DSFD
- 10% of test images have no individual detected
- Market1501 anonymization with mAP/R1 metric calculated from all test examples
- Randomly generated images from Cityscapes, COCO, FDH, and FDF256 datasets