Synthetic and Manipulated Overhead Imagery For Forensics Research
Brandon B. May, Kirill Trapeznikov, Shengbang Fang, Matthew C. Stamm
Comprehensive Dataset of Synthetic and Manipulated Overhead Imagery For Development and Evaluation of Forensic Tools
https://arxiv.org/abs/2305.05784
@inproceedings{may2023comprehensive,
title={Comprehensive Dataset of Synthetic and Manipulated Overhead Imagery for Development and Evaluation of Forensic Tools},
author={May, Brandon B and Trapeznikov, Kirill and Fang, Shengbang and Stamm, Matthew},
booktitle={Proceedings of the 2023 ACM Workshop on Information Hiding and Multimedia Security},
pages={145--150},
year={2023}
}
Dataset Overview
Sources
Pristine Tile Providers
City Conditioning
Inpainting Category
Inpainting Category | Region Type | Manipulated 🔄 Pristine | Mask |
---|---|---|---|
Greenspace | Bezier | ||
Buildings | Bezier | ||
Greenspace | GraphCut | ||
Buildings | GraphCut |
Inpainting Size
Inpainting Area Size | Manipulated 🔄 Pristine | Mask |
---|---|---|
Extra Small | ||
Small | ||
Medium | ||
Large |
Fully Synthetic Categories
Dataset Stats
Train Set
The train set consists of 10,000 images in total:
- 4,964 pristine (unmanipulated)
- 2,539 fully synthetic
- 2,497 partially manipulated (with manipulated regions between 1/16 and 1/4 of the image area)
  - 1,260 in the buildings_roads manipulation class
  - 1,237 in the greenspace_water manipulation class
Each image has a corresponding binary mask image that indicates the manipulated region with the color white.
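The relationship between a mask and the manipulated area can be sketched as follows. This is a minimal illustration, not the authors' tooling: it assumes the mask has already been decoded into a 2D grid of 8-bit single-channel values where white is 255, and the helper name `manipulated_fraction` and the toy mask are hypothetical.

```python
def manipulated_fraction(mask, white=255):
    """Return the share of pixels equal to `white` in a 2D mask (list of rows).

    Assumption: white is encoded as 255; adjust `white` if the masks differ.
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask has no pixels")
    manipulated = sum(1 for row in mask for px in row if px == white)
    return manipulated / total


# Toy 4x4 mask (invented for illustration): a 2x2 white block covers 4 of 16 pixels.
toy_mask = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

fraction = manipulated_fraction(toy_mask)

# The train set description states regions between 1/16 and 1/4 of the image area.
in_train_range = 1 / 16 <= fraction <= 1 / 4
```

For real mask files, an image library would be needed to decode the PNG into pixel values before calling a helper like this one.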
Test Set
The test set consists of 3,150 images in total:
- 1,511 pristine (unmanipulated)
- 751 fully synthetic
- 888 partially manipulated (including 150 with manipulated regions < 1/16 of the image area)
  - 432 in the buildings_roads manipulation class
  - 456 in the greenspace_water manipulation class
Each image has a corresponding binary mask image that indicates the manipulated region with the color white.
Dataset Download
Metadata CSV Format
Each dataset is provided with a metadata.csv file which contains detailed information about each image. The following columns are particularly relevant for training and testing:
- image_filename: Relative path to the image, e.g. im0000.png
- mask_filename: Relative path to the corresponding binary mask, e.g. im0000_mask.png
- manipulated_fraction: Fraction of the image that has been manipulated, in the range [0, 1]
- manipulation_class:
  - buildings_roads
  - greenspace_water
  - Empty for pristine and fully synthetic images
- manipulation_type:
  - pristine: Unmanipulated
  - generated_uncond: Fully synthetic, no basemap used
  - generated_cond: Fully synthetic, generated basemap used
  - generated_real: Fully synthetic, real basemap used
  - inpainted_struct: Partially manipulated, buildings_roads
  - inpainted_unstruct: Partially manipulated, greenspace_water
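One way to read the metadata and collapse the six manipulation_type values into the three groups used in the dataset statistics (pristine, fully synthetic, partially manipulated) is sketched below. The column names and type values come from the list above. The sample rows, the `COARSE_LABEL` mapping, the `load_rows` helper, and the 0.0 and 1.0 fractions shown for pristine and fully synthetic rows are assumptions made for illustration only.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; a real metadata.csv may contain additional columns.
SAMPLE_CSV = """image_filename,mask_filename,manipulated_fraction,manipulation_class,manipulation_type
im0000.png,im0000_mask.png,0.0,,pristine
im0001.png,im0001_mask.png,1.0,,generated_uncond
im0002.png,im0002_mask.png,0.12,buildings_roads,inpainted_struct
im0003.png,im0003_mask.png,0.2,greenspace_water,inpainted_unstruct
"""

# Assumed grouping of manipulation_type values into three coarse labels.
COARSE_LABEL = {
    "pristine": "pristine",
    "generated_uncond": "fully_synthetic",
    "generated_cond": "fully_synthetic",
    "generated_real": "fully_synthetic",
    "inpainted_struct": "partially_manipulated",
    "inpainted_unstruct": "partially_manipulated",
}


def load_rows(handle):
    """Parse metadata rows, converting the fraction to float and adding a coarse label."""
    rows = []
    for row in csv.DictReader(handle):
        row["manipulated_fraction"] = float(row["manipulated_fraction"])
        row["coarse_label"] = COARSE_LABEL[row["manipulation_type"]]
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = Counter(row["coarse_label"] for row in rows)
```

To use this with an actual file, replace the `io.StringIO(SAMPLE_CSV)` argument with a handle from `open("metadata.csv", newline="")`.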
Additional Generation Modes
Partial Manipulations
Natural Disaster Simulation
Contact
For more information: Kirill Trapeznikov, kirill.trapeznikov@str.us; Brandon May, brandonbmay@gmail.com
Acknowledgment
This material is based upon work supported by DARPA under Contract No. HR0011-20-C-0129. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.
DISTRIBUTION A. Approved for public release: distribution unlimited.