Psychology Faculty Articles and Research

Crowdsourcing Image Extraction and Annotation: Software Development and Case Study

Ana Jofre, SUNY Polytechnic
Vincent Berardi, Chapman University
Kathleen P.J. Brennan, University of Queensland
Aisha Cornejo
Carl Bennett
John Harlan

Document Type

Article

Publication Date

2020

Abstract

We describe the development of web-based software that facilitates large-scale, crowdsourced image extraction and annotation within image-heavy corpora that are of interest to the digital humanities. An application of this software is then detailed and evaluated through a case study in which it was deployed within Amazon Mechanical Turk to extract and annotate faces from the archives of Time magazine. Annotation labels included categories such as age, gender, and race that were subsequently used to train machine learning models. The systemization of our crowdsourced data collection and worker quality verification procedures is detailed within this case study. We outline a data verification methodology that used validation images and required only two annotations per image to produce high-fidelity data comparable to that produced by methods using five annotations per image. Finally, we provide instructions for customizing our software to meet the needs of other studies, with the goal of offering this resource to researchers undertaking the analysis of objects within other image-heavy archives.

Comments

This article was originally published in Digital Humanities Quarterly, volume 14, issue 2, in 2020.

Recommended Citation

Jofre, A., Berardi, V., Brennan, K. P. J., Cornejo, A., Bennett, C., & Harlan, J. (2020). Crowdsourcing image extraction and annotation: Software development and case study. Digital Humanities Quarterly, 14(2). http://www.digitalhumanities.org/dhq/vol/14/2/000469/000469.html

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Included in

Digital Humanities Commons, Software Engineering Commons

Chapman University Digital Commons
