Contributions from interested volunteers are a potentially powerful tool for scientific tasks, such as image classification, that are difficult for computers yet easy for humans. Volunteers can be motivated to take part in campaigns by “gamification,” which makes the work fun by dividing it into easily digestible “microtasks.” Such input may be particularly valuable for tasks such as land-cover mapping, for which automated classification schemes struggle to attain high levels of accuracy. Using volunteer labor effectively requires an understanding of the difficulty of the microtasks, how well individual volunteers perform the tasks assigned to them, and why they perform certain microtasks better than others. Previous work on this topic has assumed that individual tasks are relatively uniform in their difficulty. We investigate the impact of non-uniform difficulty and the extent to which local knowledge helps users perform geographical microtasks, with the goal of informing better game design.
We assessed different measures of work quality and task difficulty in a dataset from the “Cropland Capture” game, which contains over 4.5 million classifications of 165,000 images by about 2700 volunteers. The game has simple mechanics: users see an image, either a satellite image or a ground-based photograph, and are asked whether or not it contains cropland. If unsure, they can respond “maybe.” As a test of their consistency, volunteers were occasionally shown images that they had previously rated. To provide an external reference, 342 images from the game were validated by land cover classification experts.
While many methods assume that the majority vote yields the correct classification of an image, comparison of expert and volunteer classifications shows that, at least for identification of cropland, this is frequently untrue. Agreement with other volunteers and self-agreement consistently over-estimate user quality as compared with the gold standard of expert validations. Examination of image-specific rates of agreement with expert validations reveals that this problem is due to certain images that are extremely difficult for volunteers to classify correctly.
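The three quality measures compared above (agreement with the majority vote, self-agreement on repeated images, and agreement with expert validations) can be sketched on toy data as follows. This is a minimal illustration and not the analysis code used in the study: the record layout, the function names, and the sample values are all hypothetical, and the majority vote here simply includes each user's own vote.

```python
from collections import Counter, defaultdict

# Hypothetical records: (user, image, label); labels are "yes", "no" or "maybe".
ratings = [
    ("u1", "img1", "yes"), ("u2", "img1", "yes"), ("u3", "img1", "no"),
    ("u1", "img2", "yes"), ("u2", "img2", "yes"), ("u3", "img2", "yes"),
    ("u1", "img1", "yes"),  # repeat rating by u1 (consistency test)
    ("u3", "img1", "yes"),  # repeat rating by u3, differs from the first
]
# Hypothetical expert validations; img2 is a case where the majority is wrong.
expert = {"img1": "yes", "img2": "no"}


def first_ratings(records):
    """Keep only the first label each user gave to each image."""
    first = {}
    for user, image, label in records:
        first.setdefault((user, image), label)
    return first


def majority_labels(first):
    """Most common first label per image (ties go to the label seen first)."""
    votes = defaultdict(Counter)
    for (_, image), label in first.items():
        votes[image][label] += 1
    return {image: c.most_common(1)[0][0] for image, c in votes.items()}


def agreement(first, reference):
    """Per-user fraction of first labels matching a reference labelling."""
    hits, totals = Counter(), Counter()
    for (user, image), label in first.items():
        if image in reference:
            totals[user] += 1
            hits[user] += label == reference[image]
    return {user: hits[user] / totals[user] for user in totals}


def self_agreement(records):
    """Per-user fraction of repeat ratings that match the user's first label."""
    first, hits, totals = {}, Counter(), Counter()
    for user, image, label in records:
        key = (user, image)
        if key in first:
            totals[user] += 1
            hits[user] += label == first[key]
        else:
            first[key] = label
    return {user: hits[user] / totals[user] for user in totals}


first = first_ratings(ratings)
vs_majority = agreement(first, majority_labels(first))
vs_expert = agreement(first, expert)
consistency = self_agreement(ratings)
```

In this toy data, user `u1` agrees fully with the majority and is fully self-consistent, yet matches the expert label on only one of two images, which mirrors the over-estimation described above.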
Analysis of user background shows that users perform microtasks better on images that originate closer to their home, in agreement with widespread beliefs about geographical processes. However, the magnitude of this effect is tiny, suggesting that for remote sensing image classification, a user's geographical background may not be very important.
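One way to quantify how close an image is to a user's home is the great-circle distance between the two locations. The sketch below uses the standard haversine formula; the source does not state which distance measure was used, so the choice of formula, the function name, and the mean Earth radius of 6371 km are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

A distance computed this way for each classification could then be related to whether the classification matched the reference label, for example by comparing accuracy across distance bins.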
These results have been applied to the design of the new “Picture Pile” image classification game. This game incorporates direct training and feedback based on expert-validated images. We hope to soon have enough data to analyze how effective this new game is at training users and encouraging participation.
Last edited: 02 March 2016
International Institute for Applied Systems Analysis (IIASA)