Using volunteered geographic information (VGI) in design-based statistical inference for area estimation and accuracy assessment of land cover

Stehman, Stephen V., Fonte, Cidália C., Foody, Giles M. and See, Linda (2018) Using volunteered geographic information (VGI) in design-based statistical inference for area estimation and accuracy assessment of land cover. Remote Sensing of Environment, 212. pp. 47-59. ISSN 0034-4257

Full text not available from this repository.


Volunteered Geographic Information (VGI) offers a potentially inexpensive source of reference data for estimating area and assessing map accuracy in the context of remote sensing-based land-cover monitoring. The quality of observations from VGI and the typical lack of an underlying probability sampling design raise concerns regarding use of VGI in widely applied design-based statistical inference. This article focuses on the fundamental issue of the sampling design used to acquire VGI. Design-based inference requires the sample data to be obtained via a probability sampling design. Options for incorporating VGI within design-based inference include: 1) directing volunteers to obtain data for locations selected by a probability sampling design; 2) treating VGI data as a “certainty stratum” and augmenting the VGI with data obtained from a probability sample; and 3) using VGI to create an auxiliary variable that is then used in a model-assisted estimator to reduce the standard error of an estimate produced from a probability sample. The latter two options can be implemented using VGI data that were obtained from a non-probability sampling design, but require additional sample data to be acquired via a probability sampling design. If the only data available are VGI obtained from a non-probability sample, properties of design-based inference that are ensured by probability sampling must be replaced by assumptions that may be difficult to verify. For example, pseudo-estimation weights can be constructed that mimic weights used in stratified sampling estimators. However, accuracy and area estimates produced using these pseudo-weights still require the VGI data to be representative of the full population, a property known as “external validity”. Because design-based inference requires a probability sampling design, directing volunteers to locations specified by a probability sampling design is the most straightforward option for use of VGI in design-based inference. The certainty stratum and model-assisted approaches, which combine VGI from a non-probability sample with data from a probability sample, are viable alternatives that meet the conditions required for design-based inference and use the VGI data to advantage to reduce standard errors.
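
As an illustration of option 2, the certainty stratum idea can be sketched as a two-stratum estimator of a class's area proportion. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the VGI labels are taken as correct reference data, that the non-VGI units were sampled by simple random sampling without replacement, and that all units have equal area. The function name and its parameters are hypothetical.

```python
import math


def certainty_stratum_proportion(vgi_labels, sample_labels, n_population, target):
    """Estimate the proportion of units in class `target` and its standard error.

    Assumptions (illustrative, not taken from the article):
      * every VGI unit is treated as observed with certainty (a census stratum),
      * `sample_labels` is a simple random sample without replacement drawn
        from the n_population - len(vgi_labels) units not covered by VGI,
      * all units represent equal area.
    """
    n_vgi = len(vgi_labels)
    n_rest = n_population - n_vgi      # size of the non-VGI stratum
    n = len(sample_labels)             # probability sample size
    if n < 2 or n > n_rest:
        raise ValueError("need 2 <= sample size <= size of the non-VGI stratum")

    # Stratum weights: the share of the population in each stratum.
    w_vgi = n_vgi / n_population
    w_rest = n_rest / n_population

    # The certainty stratum is fully observed, so its proportion has no sampling error.
    p_vgi = sum(1 for x in vgi_labels if x == target) / n_vgi if n_vgi else 0.0
    p_rest = sum(1 for x in sample_labels if x == target) / n

    estimate = w_vgi * p_vgi + w_rest * p_rest

    # Only the sampled stratum contributes variance (with finite population correction).
    s2 = p_rest * (1.0 - p_rest) * n / (n - 1)
    variance = w_rest ** 2 * (1.0 - n / n_rest) * s2 / n
    return estimate, math.sqrt(variance)


# Hypothetical data: 100 VGI units and a sample of 50 from the other 900 units.
vgi = ["forest"] * 30 + ["other"] * 70
sample = ["forest"] * 20 + ["other"] * 30
estimate, std_error = certainty_stratum_proportion(vgi, sample, 1000, "forest")
print(estimate, std_error)
```

Because the VGI stratum contributes no sampling variance in this sketch, the standard error shrinks as VGI covers a larger share of the population. Any bias from mislabeled VGI observations is not captured by the standard error.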

Item Type: Article
Keywords: Probability sampling; External validity; Pseudo-weights; Data quality; Model-based inference; Volunteered geographic information (VGI); Crowdsourcing
Schools/Departments: University of Nottingham, UK > Faculty of Social Sciences > School of Geography
Identification Number:
Depositing User: Eprints, Support
Date Deposited: 27 Apr 2018 12:13
Last Modified: 04 May 2020 19:44
