Generating vague geographic information through data mining of passive web data

Brindley, Paul (2016) Generating vague geographic information through data mining of passive web data. PhD thesis, University of Nottingham.

PDF (Thesis - as examined, 23MB) - Repository staff only


Vagueness is an inherent property of geographic data. This thesis develops a geocomputational method that demonstrates that vague information has the potential to be incorporated within GIS in a straightforward manner. This method applies vagueness to the elements of place: types, names and spatial boundaries, generating vague geographic objects by extracting and filtering the differing opinions and perceptions held within web-derived data. The aim of the research is threefold: (1) to investigate an approach to automatically generate vague, probabilistic geographical information concerning place by mining differing perspectives from passive web data; (2) to assure the quality of the vague information produced and test the hypothesis that its results are indistinguishable from directly surveying public opinion; and (3) to demonstrate the value of integrating vague information into geospatial applications via examples of its use.

To achieve the first aim, the thesis develops methods to extract differing perspectives of place from web data - constructing (i) vague place type settlement classification and (ii) vague place names and boundaries for ‘neighbourhood’ level units. The methods developed are automated, suitable for generating output at a national scale and use a wide range of different source data to collect the differing opinions.
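
As a purely illustrative sketch of what a vague, probabilistic place boundary could look like in practice (not the method developed in the thesis), the Python below assigns each grid cell a membership share for every place name mentioned there. The function name `vague_membership`, the `(name, lon, lat)` input format and the grid cell size are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def vague_membership(mentions, cell_size=0.01):
    """Hypothetical helper: turn geolocated place-name mentions into
    per-cell membership shares between 0 and 1.

    mentions: iterable of (place_name, lon, lat) tuples (assumed format).
    Returns {place_name: {(cell_x, cell_y): share}}.
    """
    # Count how often each name is used within each grid cell.
    counts = defaultdict(Counter)
    for name, lon, lat in mentions:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        counts[cell][name] += 1

    # A name's share in a cell is its fraction of all mentions in that cell,
    # so contested cells receive partial membership rather than a hard edge.
    membership = defaultdict(dict)
    for cell, names in counts.items():
        total = sum(names.values())
        for name, n in names.items():
            membership[name][cell] = n / total
    return dict(membership)
```

Cells where sources disagree on the name end up with shares below 1 for each competing name, which is one simple way to represent a boundary that is a matter of opinion rather than a crisp line.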

The second aim assesses the quality of the data produced, determining whether output extracted from the web is representative of that obtained by asking people directly. Statistical analysis of regression models demonstrates that the web-derived data were representative of directly surveyed opinion, both for vague settlement classifications and for vague urban locale boundaries. Importantly, the validation data, drawn from public opinion, also supported the notion that vagueness was omnipresent within geographic information concerning place.

The third aim was addressed through case studies that demonstrate the added value of such data and the subsequent integration of vague geographic objects with other socio-economic data. Critically, the incorporation of vagueness within place models not only adds value to geographic data but also improves the accuracy of real-world representations within GIS.

Item Type: Thesis (University of Nottingham only) (PhD)
Supervisors: Wilson, Max L.
Goulding, James
Keywords: data mining, geographic information systems, gis, passive web data, vagueness
Subjects: Q Science > QA Mathematics > QA 75 Electronic computers. Computer science
Faculties/Schools: UK Campuses > Faculty of Science > School of Computer Science
Item ID: 33722
Depositing User: Brindley, Paul
Date Deposited: 20 Jul 2016 13:44
Last Modified: 08 May 2020 08:05
