Population estimation beyond counts: Inferring demographic characteristics.

Noée Szarka, Filip Biljecki
Author Information
  1. Noée Szarka: School of GeoSciences, University of Edinburgh, Edinburgh, United Kingdom.
  2. Filip Biljecki: Department of Architecture, National University of Singapore, Singapore, Singapore.

Abstract

Mapping population distribution at a fine spatial scale is essential for urban studies and planning. Numerous studies, mainly supported by geospatial and statistical methods, have focused primarily on predicting population counts. However, estimating socio-economic characteristics of the population beyond counts, such as average age, income, and gender ratio, remains largely unaddressed. We enhance traditional population estimation by predicting not only the number of residents in an area, but also their demographic characteristics: average age and the proportion of seniors. By implementing and comparing different machine learning techniques (Random Forest, Support Vector Machines, and Linear Regression) in administrative areas in Singapore, we investigate the use of point of interest (POI) and real estate data for this purpose. The developed regression model predicts the average age of residents in a neighbourhood with a mean error of about 1.5 years (the average resident age across Singaporean districts spans a range of approximately 14 years). The results reveal that age patterns of residents can be predicted from real estate information rather than from amenities, in contrast to population counts. Another contribution of our work is the use of POI and real estate datasets previously unexploited in population estimation, such as property transactions, year of construction, and flat types (number of rooms). Advancing the domain of population estimation, this study reveals the prospects of a small set of detailed and strong predictors that may also be able to estimate other demographic characteristics such as income.
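
To put the reported accuracy in context, the sketch below relates a mean error to the range of the quantity being predicted. It is an illustration only, not the authors' evaluation code: it assumes that "mean error" means mean absolute error, and the observed and predicted ages are hypothetical values chosen so that the result matches the figures quoted in the abstract (1.5 years against a range of 14 years). The function names are likewise hypothetical.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def error_relative_to_range(error, value_range):
    """Express an error as a fraction of the range of the target variable."""
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    return error / value_range


# Hypothetical average resident ages for five areas (NOT data from the paper).
observed = [34.0, 38.5, 41.0, 44.5, 48.0]
predicted = [35.5, 37.0, 42.5, 43.0, 49.5]

mae = mean_absolute_error(observed, predicted)
relative = error_relative_to_range(mae, max(observed) - min(observed))

# The same ratio computed directly from the abstract's figures.
abstract_relative = error_relative_to_range(1.5, 14.0)

print(f"MAE: {mae:.2f} years")
print(f"Share of range: {relative:.1%}")
```

Under these assumptions, an error of 1.5 years corresponds to roughly 11% of the 14-year spread in average age across districts.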

MeSH Term

Income
Residence Characteristics
Singapore
