Open Data for Algorithms: Mapping Poverty in Belize Using Open Satellite Derived Features and Machine Learning
Mapping the spatial distribution of poverty and incomes within a country remains a challenge. Several recently proposed methods incorporate features from satellite imagery to improve model performance (Babenko et al., 2017, Poverty mapping using convolutional neural networks trained on high and medium resolution satellite images, with an application in Mexico. arXiv preprint arXiv:1711.06323) or to supplant small area estimation methods (Jean et al., 2016, Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–94. doi:10.1126/science.aaf7894; Engstrom et al., 2017, Poverty from space: Using high-resolution satellite imagery for estimating economic well-being). However, these methods require high-spatial-resolution imagery, whose cost and infrequent acquisition may render these advances impractical for most applications. We investigate how small area estimates of average income may improve when incorporating features derived from Sentinel-2 and MODIS imagery. Both satellites provide free imagery with global coverage and a frequent revisit rate. We estimate a poverty map for Belize that incorporates spatial and time series features derived from these sensors, with and without survey-derived variables. We document an 8% improvement in model performance when including these satellite features. We conclude by arguing that Open Data for Development should include open data pipelines where possible.
Jonathan Hersh, Ryan Engstrom & Michael Mann (2021) Open data for algorithms: mapping poverty in Belize using open satellite derived features and machine learning, Information Technology for Development, 27:2, 263-292, https://doi.org/10.1080/02681102.2020.1811945
Taylor & Francis
This is an Accepted Manuscript of an article published in Information Technology for Development, volume 27, issue 2, in 2021, available online at https://doi.org/10.1080/02681102.2020.1811945. It may differ slightly from the final version of record.