DOI: https://doi.org/10.15368/theses.2021.140
Available at: https://digitalcommons.calpoly.edu/theses/2434
Date of Award
9-2021
Degree Name
MS in Industrial Engineering
Department/Program
Industrial and Manufacturing Engineering
College
College of Engineering
Advisor
Tali Freed
Advisor Department
Industrial and Manufacturing Engineering
Advisor College
College of Engineering
Abstract
According to the World Bank's 2020 poverty estimates, 9.1% to 9.4% of the global population lived on less than $1.90 per day. The Covid-19 pandemic is estimated to have aggravated the issue by pushing more than 1% of the global population below the international poverty line of $1.90 per day (WorldBank, 2020). To provide help and formulate effective measures, poverty needs to be located as precisely as possible. To this end, this thesis investigated whether regression methods applied to aggregated remote-sensing data can be used to estimate poverty in Africa. Five distinct regression frameworks were compared by their R2 value and their mean absolute relative percentage error when estimating poverty from aggregated remote-sensing data in continental Africa. A total of 12 regression models were developed for the poverty rates at three daily income levels ($1.90, $3.20, and $5.50); they can be divided into direct models, two-step models, and ensemble models. Ensemble methods performed better than simpler models, with an R2 value of 0.74 for the ensemble neural net and 0.80 for the ensemble xgboost model. The best-performing one-step model was kernel ridge regression with an R2 of 0.72, while the remaining frameworks of this type all performed worse. Bayesian ridge regression models consistently performed the worst of the frameworks under investigation. Model estimates were most stable at the daily income levels of $1.90 and $3.20, which can be explained by the increasingly skewed distribution of target values at higher poverty thresholds. Overall, xgboost, kernel ridge regression, and artificial neural networks performed better than the other models.
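The two comparison metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions of the coefficient of determination (R2) and of the mean absolute percentage error; the abstract does not state the exact formula used in the thesis, so the function names and the sample values below are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_abs_pct_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |observed - predicted| / |observed|, in percent.

    Assumes every observed value is non-zero.
    """
    relative_errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical poverty rates (share of population below a threshold);
# these numbers are made up for illustration and are not thesis data.
observed_rates = [0.2, 0.4, 0.6, 0.8]
predicted_rates = [0.25, 0.35, 0.65, 0.75]

print(f"R2:   {r_squared(observed_rates, predicted_rates):.3f}")
print(f"MAPE: {mean_abs_pct_error(observed_rates, predicted_rates):.2f}%")
```

Because the percentage error divides by the observed value, regions with very low observed poverty rates contribute large relative errors under this definition, so the two metrics can rank models differently.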