July 2024

Smartphone App for Identifying Poor Households in Ghana

Authored by Vincent Auffray, Chloé Poulin, Karen Setty, Anna Murray, Caroline Delaire

Access to safe drinking water is still low among the poorest households in sub-Saharan Africa. Water subsidies can enhance access to safe water services but often fail to reach the poor. Aquaya developed a simple Android smartphone application to quickly identify households living in poverty. We used machine learning (feature reduction and neural network models) to reduce an existing 47-question proxy-means test (PMT) to a 14-question survey that is faster to administer. This new survey approached the performance of the 47-question version and outperformed the currently used Poverty Probability Index PMT.

Low-income households in low- and middle-income countries lack access to necessary public services such as safely managed drinking water. Reducing prices for the lowest-income customers can address these inequities while maintaining the service’s value and affordability. Still, subsidy programs often fail to achieve these objectives in practice. Household income over time is difficult to verify, so discounts may go to customers who can afford to pay rather than to those most in need. The method used to qualify households for subsidies must accurately assess relative poverty, be accepted within the community, and be feasible to use on a large scale.

A common approach to identifying poor households (subsidy candidates) is called proxy-means testing. It relies on a questionnaire about household or individual characteristics correlated with welfare and income (such as vehicle ownership) to estimate poverty levels. Types of proxy-means tests include the Demographic and Health Survey (DHS) wealth index, Equity Tool, and the Poverty Probability Index.

In a 2022 publication, Poulin et al. introduced a machine-learning method to determine whether a household is below Ghana’s poverty line. This approach used machine learning to identify patterns in the Ghana Living Standards Survey dataset and select 47 appropriate survey questions out of more than 600 options for the proxy-means test. The original approach was, however, difficult to apply during household visits because the questionnaire was lengthy and the SuperLearner model could only run on a computer. Turning it into a practical tool required reducing the number of questions and developing a smartphone application to compute model results offline in real time.

Methods for Identifying Poor Households

Aquaya developed a new model in Python using the TensorFlow Lite library. This library is compatible with Android Studio, unlike the SuperLearner package in R used in the original model (Poulin et al., 2022). We then incorporated the new model into a mobile app using Android Studio, which supports machine learning, and the open-source sample survey app Jetsurvey.
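To make the idea of an on-device classifier more concrete, the sketch below shows how a small feed-forward network could turn 14 encoded survey answers into a poverty probability. It is written in plain Python rather than TensorFlow Lite so that it runs without dependencies. The layer size, the random placeholder weights, the 0.5 threshold, and all function names are hypothetical; the architecture and trained parameters of the actual model are not described here.

```python
import math
import random

N_QUESTIONS = 14   # length of the reduced survey (from the text)
HIDDEN_UNITS = 4   # hypothetical hidden-layer size


def make_placeholder_weights(seed=0):
    """Return random placeholder weights; a real model would load trained ones."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(N_QUESTIONS)]
          for _ in range(HIDDEN_UNITS)]
    b1 = [0.0] * HIDDEN_UNITS
    w2 = [rng.uniform(-1, 1) for _ in range(HIDDEN_UNITS)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict_probability(answers, weights):
    """Forward pass: one ReLU hidden layer followed by a sigmoid output."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    w1, b1, w2, b2 = weights
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, answers)) + bias)
        for row, bias in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def classify(answers, weights, threshold=0.5):
    """Map the probability to the two labels the app reports."""
    probability = predict_probability(answers, weights)
    return "poor" if probability >= threshold else "non-poor"
```

With answers encoded as numbers between 0 and 1, `classify(answers, make_placeholder_weights())` returns either "poor" or "non-poor". Because the weights are random placeholders, the output carries no real meaning; only the structure of the computation is illustrated.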

Model Acceptability

Community-based targeting is another widely trusted approach for accurately identifying poor households through consultation with large groups of community members. When applied to a test population of 818 households in the Ahafo and Ashanti regions (Poulin et al., 2022), the TensorFlow proxy-means test identified 32% as poor but missed an additional 6% who were designated as poor during a community-based targeting exercise (represented by the six yellow cells in the figure below).

[Figure: Visual aid depicting the procedure for employing the test, from an analysis of an app that helps identify households living in poverty.]
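The percentages reported for the 818 test households can be converted into approximate household counts. The calculation below is a back-of-the-envelope sketch that assumes both percentages are shares of all 818 households; the rounded counts are illustrative and are not figures reported in the study.

```python
TOTAL_HOUSEHOLDS = 818    # test population (from the text)
PMT_POOR_SHARE = 0.32     # identified as poor by the proxy-means test
MISSED_SHARE = 0.06       # poor per community-based targeting, missed by the test


def approx_count(share, total=TOTAL_HOUSEHOLDS):
    """Convert a share of the test population to a rounded household count."""
    return round(share * total)


pmt_poor = approx_count(PMT_POOR_SHARE)   # about 262 households
missed = approx_count(MISSED_SHARE)       # about 49 households
print(f"Flagged as poor by the test: ~{pmt_poor}")
print(f"Missed by the test: ~{missed}")
```

Under this assumption, roughly 49 households would have been overlooked if the proxy-means test had been used alone.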

Operation

The app can be used by anyone who can operate an Android smartphone. To determine a household’s poverty status, the user fills in a household identification number and responds to 19 questions (14 questions for the proxy-means test and five for the community-based vulnerability criteria). The app then presents the result of the household’s poverty status (“poor” or “non-poor”) and an explanation of the status. Results can be saved as a .csv file and shared directly from the phone.
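The workflow described above can be sketched as follows. The rule for combining the two parts of the survey is not specified here, so this sketch assumes a household is reported as poor if either the proxy-means test classifies it as poor or at least one of the five vulnerability criteria applies. The function names, field names, and CSV layout are likewise hypothetical.

```python
import csv
import io

N_VULNERABILITY_CRITERIA = 5  # community-based criteria (from the text)
FIELDNAMES = ["household_id", "status", "explanation"]  # hypothetical layout


def assess_household(household_id, pmt_says_poor, vulnerability_flags):
    """Combine the test result and vulnerability criteria (assumed OR rule)."""
    if len(vulnerability_flags) != N_VULNERABILITY_CRITERIA:
        raise ValueError("expected five vulnerability answers")
    criteria_met = sum(bool(flag) for flag in vulnerability_flags)
    if pmt_says_poor:
        status = "poor"
        explanation = "classified as poor by the proxy-means test"
    elif criteria_met > 0:
        status = "poor"
        explanation = f"meets {criteria_met} vulnerability criteria"
    else:
        status = "non-poor"
        explanation = "not flagged by the test or the vulnerability criteria"
    return {
        "household_id": household_id,
        "status": status,
        "explanation": explanation,
    }


def to_csv(results):
    """Serialize assessment results as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


results = [
    assess_household("HH-001", False, [False, True, False, False, False]),
    assess_household("HH-002", False, [False] * 5),
]
print(to_csv(results))
```

Treating the vulnerability criteria as an override is one way to implement a safety net against households the model misses; an adopter could just as well require both signals or weight them differently.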

Conclusions

Proxy-means tests can help identify households living in poverty and better target pro-poor programming to ensure equitable environmental health outcomes. While machine learning represents a substantive advance in developing predictive models to identify vulnerable households, these models should be applied responsibly, alongside knowledge from the local context, to avoid unintended consequences. This may mean using more than one selection approach as a safety net, although this choice is largely at the discretion of the technology adopter. The technology can be adapted to support governments, donors, or service providers in numerous settings.

Could this be the best approach to identify households living in poverty?

