February 2022

Data for Decision-Making

Water and Sanitation in Low-Resource Settings

BACKGROUND

Sustainable access to safe and equitable water, sanitation, and hygiene (WASH) is a basic human need that remains unmet in numerous locations. Nations around the world strive to close this gap, at present under the banner of the United Nations Sustainable Development Goal 6. Achieving universal, adequate, accessible, and equitable WASH coverage requires the concerted efforts of professionals from national and local governments, development agencies, civil society organizations, private companies, and research institutions, as well as citizens and community organizers.

A further imperative to advance WASH development emerged during a global crisis, the COVID-19 pandemic. The accompanying global shutdown, which limited traditional, field-based WASH data collection and monitoring, has focused increasing attention on emerging data sources and analytics that could be better leveraged to support WASH improvement efforts. Over the long run, actions to consolidate WASH information resources, reduce one-time use of datasets, and leverage a broader range of data sources (including those previously considered unrelated) will have powerful implications. Accompanying advances in artificial intelligence (AI) analysis methods will also increase capabilities for learning and responding to critical WASH needs.

The goal of this report is to consolidate knowledge about how WASH stakeholders view emerging trends in data science. It represents a planning effort to align data science advances with the most pressing WASH needs and demands. Analyzing how various professionals contribute to or could use data science illustrates points of potential engagement that could lead to clearer partnerships and reduce duplicative or ineffective efforts. In cases of severe data paucity, data science activities could be prioritized to offer a baseline for movement toward better-informed decision-making.

APPROACH

More than 65 decision-makers were invited to participate in this research, representing a broad cross-section of WASH organizations. Researchers administered semi-structured interview questions during phone or video calls between March and June 2020. The interview guide included both general questions for all interviewees and specific questions regarding predetermined data science “use case” hypotheses, tailored as applicable to the decision-makers’ professional organizations or roles. Common information needs reported across decision-makers and their organizations were then clustered by topic. Researchers pooled information from multiple interviews as well as related literature to assess and define the characteristics of nine specific data science use cases spanning water, sanitation, communities, programming, finance, and health.

SUMMARY OF USE CASES

Nine use cases were described in depth as pertinent topical examples of using data science to address WASH needs, with complete detail in Annex 1 of this report. These included:

1. Forecasting groundwater quality and quantity — Groundwater supplies are critical to meeting water demand, yet data on their quantity and quality remain hard to come by. Platforms that encourage data access and sharing across political boundaries would help to predict and forestall water supply shortcomings.

2. Reducing non-revenue water (NRW) — Treated water is lost at a high rate in many locations due to both natural and social causes, reducing compensation to water suppliers and straining environmental resources. Addressing this issue through technologies such as remote sensing and telemetry sensors could enhance water service efficiency.

3. Coordinating fecal sludge emptying — Pit latrine and septic tank emptying often takes place ad hoc, leaving pits overflowing, homeowners frustrated, and service providers without work. Coordinating these services using a central application and sensor-equipped vacuum trucks could better align the needs of workers, customers, and regulators.

4. Understanding sanitation costs — Sanitation planning at a city level often introduces excess complexity and ignores the hidden costs of fecal sludge treatment and disposal. Considering the entire sanitation value chain, newer costing applications could use local pricing information to optimize a blend of appropriate options.

5. Anticipating waterborne disease outbreaks — Retrospective disease surveillance leaves little response time for public health managers to plan or modify prevention and mitigation efforts. Risk mapping and forecasting tools might use algorithms to put decision-makers a step ahead.

6. Interpolating household data — Achievement of global WASH goals relies on household-level access, but descriptive household data are time-consuming to gather and not uniformly available. Advanced data interpolation techniques could be applied to use fewer survey points to generate high-resolution maps and summary statistics.

7. Understanding local contexts through community classification — Tailoring WASH interventions to local community context is both critical to successful programming and notoriously challenging at large scales. Leveraging and combining existing data offers a powerful means to better customize intervention planning.

8. Targeting the poor and vulnerable — Using a single indicator such as annual gross income to qualify households for WASH subsidies may seriously misjudge poverty levels and creditworthiness. Brief, multi-question “smart” surveys offer a pathway to more accurately target financial support using alternative wealth indicators.

9. Evaluating impacts — WASH monitoring and evaluation often surfaces negative findings only at or near the end of projects, when it is too late to respond. Improved, real-time processing of interim or passive data could yield valuable insights to guide investments and clarify success factors.
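
As a rough illustration of use case 6 (interpolating household data), the sketch below estimates a WASH indicator at an unsurveyed location from a few surveyed points using inverse distance weighting. This is one simple interpolation technique chosen for illustration, not necessarily the method described in the report's annex. The function name, coordinates, and coverage values are all hypothetical.

```python
import math

def idw_estimate(points, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    `points` is a list of (x, y, value) survey observations.
    Nearer observations receive larger weights (1 / distance**power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a surveyed point
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Made-up survey points: (x_km, y_km, share of households with basic water access)
survey = [(0.0, 0.0, 0.80), (4.0, 0.0, 0.40), (0.0, 4.0, 0.60)]

# Estimate coverage midway between the first two survey points.
estimate = idw_estimate(survey, (2.0, 0.0))
print(round(estimate, 3))
```

In practice, an approach like this would need validation against held-out survey points before its maps or summary statistics were used to guide decisions.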

RECOMMENDATIONS

Anticipate frequent, albeit often indirect, data use for decision-making. Some professionals disregard the role of data in decision-making simply because it is ancillary or poorly documented. Research shows most data use builds incrementally on existing knowledge and is considered by decision-makers alongside other factors such as social values, costs, and competing interests. In some cases, limited data leads to inaction. Regular engagement between systems and people that produce data and those that use data would ensure its relevance and salience during decision-making windows, whether expected or unexpected. 

Normalize sharing to improve the cost-effectiveness of data production. Field collection of primary data may involve extensive startup, implementation, and data processing costs. Standards for data quality assurance and platforms for data sharing are becoming more common, and professional ethics around WASH data sharing are moving toward openness, even in the case of not-quite-successful initiatives. Embracing the power of large datasets for learning about past performance and projecting future performance can aid WASH intervention design at local, national, and international scales. 

Use advances in automated data recording and analytics to vastly assist, but not replace, human decision-making efforts. Social media, commercial, and satellite data collection platforms have made “big” datasets available, while AI modeling techniques have made it possible to rapidly detect trends that were previously unobservable. Harnessing these tools to support WASH decision-making offers a way to increase the breadth, resolution, and spatial extent of understanding about how individuals and environments interact with WASH services. In turn, building software applications that work in concert with human behavior (e.g., via record-keeping, reminders, or triggers) would ease the time and technical skills otherwise needed to reach conclusions. Still, these products are prone to inherent bias and must be considered through an equity lens within responsible human-guided research and operation practices. 

Expand capabilities for reframing the timing of evaluation. Analyzing historical data offers reflective learning value, but limited insight into proactive, forward-looking management. Preventive and adaptive WASH management strategies offer numerous benefits for environmental conservation, cost-efficiency, and public health, as well as improved professional satisfaction and customer service. Moving toward real-time and predictive capabilities would give decision-makers greater lead time to address rapidly emerging issues and pivot when implementation strategies need to be adjusted. For example, algorithms built using historical datasets can be used to communicate real-time WASH service performance and predict failure probability in advance. 
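
To make the idea of predicting failure probability from historical data more concrete, here is a minimal sketch that computes the observed share of non-functional water points by age band. It is an empirical frequency table rather than a trained predictive model, and the record format, function name, and inspection data are assumptions invented for illustration.

```python
from collections import defaultdict

def failure_rate_by_band(records, band_years=5):
    """Share of inspected water points found non-functional, grouped by age band.

    `records` is a list of (age_years, failed) pairs from past inspections.
    Returns a dict mapping the lower bound of each age band to its failure share.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for age_years, failed in records:
        band = int(age_years // band_years) * band_years
        totals[band] += 1
        failures[band] += int(failed)
    return {band: failures[band] / totals[band] for band in sorted(totals)}

# Made-up inspection history: (age of water point in years, found non-functional?)
history = [
    (1, False), (3, False), (4, True),
    (6, False), (7, True), (8, True), (9, False),
    (12, True),
]

rates = failure_rate_by_band(history)
for band, rate in rates.items():
    print(f"{band}-{band + 4} years: {rate:.2f}")
```

A table like this could flag which age bands deserve preventive maintenance first. A real deployment would likely draw on more predictors and far more records than this toy example.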

Embrace the crucial role of data science in WASH development. At a basic level, WASH development is about providing safe, sustainable, and affordable services to all. Data science approaches for similar development goals (e.g., economy, food, energy, conservation) are already finding their way into the daily lives of WASH professionals and stakeholders. Consciously building on these experiences and advances will ensure WASH development benefits equally from technological progress.


 
