Google Is Using AI to Fill a Flood Risk Data Gap

Flash floods, in which stormwater pools and rises rapidly within just a few hours of a storm's onset, are among the more dangerous hazards of a warming planet prone to heavier rainfall. They are also notoriously difficult to predict. But research Google released on Thursday shows how artificial intelligence could unlock better forecasts and help communities prepare.
Google researchers used Gemini, the tech giant’s signature AI agent, to process millions of news articles from around the world about past floods and extract data on when and where the deluges occurred. After assembling this vast new dataset — the largest of its kind to date — they used it to train a flood prediction model that uses local, hourly meteorological data to produce 24-hour forecasts for urban flash floods in more than 150 countries.
The dataset, which Google has named Groundsource, is free for anyone to download and use, and the forecasts are now live on Google’s Flood Hub, an online portal that also predicts river-related flood events. The tool is somewhat crude — it simply indicates whether there is a medium or high likelihood of a flash flood occurring in the next 24 hours in a given area. It only covers urban areas, and it doesn’t tell you how severe the flood could be. The resolution is also pretty coarse, indicating risks at the scale of a city rather than a street or neighborhood.
Still, the researchers said the forecasts would be useful for alerting authorities to potential risks.
“People have been very interested, even at that level of granularity,” Gila Loike, a product manager at Google Research, told reporters in a press conference this week.
According to Google, a regional disaster authority in Southern Africa caught a flash flood alert while the tool was still in beta, confirmed the flood on the ground, and then deployed a humanitarian worker to oversee the response. “We’re still in the early days of seeing the impact of Groundsource, but that chain of events from a prediction in Flood Hub to boots on the ground is exactly what Flood Hub was built for,” Juliet Rothenberg, the product director for Google’s crisis resilience work, said.
One of the key reasons it’s so hard to predict flash floods is the lack of historical data. We have decent models for “riverine” flooding, when rivers overflow, because physical gauges in rivers around the world have recorded water levels for decades, but there’s no equivalent for city streets.
News articles present a largely untapped source to fill this gap. The challenge is that the key bits of information, such as where and when the flood occurred, are buried in narrative texts and expressed in wildly inconsistent formats. It would take human experts untold hours and resources to wade through each one and record the data in a standardized manner. An AI agent such as Gemini, however, can do it much faster.
Google’s research team started out by crawling the web for news articles describing flood events going back to the year 2000, gathering an initial pool of more than 9 million stories from around the world. After stripping out ads, menus, and other boilerplate and translating the articles that were in other languages into English, they fed them to Gemini.
“You are a meticulous flood event analyst,” the researchers told the AI agent. The rest of the elaborate prompt is included in a non-peer-reviewed preprint paper detailing the group’s methods for producing the dataset. In essence, they prompted Gemini to take a sentence such as “Main Street flooded on Tuesday,” and work out where, exactly, this Main Street was located, and which Tuesday the article was referring to.
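One concrete subtask buried in that “which Tuesday” question is pinning a relative weekday mention to a calendar date using the article’s publication date. Here is a minimal sketch of how that step could work; the function and its assumption that news reports describe recent rather than upcoming events are illustrative, not details from Google’s paper:

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def resolve_weekday(weekday_name: str, published: date) -> date:
    """Map a relative weekday mention (e.g. "Tuesday") to the most recent
    matching calendar date on or before the article's publication date,
    assuming news reports describe recent events."""
    target = WEEKDAYS.index(weekday_name.lower())
    # Days to step back from the publication date to reach that weekday.
    offset = (published.weekday() - target) % 7
    return published - timedelta(days=offset)

# An article published Friday, May 17, 2024 saying "Main Street flooded
# on Tuesday" most plausibly means Tuesday, May 14, 2024.
print(resolve_weekday("Tuesday", date(2024, 5, 17)))
```

At scale, an extraction pipeline would also have to handle phrases like “last weekend” or “overnight Monday,” which is part of why the researchers lean on a large language model rather than hand-written rules.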
The resulting dataset contains 2.6 million historical flood events across more than 150 countries. For comparison, the next largest public dataset, the National Oceanic and Atmospheric Administration’s Storm Events database, contains about 2 million storm events from 1950 to the present, only about 230,000 of which are flood events. The biggest global dataset, the United Nations Office for Disaster Risk Reduction’s DesInventar system, contains 500,000 events, only a fraction of which are records of floods. It’s also restricted to participating nations and inconsistently updated.
“Oftentimes, the first question our researchers will ask when we talk about going into a new domain within crisis resilience is, what data do you have? How many data entries do you have?” Rothenberg said. “That’s what really unlocks the ability to make breakthroughs here.”
Humberto Vergara, an assistant professor of civil and environmental engineering at the University of Iowa who studies flash floods, agreed that the lack of flood observation data has been a significant obstacle for the field. He told me the Groundsource dataset will “definitely be of great interest” and that there is “definitely great need for things like this.” Using news reports to fill out the global picture of flooding is something researchers have been thinking about doing for a while, he added.
While Vergara was cautiously optimistic the data would be useful, he was quick to note that it would take additional efforts to validate. His lab is working on its own dataset based on satellite estimates of rainfall that could be used to prove out Google’s records, he said.
The Google team has already taken some steps to validate Groundsource, cross-checking it against manual annotations of the news reports as well as against other existing databases. They found that about 82% of the events were labeled with the correct location and timeframe. “From a research perspective, using an 82% accurate dataset is actually acceptable,” Loike said. “A well-trained model can smooth out the inconsistencies and thereby learn the dominant patterns while ignoring the 18% of labeling errors.”
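As a toy illustration of how such a cross-check could be scored, suppose each event has been reduced to a (location, date) pair; that representation is an assumption for the sketch, not a detail from the preprints:

```python
def label_accuracy(extracted, reference):
    """Fraction of extracted events whose (location, date) pair exactly
    matches a manually annotated reference event. A simplified proxy for
    the kind of cross-check Google describes; real matching would need
    fuzzier rules for place names and date windows."""
    ref = set(reference)
    hits = sum(1 for event in extracted if event in ref)
    return hits / len(extracted)

# Hypothetical example: 4 of 5 extracted events match the annotations.
extracted = [("jakarta", "2021-02-20"), ("lagos", "2020-07-01"),
             ("miami", "2019-09-03"), ("chennai", "2015-12-01"),
             ("porto alegre", "2024-05-03")]
reference = [("jakarta", "2021-02-20"), ("lagos", "2020-07-01"),
             ("miami", "2019-09-03"), ("chennai", "2015-11-30"),
             ("porto alegre", "2024-05-03")]
print(label_accuracy(extracted, reference))  # 0.8
```

Exact-match scoring like this understates true accuracy whenever the extractor is off by a day or uses a different spelling of the same place, which is why validation against independent records, such as the satellite rainfall estimates Vergara mentions, matters.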
They also validated the Flood Hub predictions by comparing its U.S. outputs to flood and flash flood warnings produced by the National Weather Service. “Achieving performance metrics comparable to such a sophisticated, instrumentation-rich framework demonstrates how AI can bridge the warning gap in underserved regions that lack equivalent infrastructure,” the researchers wrote in a second non-peer-reviewed preprint describing the model development.
Part of the reason Vergara was cautious in praising the effort is that predicting flash floods is challenging for reasons beyond the lack of historical data. “Most of the driving force is rainfall,” he said. “Everybody in the community knows that predicting rainfall is extremely difficult. The best models out there cannot predict rainfall with the accuracy that is needed for flash floods with more than one or two hours of lead time.”
The utility of Google’s Flood Hub depends on who will be consuming the information, he said. It’s probably not high-resolution enough to be useful for emergency responders, but there might be agencies at the city or regional level that can use it as a situational awareness tool.
Rothenberg, of Google, is optimistic that this same method can produce useful predictions for other kinds of extreme events.
“Applying this methodology to flash flood reports is just the beginning,” Rothenberg told reporters at the press conference. “We think there’s an immense opportunity in thinking about how we could use publicly available information to help predict heat waves or landslides, for example — other events that are hard to predict because the data hasn’t been centralized or it doesn’t exist.”
