How HydroForecast works

A look at the AI-driven model that powers our forecasts
The HydroForecast Team
Mar 31, 2025

With HydroForecast, we've created a groundbreaking large artificial intelligence (AI) hydrology model that learns the physical relationships governing streamflow, at a global scale and level of complexity that only AI makes possible. This approach produces the best performing forecasts available, giving customers the insights to improve business outcomes, from risk mitigation and reduced costs to improved safety preparedness and increased operational efficiency.

By explaining how our model works, what the product can do, and how we use customer data, we hope to empower you to maximize HydroForecast’s potential for your needs.

“Upstream Tech’s theory-guided [AI] approach in creating HydroForecast is a testament to innovative thinking to solve complicated problems and responsibly generate clean electricity for millions of Americans.”
- LeRoy Coleman, Director of Communications, NHA, speaking when HydroForecast won the 2022 Outstanding Stewards of America’s Waters Award

Comparing methodologies

Building on traditional physics-based approaches, AI is the latest technological step change, offering new ways to address forecasting challenges and improve performance. HydroForecast is a first-of-its-kind large AI model trained directly on real-world streamflow observations. We implemented theory-guided AI by designing a hydrologic model that combines the benefits of traditional physics-based streamflow forecasting with those of statistical machine learning models.

The importance of our hybrid approach is evident when you look at the limitations of both traditional physics-based forecasting and standard, off-the-shelf AI models.

Traditional approach to forecasting

Physics-based streamflow modeling (sometimes also called conceptual modeling) requires precipitation and temperature forecasts, topography, and in situ measurements, among other variables. Traditionally, all of these are specific to the basin of interest.

Limitations:

  • Difficult to scale across locations because it relies on basin-specific data for calibration (and re-calibration)
  • Can’t add or experiment with new inputs like satellite imagery
  • Requires expensive and time-consuming recalibration

Not all AI is equal

Take an off-the-shelf AI model and train it on a single location to predict streamflow from weather and you might expect strong results, but use caution. This approach will be severely constrained by the historical record and inflexible to changing conditions in the basin. The first time an extreme weather event or drought comes along outside the historical record, or changes occur in the basin that affect the hydrologic response, the model will fail, potentially catastrophically.

Limitations:

  • Limited training data limits what the model can output, leaving it unable to correctly predict extreme events.
  • ‘Black box’ model makes it hard to understand what is driving the output predictions.
  • Need to start from scratch when training a model for a new basin.

A foundational, theory-guided AI hydrology model: HydroForecast

Under the umbrella of AI, HydroForecast is a theory-guided machine learning model: specifically, a type of neural network known as a Long Short-Term Memory (LSTM) network. That means we go beyond traditional AI models by combining physical input variables that are known to have a direct relationship with streamflow with the compute power, scalability, and adaptability provided by machine learning. LSTMs are able to identify long-term relationships, such as how snowpack translates to spring runoff once temperatures warm up. Paired with physically realistic inputs, they are the perfect tool for ‘learning’ the hydrology in your specific basin.

Benefits:

  • Physics-informed input features.
  • Learns across a diversity of basins to adapt to nonstationarity and extreme events outside of the historical record.
  • Fast, scalable deployment using global datasets.
  • Works at any location in the world, including both gauged and ungauged basins.

Our approach was put to the test in a forecasting competition, and it won. HydroForecast outperformed all other entrants, from physical models to other AI models, across all competition regions. While other AI models struggled to adapt to complex hydrology and shifting climate patterns, HydroForecast’s hybrid design proved more resilient and accurate.

We refer to our model as AI, but more specifically, the application of AI we use is known as machine learning (ML). Read more about neural networks, LSTMs, and related topics in our series Decoding AI & Hydrology for Water Management Decisions.

High-level comparison of different approaches
Analogy: Cooking with substitutions and forecasting with theory-guided AI

Implementing a tried-and-true physical model in hydrology is like mastering a recipe—you can make it in your kitchen from memory without changing the ingredients. However, when applying it in different environments, like a chef adjusting for local or seasonal ingredient availability, you need a more flexible approach. 

Our theory-guided foundational model is analogous to the intuition of a trained chef. The chef can recreate a recipe with the original flavor profile while allowing for substitutions because they understand which ingredients and methods go together well to produce a great dish. Similarly, we fine-tune our model with local observations to ensure more accurate results for each specific basin.

How HydroForecast works

Start with the best available data inputs

HydroForecast combines dynamic global meteorological data, satellite surface observations, and static geospatial data sets like land class type and slope, alongside local in situ measurements. Where there are high-resolution datasets available that cover specific regions, we work to integrate those as well (HRRR and SNODAS over North America, for example). Highly granular data drives model accuracy, and we are continuously evaluating and integrating new datasets to feed into HydroForecast. The ability to process and surface large datasets efficiently is crucial for the scalability of HydroForecast, and making this data more accessible is also a key part of our vision.

Get more information on the latest model inputs.

Create a strong foundational model with theory-guided AI

The idea behind theory-guided AI is to design a model that can take advantage of the benefits from both physical and statistical approaches. We do this by selecting input features based on known physical relationships, and then implementing an LSTM model, chosen for its suitability for non-linear, long-term forecasting.
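To make the LSTM idea a little more concrete, here is a minimal sketch of a single-unit LSTM cell in plain Python. It is not HydroForecast's implementation: the weights, the names (lstm_step, run_cell), and the single scalar input are all illustrative assumptions. It only shows how the cell state can carry information from an early input (think of a snowfall event) across many later time steps.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell with a scalar input x."""
    f = sigmoid(w["wf_x"] * x + w["wf_h"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi_x"] * x + w["wi_h"] * h_prev + w["bi"])    # input gate
    g = math.tanh(w["wg_x"] * x + w["wg_h"] * h_prev + w["bg"])  # candidate value
    o = sigmoid(w["wo_x"] * x + w["wo_h"] * h_prev + w["bo"])    # output gate
    c = f * c_prev + i * g   # cell state: the long-term "memory"
    h = o * math.tanh(c)     # hidden state: what the cell exposes
    return h, c


def run_cell(inputs, w):
    """Run the cell over a sequence and return the cell state at each step."""
    h, c = 0.0, 0.0
    states = []
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
        states.append(c)
    return states


# Hypothetical weights chosen by hand for illustration, not learned values.
remembering = {
    "wf_x": 0.0, "wf_h": 0.0, "bf": 5.0,   # forget gate near 1: keep memory
    "wi_x": 4.0, "wi_h": 0.0, "bi": -2.0,
    "wg_x": 2.0, "wg_h": 0.0, "bg": 0.0,
    "wo_x": 0.0, "wo_h": 0.0, "bo": 0.0,
}
forgetful = dict(remembering, bf=-5.0)      # forget gate near 0: drop memory

# One input pulse followed by 30 quiet steps.
pulse = [1.0] + [0.0] * 30
remembering_states = run_cell(pulse, remembering)
forgetful_states = run_cell(pulse, forgetful)
```

With the forget gate held near one, most of the initial pulse is still present in the cell state 30 steps later; with it held near zero, the pulse is gone after a step or two. In a trained model these gate values are learned from data rather than set by hand.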

Our foundational model is trained across many hundreds of diverse basins, representing different geographies, climates, and hydrologic systems. It has been exposed to both extremely wet years and extremely dry years during training. This means that our forecasts aren’t limited to a single basin’s historical record, but can draw on similar weather events that may have occurred in a basin thousands of miles away.
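One way to picture training across many basins is to pool every basin's records into a single set of training samples, attaching each basin's static attributes to its daily weather inputs so one model can tell basins apart. The sketch below assumes that layout; the Basin class, the field names, and the sample values are hypothetical and are not taken from HydroForecast.

```python
from dataclasses import dataclass


@dataclass
class Basin:
    name: str
    static: list      # e.g. [mean_slope, forest_fraction] (hypothetical)
    forcings: list    # one [precip_mm, temp_c] pair per day (hypothetical)
    flow: list        # observed streamflow per day


def build_samples(basins, window):
    """Pool all basins into (input sequence, target flow) training pairs.

    Each input step is that day's forcings with the basin's static
    attributes appended; the target is the flow on the window's last day.
    """
    samples = []
    for b in basins:
        for end in range(window, len(b.flow) + 1):
            seq = [day + b.static for day in b.forcings[end - window:end]]
            samples.append((seq, b.flow[end - 1]))
    return samples


# Made-up example data for two basins.
basins = [
    Basin("alpine", [0.30, 0.10],
          [[0.0, -2.0], [5.0, 1.0], [2.0, 4.0], [0.0, 6.0], [1.0, 7.0]],
          [1.0, 1.2, 2.5, 3.1, 2.8]),
    Basin("lowland", [0.02, 0.60],
          [[10.0, 15.0], [0.0, 16.0], [3.0, 14.0], [0.0, 17.0]],
          [8.0, 6.5, 6.9, 5.7]),
]
samples = build_samples(basins, window=3)
```

Because every sample carries its basin's static attributes, a single model trained on the pooled set can learn how the same weather produces different flows in different kinds of basins.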

This foundational model is what allows us to create accurate forecasts in varied geographies and regions, and even produce accurate forecasts where there are no in situ observations. The model infrastructure can handle dynamic changes within a basin without recalibration, processing data and building ‘memories’ and relationships in ways that physically based models cannot. When extreme or ‘new’ weather events occur in a particular region, the model is able to say, “Even though we are in California, I’ve seen these conditions before in Colorado! Let’s adjust the flow impact from that event to this specific basin in California.”

Site-specific tuning for each customer

We gather input from customers to ensure that our model accurately reflects the reality on the ground and captures customer-specific needs. Because LSTMs excel at recognizing the complex interactions between model inputs, the foundational model is able to learn, for that specific basin, how the local observations relate to the global variables used to build the foundational model. We tune the foundational model for each customer with their private data to produce a model that's specific to them and their basins. This results in the highest possible accuracy while enabling foundational model traceability: we protect customers' data and they are assured that it's being used for their forecasts alone.
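The tuning step can be sketched as starting from a copy of shared pretrained parameters and taking a few gradient steps on one customer's local observations, leaving the shared parameters untouched. The example below is a deliberately simplified stand-in: it uses a small linear model instead of an LSTM, and the function names, learning rate, and data are all assumptions made for illustration.

```python
def predict(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mse(weights, bias, xs, ys):
    errors = [(predict(weights, bias, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def fine_tune(base_weights, base_bias, xs, ys, lr=0.05, steps=200):
    """Gradient descent on local data, starting from the base parameters.

    Works on a copy, so the shared base parameters are never modified.
    """
    weights, bias = list(base_weights), base_bias
    n = len(ys)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += 2 * err * xj / n
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical shared ("foundational") parameters and one site's local data.
base_weights, base_bias = [2.0, 1.0], 0.0
local_x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
local_y = [3.5, 1.5, 4.5, 7.5]

error_before = mse(base_weights, base_bias, local_x, local_y)
site_weights, site_bias = fine_tune(base_weights, base_bias, local_x, local_y)
error_after = mse(site_weights, site_bias, local_x, local_y)
```

Keeping the tuned parameters separate from the base parameters mirrors the point made above: each customer's data shapes only that customer's model.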

Depending on basin size and observation collection locations, HydroForecast decomposes basins into multiple sub-basins. This allows the model to be tuned for each sub-basin to local weather conditions and to flow where observations are available. During this process, we work with customers to understand any routing nuances between sub-basins, as this impacts how water is directed across sub-basin lines and ultimately routed to the forecast location.
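As a rough illustration of why routing between sub-basins matters, the sketch below combines sub-basin flows at a forecast location using a fixed travel-time lag per sub-basin. This lag-and-sum scheme is an assumption chosen for simplicity; it is not a description of how HydroForecast routes water, and the names and numbers are hypothetical.

```python
def route_to_outlet(sub_basin_flows, lag_days):
    """Sum sub-basin flows at the outlet, delaying each by its travel time.

    sub_basin_flows: dict of sub-basin name -> list of daily flows
    lag_days:        dict of sub-basin name -> whole days to reach the outlet
    """
    n = max(len(flows) for flows in sub_basin_flows.values())
    total = [0.0] * n
    for name, flows in sub_basin_flows.items():
        lag = lag_days[name]
        for t, q in enumerate(flows):
            if t + lag < n:          # water arriving after the window is dropped
                total[t + lag] += q
    return total


# Made-up flows: the upper sub-basin takes one day to reach the outlet.
flows = {"upper": [10.0, 20.0, 30.0, 0.0], "lower": [5.0, 5.0, 5.0, 5.0]}
lags = {"upper": 1, "lower": 0}
outlet = route_to_outlet(flows, lags)   # [5.0, 15.0, 25.0, 35.0]
```

Getting the lag wrong by even a day shifts the timing of the combined peak, which is one reason local knowledge of how sub-basins connect is valuable.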

The result: accurate forecasts customized at the basin level

HydroForecast provides customers with a probabilistic forecast for a short-term or seasonal horizon. The forecasts are tuned to their sites and come with a strong foundational model that has learned the underlying relationships governing streamflow, allowing our model to be deployed anywhere in the world.
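A probabilistic forecast is often summarized as a median with a surrounding band. The sketch below assumes the forecast for one time step is available as a set of possible flow values and reduces it to a lower bound, a median, and an upper bound using linearly interpolated quantiles. The ensemble representation, the function names, and the 10% and 90% levels are illustrative assumptions, not HydroForecast's output format.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def forecast_band(members, lower=0.1, upper=0.9):
    """Summarize possible flows as (lower bound, median, upper bound)."""
    vals = sorted(members)
    return quantile(vals, lower), quantile(vals, 0.5), quantile(vals, upper)


# Hypothetical flow values for a single forecast day.
members = [150.0, 100.0, 200.0, 120.0, 180.0, 110.0,
           190.0, 130.0, 170.0, 140.0, 160.0]
low, median, high = forecast_band(members)
```

Here roughly 80% of the possible flows fall between the lower and upper bounds, which is the kind of range an operator can plan around rather than relying on a single number.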

Our powerful features allow you to compare HydroForecast outputs with industry benchmarks like NOAA, analyze predictions with a range of confidence levels, review past forecasts, customize notifications to stay ahead of critical events, and much more.

Drive better business outcomes with HydroForecast’s industry-leading approach

Leveraging AI not only increases speed and accuracy, but it also allows us to continuously improve by integrating key datasets as they become available. HydroForecast provides actionable insights to help you make more confident and proactive water management decisions.

Subscribe to our newsletter to stay up-to-date on the latest improvements.