HydroForecast outperforms the competition

Proven to be the most accurate streamflow forecast no matter how you slice it
Marshall Moutenot
Mar 23, 2025

The last decade has been one of hydropower transformation: on one side, the grid demands more from hydropower's flexibility; on the other, extreme weather challenges operations and safety. In response, CEATI’s Hydropower Operations and Planning Interest Group (HOPIG) organized the 2021 Streamflow Forecast Rodeo, a year-long forecast competition to assess new forecasting technologies.

HOPIG's members come from key hydropower organizations, including Hydro-Québec (H-Q), the Tennessee Valley Authority (TVA), the U.S. Bureau of Reclamation (USBR), and Southern Company. Their interest in testing AI was sparked by the potential for machine learning to improve the accuracy and efficiency of streamflow forecasts.

When the chance came to demonstrate HydroForecast’s accuracy against traditional forecasting methods, we took it. In the end, our model outperformed the others in 23 of 25 categories across all regions spanning the US and parts of Canada. 

The competition reinforced, both for us and for the CEATI HOPIG member utilities, that AI can be a powerful tool for hydropower operations and water management. But it also revealed that not all AI models are created equal: other participants’ AI models performed worse than traditional methods. Our theory-guided approach is what continues to set us apart in the field.

Competition overview

Forecast Rodeo table

Race course: 19 forecast points across North America
Distance: All parties submitted forecasts every morning for 1 year
Competitors:
  • Internal forecasting teams at major utilities
  • Government forecasts (River Forecast Centers, National Water Model)
  • Forecast vendors

The hydrological forecasting technologies ranged from traditional physically based (or “conceptual”) models to off-the-shelf AI models and our large hydrology AI approach. Forecasts were issued at 19 sites across North America for lead times of 1 to 10 days, and performance was measured using 916 metrics across locations, seasons, and flow conditions, providing a rigorous, real-world assessment. To ensure accuracy and fairness, the results were independently verified by RTI International.

The hundreds of metrics were compiled into five categories, evaluated in each region:

  1. All-arounder = overall score for all metrics
  2. Flood forecaster = score for all metrics when river is at high flow levels
  3. Quick draw = overall score for all metrics in the early part of the forecast window
  4. Eagle eye = overall score for all metrics in the late part of the forecast window
  5. Straight shooter = fewest assumptions
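
To make the category idea concrete, here is a minimal Python sketch of how many per-metric results could be rolled up into category scores. It is not the Rodeo's actual scoring method, which this post does not describe: the `MetricResult` fields, the skill scale, the lead-day cutoffs (days 1 to 3 for "early", days 8 to 10 for "late"), and the model names are all assumptions made for illustration. Straight shooter is left out because its definition is not detailed enough here to sketch.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricResult:
    """One evaluated metric for one model (all fields are hypothetical)."""
    model: str
    lead_day: int    # forecast lead time in days, 1 to 10
    high_flow: bool  # True if the metric was computed on high-flow periods
    skill: float     # higher is better; the scale is illustrative only


# Hypothetical category filters; the lead-day cutoffs are assumptions.
CATEGORIES = {
    "all_arounder": lambda r: True,
    "flood_forecaster": lambda r: r.high_flow,
    "quick_draw": lambda r: r.lead_day <= 3,
    "eagle_eye": lambda r: r.lead_day >= 8,
}


def category_scores(results):
    """Return {category: {model: mean skill}} over the matching metrics."""
    scores = {}
    for name, keep in CATEGORIES.items():
        by_model = {}
        for r in results:
            if keep(r):
                by_model.setdefault(r.model, []).append(r.skill)
        scores[name] = {m: mean(v) for m, v in by_model.items()}
    return scores


def winners(results):
    """Return the top-scoring model in each category that has any metrics."""
    return {
        cat: max(per_model, key=per_model.get)
        for cat, per_model in category_scores(results).items()
        if per_model
    }


# Made-up example values, not competition results.
results = [
    MetricResult("model_a", 1, False, 0.90),
    MetricResult("model_a", 9, True, 0.60),
    MetricResult("model_b", 1, False, 0.80),
    MetricResult("model_b", 9, True, 0.75),
]
print(winners(results))
```

In this toy data, `model_a` is better at short lead times while `model_b` is better at long lead times and high flows, so the category winners differ. That is the point of scoring several categories rather than one overall number.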

Results: HydroForecast named the all-around winner

HydroForecast claimed the all-around win, consistently outperforming other models across the evaluated metrics.

Why did HydroForecast perform so well?

For us, the results of the competition didn’t end with the leaderboard. We wanted to understand exactly why we outperformed the other models. 

We explored performance by model type: were there clear patterns in which models performed well under particular hydrologic regimes or conditions? The image below shows that the theory-guided statistical model, HydroForecast, had the highest performance across all regions, whereas other approaches struggled in the West and Southeast. We also observed that the conceptual models outperformed the pure statistical models.

In the West in particular, where snow processes dominate the hydrologic regime, HydroForecast outperformed the next-best model by 41% during the snowmelt season (Jan 1 - June 30). This reduction in error is especially valuable for operators managing this critical water supply period.
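
For readers who want to run the same kind of comparison on their own forecasts, a relative error reduction can be computed as sketched below. This assumes the percentage is read as the drop in an error metric relative to a baseline model. The post does not say which error metric the Rodeo used, and the function name and example numbers are illustrative, not competition data.

```python
def error_reduction_pct(baseline_error: float, candidate_error: float) -> float:
    """Percent reduction in error of a candidate model relative to a baseline.

    Both errors must use the same lower-is-better metric (for example, a mean
    absolute error in the same units over the same period).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Made-up example: a baseline error of 100 units against a candidate error of 59.
print(error_reduction_pct(100.0, 59.0))  # 41.0
```

A negative result means the candidate's error is larger than the baseline's.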

Our theory-guided approach drives accuracy

HydroForecast stands out because it uses a theory-guided machine learning approach, combining the best of AI with deep meteorology and hydrology expertise. Unlike traditional machine learning models, which rely purely on data patterns, HydroForecast integrates the advantages of physical modeling, the power of big data, and expert knowledge:

  • We train one large foundational model across many hundreds of basins representing a vast range of hydrologic conditions and patterns.
  • Our model is trained directly on observations and weather inputs, and we work with meteorology and hydrology experts to select the right inputs.
  • We build on physical modeling to refine training, evaluation, and improvements.

To put it simply: HydroForecast is uniquely able to predict extremes outside a single basin’s historical record because its foundational base model is trained on large datasets from diverse sources and then tuned to each specific site.

Ongoing improvements

So what’s changed since the competition concluded in 2022? Our theory-guided approach is what led us to be named the most accurate on the market, but it also means that HydroForecast is continuously learning and improving. The latest generation, HydroForecast Short-term 3 (ST3), has improved overall accuracy, reliability, and interpretability. Because HydroForecast outperformed the competition two versions ago, every new iteration earns us the title of most accurate streamflow forecasting solution all over again.

As we continue to deploy in basins all over the world, the real-world results speak for themselves: we’re consistently providing the most accurate forecasts to integrate with our customers’ workflows.

Reach out to our team to learn how HydroForecast’s accuracy can help you optimize operations and planning.