
Reservoir Outflow Data Assimilation

shorvath-noaa edited this page Sep 18, 2024 · 8 revisions


Persistence

The module currently assimilates real-time data feeds for a select number of reservoirs from the U.S. Geological Survey (USGS) and the U.S. Army Corps of Engineers (USACE), which operate flow measurement gages at the reservoir outlets or immediately downstream of the reservoirs. For the USGS-persistence approach specifically, gages were selected if they lie within 5 km of a waterbody and the drainage area of the waterbody outlet is within 95% of the drainage area of the gage. This module replaces the normal Level Pool (LP) routing for any reservoir that is selected to run as a USGS-persistence or USACE-persistence reservoir object.

For use in the National Water Model (NWM), all persistence-type reservoirs utilize the persistence (P) method for both the Standard and Extended Analysis and Assimilation (AnA) cycles/simulations. In forecast simulations, for each reservoir that is designated to run P, the observed discharge is persisted for the full length of the forecast cycle. Based upon hydrological evaluations, each reservoir is designated to use P in the Short-Range Forecasts (18 hours) only, the Medium-Range Forecasts (10 days) only, or both types of forecasts. A reservoir runs LP in any forecast mode for which it is not designated to run P. The gage flow is persisted, but if the mass balance indicates that waterbody storage will be exceeded, the persisted flow is adjusted to allow for higher releases. In addition, if waterbody storage would reach zero or below, the available water received from inflow is used to alter the release (i.e., outflow = inflow).
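The storage checks described above can be illustrated with a minimal sketch of one timestep. The function name, its arguments, and the exact rule for raising the release when storage would be exceeded are assumptions made for illustration; they are not taken from the t-route source.

```python
def persisted_release(persisted_flow, inflow, storage, max_storage, dt):
    """Return (outflow, new_storage) for one timestep of persistence.

    Hypothetical helper: flows in m3/s, storage in m3, dt in seconds.
    """
    outflow = persisted_flow
    new_storage = storage + (inflow - outflow) * dt
    if new_storage > max_storage:
        # Assumed adjustment: release just enough extra water so that
        # storage ends the timestep exactly at capacity.
        outflow = inflow + (storage - max_storage) / dt
        new_storage = max_storage
    elif new_storage <= 0.0:
        # Storage would be exhausted: pass the inflow straight through
        # (outflow = inflow), leaving storage unchanged.
        outflow = inflow
        new_storage = storage
    return outflow, new_storage
```

In the normal case the persisted gage flow is returned unchanged; the two branches only take effect when the mass balance would push storage past either limit.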

RFC

General Purpose and Functionality

The purpose of this module is to ingest RFCs’ (River Forecast Centers) observed and forecasted reservoir releases into the Reservoir module in real-time. This utilizes the expertise that each RFC has for forecasting reservoirs in their domain. This module replaces the LP object for a particular reservoir that is selected to run as RFC Forecast.

Preprocessing RFC Observations and Forecasts

Each RFC will typically provide a time series feed of observations and forecasts for a particular set of reservoirs at least once per day. Relative to a T0 issue time, the observations usually extend up to 48 hours into the past and the forecasts up to 10 days into the future. A preprocessing program will ingest this data feed and combine each reservoir’s observations and forecasts into a contiguous time series array of at least 12 days, consisting of a minimum of 2 days of observation values and a minimum of 10 days of forecast values. The majority of the time series arrays will be 12 days long, though some RFCs may provide up to 3 days of observations or around 7 hours of forecasts beyond 10 days, and one RFC will provide up to 15 days of forecasts. A maximum of 10 days will be assimilated into t-route.

Most of the RFCs provide observed and forecasted discharges at a 6-hour temporal resolution, though three RFCs provide the reservoir discharges at an hourly temporal resolution. The preprocessing can be divided into two steps:

  1. RFC Observations: The preprocessing program will convert all observed reservoir discharges to an hourly temporal resolution using linear interpolation between 2 observed discharges or persisting an observed discharge backwards in time up to 48 hours if there is not an observed discharge prior to that point. If the gap between two observed discharges is greater than 48 hours, then the values would not be interpolated, and sentinel values of -999 would be inserted instead. However, this case should never happen because the maximum required length of the observed portion is 48 hours and if the T0 observed value is provided, it will be persisted up to 48 hours in reverse.
  2. RFC Forecasts: An observed discharge at T0 could be persisted up to 24 hours forward into the forecasted range to fill in missing forecast values. Forecasted discharges at a 6-hour temporal resolution will be converted to an hourly temporal resolution by only persisting the discharges forwards in time, without interpolating between values. The observed discharge at T0 or any forecast discharge could be persisted forward for the full forecast cycle.

If any location in the discharge array falls outside of the bounds of what can be persisted or interpolated for the allotted 2 days of observations and 10 days of forecasts, then a sentinel value of -999 would be used in its place. For example, if no observed discharges are provided for the previous 48 hours, but a T0 discharge and forecasted discharges are provided, the discharge at T0 could be persisted up to 48 hours in reverse. Any time series file with even one sentinel value will be skipped and will not be utilized by this module.

The preprocessing program creates a boolean array of equal size to each discharge array that indicates whether or not its corresponding discharge is a synthetic value. Any directly sampled discharge matching in time from the original raw data is not synthetic, and any interpolated or persisted discharge without a matching value in time from the original raw data is synthetic. Below are diagrams to demonstrate these preprocessing methods.

[Figure: RFC methods for filling in missing hourly array values]
[Figure: RFC persisting an observation to the past]
[Figure: RFC persisting forecast discharges forward]
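As a rough illustration of the fill rules and the synthetic flag, the sketch below builds an hourly array around T0 from sparse observed and forecast values. It is a simplified assumption rather than the actual preprocessing program: the function name and arguments are hypothetical, inputs are keyed by integer hour offsets from T0, and the limits on how far a value may be persisted are not enforced.

```python
SENTINEL = -999.0
MAX_GAP_HOURS = 48  # largest observation gap that is interpolated


def fill_hourly(obs, fcst, obs_hours=48, fcst_hours=240):
    """Return (values, synthetic) for offsets -obs_hours..fcst_hours.

    obs:  {hour offset <= 0: discharge}, offsets relative to T0
    fcst: {hour offset > 0: discharge}
    """
    values, synthetic = [], []
    obs_times = sorted(obs)
    for t in range(-obs_hours, 1):
        if t in obs:
            values.append(obs[t])
            synthetic.append(False)  # directly sampled value
            continue
        earlier = [s for s in obs_times if s < t]
        later = [s for s in obs_times if s > t]
        if earlier and later and later[0] - earlier[-1] <= MAX_GAP_HOURS:
            # Linear interpolation between the two bracketing observations.
            a, b = earlier[-1], later[0]
            q = obs[a] + (obs[b] - obs[a]) * (t - a) / (b - a)
        elif later and not earlier:
            # No prior observation: persist the next one backwards.
            q = obs[later[0]]
        else:
            # Assumption: anything else cannot be filled.
            q = SENTINEL
        values.append(q)
        synthetic.append(True)
    last = values[-1]  # value at T0
    for t in range(1, fcst_hours + 1):
        if t in fcst:
            last = fcst[t]
            synthetic.append(False)
        else:
            synthetic.append(True)  # persisted forward, no interpolation
        values.append(last)
    return values, synthetic
```

For example, with observations at T0-6 h and T0 and a single forecast at T0+6 h, the observed side is interpolated linearly while the forecast side holds the T0 value until the forecast time is reached.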

Finally, the preprocessing program creates a unique time series NetCDF file (e.g., 2020-02-26_12.60min.BSGA4.RFCTimeSeries.ncdf) with the RFC Reservoir ID and the T0 issue time in the filename for each reservoir. This program runs and produces new NetCDF files whenever an RFC uploads a new set of observations and forecasts.
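A filename of this shape could be assembled as in the sketch below. The pattern is inferred only from the single example above, and the function name and the `resolution_min` argument are hypothetical.

```python
from datetime import datetime


def rfc_timeseries_filename(issue_time, rfc_id, resolution_min=60):
    """Build a name like 2020-02-26_12.60min.BSGA4.RFCTimeSeries.ncdf.

    Assumed layout: <T0 date>_<T0 hour>.<resolution>min.<RFC ID>.RFCTimeSeries.ncdf
    """
    return (
        f"{issue_time:%Y-%m-%d_%H}.{resolution_min}min."
        f"{rfc_id}.RFCTimeSeries.ncdf"
    )


name = rfc_timeseries_filename(datetime(2020, 2, 26, 12), "BSGA4")
```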

t-route Processing of RFC Observations and Forecasts

At the initialization of each reservoir, the software will search for an RFC time series file (i.e., a time series of combined RFC observed and forecasted discharges) that corresponds to its RFC reservoir ID and a given time window. This time window is determined by the model start time and user-defined parameters in the configuration file. In the case of NWM operations, the time window depends on whether the simulation is an Extended Analysis and Assimilation (AnA), a Standard AnA, or a Short- or Medium-Range Forecast.

Since an Extended AnA is set to run 28 hours of model time, the operational NWM sets the model start time 28 hours in the past in order to produce the necessary restart file once per day. It starts at 12Z the previous day to produce a 16Z restart file for the 19Z Standard AnA of the current day. Since every RFC normally issues a new set of observations and forecasts at least once every 24 hours, the issue time of the RFC time series file would normally fall somewhere between the model start time of the Extended AnA and the current wall time. Therefore, the software starts searching for an RFC time series file with an issue time 28 hours after its model start time and then works backwards, checking each hourly issue time until it finds the most recent file. The software searches a maximum of 28 hours back from its first search time.

A Standard AnA is set to run 3 hours of model time, and the model start time is set 3 hours in the past. The Standard AnA would, therefore, start searching for an RFC time series file with an issue time 3 hours after its model start time (this offset is the rfc_timeslice_offset) and follow the same process, looking back 28 hours from that 3-hour offset. Since most of the RFCs upload observation and forecast data at least once per day, the T0 issue time would most often fall before the model start time for a Standard AnA, in which case forecasted values beyond T0 would be read and output during the Standard AnA. For example, if an RFC issues data at 0Z and a Standard AnA has a 9Z model start time, the NWM would retrieve and output the 9Z through 12Z forecast values.

Short- and Medium-Range Forecast simulations have a model start time near the current wall time. Therefore, these simulations start searching for the time series file with an issue time matching the model start time and work backwards up to 28 hours from there.
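The backward search can be sketched as follows, with the offset set to 28 hours for an Extended AnA, 3 hours for a Standard AnA, and 0 hours for a forecast simulation. This is an illustration under assumptions, not the t-route implementation: the function names are hypothetical, the filename pattern is inferred from the example filename, and a plain set of names stands in for a directory listing.

```python
from datetime import datetime, timedelta


def candidate_issue_times(model_start, offset_hours, lookback_hours=28):
    """Issue times to try, newest first, stepping back one hour at a time."""
    first = model_start + timedelta(hours=offset_hours)
    return [first - timedelta(hours=h) for h in range(lookback_hours + 1)]


def find_latest_file(model_start, offset_hours, rfc_id, available):
    """Return the most recent matching filename, or None if none is found."""
    for t in candidate_issue_times(model_start, offset_hours):
        name = f"{t:%Y-%m-%d_%H}.60min.{rfc_id}.RFCTimeSeries.ncdf"
        if name in available:
            return name
    return None


# Extended AnA starting at 12Z on 2020-02-25: the search begins at 16Z on
# 2020-02-26 and steps back hourly until it reaches the 12Z issue time.
files = {"2020-02-26_12.60min.BSGA4.RFCTimeSeries.ncdf"}
found = find_latest_file(datetime(2020, 2, 25, 12), 28, "BSGA4", files)
```

Returning `None` here corresponds to the case in which no file is found within the 28-hour window.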

LP would instead be set to run for the entire simulation if any of the following occurs: a reservoir’s corresponding time series file is not found within the 28-hour time window, all discharges in the RFC time series NetCDF file are synthetic, or any observed or forecasted discharge is negative (which includes the sentinel value of -999). A warning would be output stating that the reservoir will use LP instead and indicating which of the above cases occurred. If the discharge array is acceptable, then the reservoir will output the appropriate values at the matching times for each timestep.
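These acceptance checks can be expressed compactly. The sketch below uses a hypothetical function name and message strings; it returns the reason for falling back to LP, or `None` when the RFC series is usable.

```python
def lp_fallback_reason(discharges, synthetic):
    """Return why LP should be used, or None if the RFC data is acceptable.

    discharges: list of hourly discharges, or None if no file was found
    synthetic:  list of booleans of the same length as discharges
    """
    if discharges is None:
        return "no time series file found within the search window"
    if all(synthetic):
        return "all discharges are synthetic"
    if any(q < 0 for q in discharges):
        # A negative test also catches the -999 sentinel value.
        return "negative or sentinel discharge present"
    return None
```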

The configuration file contains an optional parameter (rfc_forecast_persist) that holds an integer value for the maximum number of days that an RFC-supplied forecast will be used/persisted in a simulation. The reservoir will assimilate and output RFC forecast values from the simulation start time until the end of this number of days. If the end of the assimilated time series array is reached before the end of this number of days, then the last value of the array will be persisted up through this number of days. The current value for every RFC-type reservoir is 11 days in every configuration of the operational NWM. Since no configuration of the operational NWM that uses RFC-type reservoirs runs longer than 10 days, this 11-day limit will not be reached.
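A minimal sketch of this lookup is shown below, assuming an hourly series whose first element is aligned with the simulation start time. The function name is hypothetical, and returning `None` past the limit is a placeholder, since the behaviour beyond the persistence window is not described here.

```python
def rfc_value_at(hour, series, rfc_forecast_persist=11):
    """Return the RFC discharge for a simulation hour, or None past the limit.

    hour:   whole hours since simulation start (0-based)
    series: hourly RFC discharges aligned to the simulation start
    """
    if hour >= rfc_forecast_persist * 24:
        return None  # beyond the persistence window (placeholder behaviour)
    if hour < len(series):
        return series[hour]
    # Series exhausted before the limit: persist its last value.
    return series[-1]
```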

Great Lakes

The LP method has been found to be insufficient for calculating outflows from the Great Lakes due to their size and complexity. A novel DA method utilizing USGS and Canadian gages, forecasts from the International Lake Ontario-St. Lawrence River Board, and climatological values has been implemented in t-route to provide outflows for Lakes Superior, Michigan-Huron (treated as a single waterbody), Erie, and Ontario. The following observational sources are used:

  • Lake Superior uses USGS gage 04127885
  • Lake Michigan-Huron uses USGS gage 04159130
  • Lake Erie uses Canadian gage 02HA013
  • Lake Ontario uses forecasted flows from the International Lake Ontario-St. Lawrence River Board

The DA method is similar to Persistence in that any good observation will be persisted for a period of time (11 days by default) if no new observations are available. The DA module checks for new observations every hour of simulation time. When the persistence time has been exceeded, or there are simply no valid observations to assimilate, monthly climatological outflows (obtained from USACE) will be used.
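The fallback logic can be sketched as below. The function name and arguments are hypothetical, and the climatology values in the usage lines are placeholders rather than real USACE data.

```python
from datetime import datetime, timedelta


def great_lakes_outflow(now, last_obs_time, last_obs_value,
                        monthly_climatology, persist_days=11):
    """Persist the last good observation, else use monthly climatology.

    monthly_climatology: {month number 1-12: outflow}
    last_obs_time is None when no valid observation has been seen.
    """
    if last_obs_time is not None:
        if now - last_obs_time <= timedelta(days=persist_days):
            return last_obs_value
    return monthly_climatology[now.month]


# Placeholder climatology: the same made-up value for every month.
clim = {m: 100.0 for m in range(1, 13)}
q_recent = great_lakes_outflow(datetime(2024, 9, 18), datetime(2024, 9, 10), 250.0, clim)
q_stale = great_lakes_outflow(datetime(2024, 9, 18), datetime(2024, 9, 1), 250.0, clim)
```

Here `q_recent` keeps the 8-day-old observation, while `q_stale` falls back to the climatological value because the observation is 17 days old.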
