Level 5

Peter Isaac edited this page Jun 1, 2022 · 3 revisions

Level 5 - Gap filling of fluxes

Overview

Level 5, or L5 for short, is the stage where PyFluxPro fills gaps in the turbulent fluxes. Gaps in the meteorological drivers are filled at the previous level, L4, and partitioning of the gap filled net ecosystem exchange (NEE) is done at the next level, L6. As with the other processing levels, options for the gap filling process are specified in a control file that can be edited in the PyFluxPro GUI. Templates for L5 control files are in the PyFluxPro/controlfiles/templates/L5 folder.

Three methods for gap filling fluxes are available at L5:

  1. Marginal Distribution Sampling (MDS, GapFillUsingMDS) - this implements the MDS method described in Reichstein et al (2005) using the C code used in the FluxNet processing system ONEFlux.
  2. SOLO Neural Network for short gaps (GapFillUsingSOLO) - this implements the SOLO neural network described in Hsu et al (2000) optimised for filling short (less than ~14 to 30 days) gaps in turbulent fluxes.
  3. SOLO Neural Network for long gaps (GapFillLongSOLO) - this implements the SOLO neural network described in Hsu et al (2000) optimised for filling long (more than 30 days) gaps in turbulent fluxes.

An important consideration for gap filling the turbulent fluxes is filtering out those times when there is insufficient turbulence in the layer between the eddy covariance (EC) instruments and the surface to ensure the EC measurements are representative of surface exchange. It is recommended that u* filtering of the data be applied before gap filling, see the following section.

Before we leave the overview of gap filling, it is worth restating Isaac's First Law (gap filling version):

  • It is far easier to collect good data from the start than to gap fill bad data.

Background

u* Filtering

Eddy covariance (EC) measurements are only representative of surface exchange when there is sufficient turbulence in the layer between the surface and the EC instruments to keep that layer well mixed. When this layer becomes stable, usually at night, this requirement is not met. Turbulent fluxes from these times, particularly the CO2 flux (Fco2), must be rejected or subsequent estimates of ecosystem respiration (ER) will be biased low.

A common method to detect times when Fco2 must be rejected is to filter the data based on the value of the friction velocity (u*). This approach assumes that a threshold value for u* can be found and that below this threshold there is insufficient turbulence to ensure the observations from the EC instruments are unbiased. PyFluxPro provides several methods to estimate the u* threshold, available under the Utilities menu. Each of these methods produces an Excel workbook containing the results and these workbooks can be read by PyFluxPro at L5 to provide the u* threshold estimates for filtering the data (see cpd_filename in the Files section).

An alternative approach, if an Excel workbook of results is not available, is to specify date ranges and associated u* threshold values in the (optional) ustar_threshold section.
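The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration only, not PyFluxPro's implementation: the function name apply_ustar_filter, the use of None to mark rejected values and the sample numbers are all assumptions made for the example.

```python
def apply_ustar_filter(fco2, ustar, threshold):
    """Reject Fco2 values measured when u* is below the threshold.

    Rejected or already missing values are returned as None (an assumption
    for this sketch; it is not how PyFluxPro stores missing data).
    """
    filtered = []
    for flux, u in zip(fco2, ustar):
        if flux is None or u is None or u < threshold:
            filtered.append(None)
        else:
            filtered.append(flux)
    return filtered


# Hypothetical half-hourly values: Fco2 in umol/m^2/s, u* in m/s.
fco2 = [2.1, 1.8, 0.3, 2.4]
ustar = [0.40, 0.31, 0.08, 0.27]
print(apply_ustar_filter(fco2, ustar, threshold=0.25))
```

Only the third record is rejected here because it is the only one with u* below the 0.25 m/s threshold used in the example.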

Gap Filling Turbulent Fluxes

General

The MDS gap filling method works well for short gaps of a few days to a couple of weeks provided the ecosystem does not change significantly during that time.  The SOLO neural network, using only meteorological drivers, can work well up to a month or so and performs reasonably well for gaps of a few months if the ecosystem does not change significantly during that time.  In general, MDS is good for short gaps and SOLO (using only meteorological drivers) is good for short to medium length gaps.

Both do rather badly for long gaps of more than a few months; MDS does really badly, SOLO can be OK but not great.  However, if the ecosystem is changing quickly during the gap e.g. savanna systems at the end of the dry season, then even SOLO can perform badly.  The case that highlighted this was Dry River site in the Northern Territory, Australia which has a gap of about 5 months that spans from the end of the dry season into the wet season in 2017.  SOLO, using only meteorological drivers, does not do a good job of gap filling this period because it trains on the wet seasons either side of the dry season gap.

The problem is that the meteorological drivers used for SOLO have no information about the state of the ecosystem, only the meteorology to which the ecosystem is responding.  By training SOLO on short periods, 2 months by default, the neural network allows the relationship between the drivers and the fluxes to change from one period to the next and this can account for changes in the underlying ecosystem.  However, when filling long gaps of several months, the neural network will use the same relationship between drivers and fluxes for the whole training plus gap period.  If the ecosystem changes significantly during this period (e.g. dry to wet season transition) then the neural network will do a poor job.

One way to fix this is to give the neural network a driver that contains information about the state of the ecosystem.  The obvious choice is a remote sensing product and OzFlux uses the Enhanced Vegetation Index (EVI) from MODIS but others could be used.  OzFlux acquires the MODIS data, processes it and puts a netCDF file containing the data onto CloudStor for site PIs to use.  Now, however, instead of training the neural network on a relatively short period, the neural network is trained on a period that contains at least 1 period, preferably more, where the ecosystem is in a similar state to that expected for the period of the long gap.  Using Dry River again as an example, if the site is missing a dry season then SOLO needs to be trained over a period that contains at least 1 dry season, preferably 2 or more.  In practice, training SOLO over the whole data set allows it to use several dry seasons to determine the relationship between the drivers, including EVI, and the fluxes.

So, a gap filling scheme to deal with both short and long gaps would be:

  1. Use SOLO (meteorological drivers only) to fill short gaps of up to, say, 2 weeks in length, see GapFillUsingSOLO.
  2. Use SOLO with EVI (or NDVI etc) plus meteorological drivers to fill long gaps, see GapFillLongSOLO.
  3. Merge the 2 sets of SOLO results such that short gaps are filled by 1. and long gaps are filled by 2.

This is what the long gap fill option of PyFluxPro does.  In order to flag the occurrence of long gaps, PyFluxPro now checks to see if any of the variables specified for gap filling have long gaps (14 days by default).  If they do and the long gap filling method hasn't been specified for that variable, PyFluxPro issues a warning message.  If the user chooses "Continue", PyFluxPro will divide the data set into the specified window length (2 months by default), use SOLO to gap fill those and if "Auto-complete" is chosen (the default option) it will fill long gaps by training on the data either side of the gap.
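The merge in step 3 can be pictured as choosing the fill source by gap length. The sketch below is an assumption-laden illustration rather than PyFluxPro code: the names find_gaps and merge_short_long are hypothetical, missing values are represented by None, and gap length is measured in records rather than days (14 days of half-hourly data would be 672 records).

```python
def find_gaps(series):
    """Return (start, end) index pairs for each run of None, end exclusive."""
    gaps, start = [], None
    for i, value in enumerate(series):
        if value is None and start is None:
            start = i
        elif value is not None and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(series)))
    return gaps


def merge_short_long(obs, short_fill, long_fill, max_short_len):
    """Fill short gaps from short_fill and longer gaps from long_fill."""
    merged = list(obs)
    for start, end in find_gaps(obs):
        source = short_fill if (end - start) <= max_short_len else long_fill
        merged[start:end] = source[start:end]
    return merged


# Hypothetical data: one 1-record gap and one 3-record gap.
obs = [1, None, 3, None, None, None, 7]
print(merge_short_long(obs, [10] * 7, [20] * 7, max_short_len=2))
```

The observations are never overwritten; only the gaps are filled, and each gap takes all of its values from a single source.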

The disadvantage of the above method is that gap filling using the SOLO neural network can be slow (see the GapFillUsingSOLO section for details). As an example, gap filling 4 variables (u*, Fh, Fe and Fco2) in a 10 year data set takes about 1 hour. This means that when both GapFillUsingSOLO and GapFillLongSOLO are used, gap filling 4 variables in a 10 year data set can take 2 hours because the neural network is run over the whole data set twice, once to fill short gaps and a second time to fill long gaps. An alternative is to use a combination of MDS to fill short gaps and SOLO to fill long gaps because MDS is very fast, taking less than 2 minutes to gap fill 3 variables (u* cannot be filled using the FluxNet MDS code) in a 10 year data set.

So, a better gap filling scheme to deal with both short and long gaps may be:

  1. Use MDS (meteorological drivers only) to fill short gaps of up to, say, 2 weeks in length, see GapFillUsingMDS.
  2. Use SOLO with EVI (or NDVI etc) plus meteorological drivers to fill long gaps, see GapFillLongSOLO.
  3. Merge the 2 sets of results such that short gaps are filled by MDS and long gaps are filled by SOLO.

This combination has not been tested and, as with any gap filling procedure, it is up to the user to make sure that the results are reasonable for their site.

Choosing the Drivers for Gap Filling

The drivers for the MDS gap filling method are currently restricted to incoming shortwave radiation (Fsd), air temperature (Ta) and vapour pressure deficit (VPD). There must be no gaps in the drivers. The drivers for the SOLO gap filling method can be any variable as long as there are no gaps in the drivers. There are, however, some considerations when choosing drivers for use with the SOLO neural network.

SOLO can do a very good job of gap filling provided it is given drivers that contain information that explains a significant amount of the variance in the target variable. For example, if the target has a pronounced diurnal cycle, then at least 1 of the drivers needs to have a similar diurnal cycle. In general, it is best to start with a set of drivers that the user believes, on the basis of their knowledge of how the ecosystem works, are important contributors to the diurnal and seasonal variations in the target. As an example, for gap filling latent heat flux (Fe), a good place to start would be with the variables used in the Penman-Monteith equation: available energy (Fa), air temperature (Ta), vapour pressure deficit (VPD), wind speed (WS, because it is important for aerodynamic resistance). For CO2 flux (Fco2), a good place to start would be the variables used in a light response function and the Lloyd-Taylor respiration model: incoming shortwave radiation (Fsd), vapour pressure deficit (VPD), air temperature (Ta) and soil temperature (Ts).

The inclusion of drivers in addition to those listed above can significantly improve the performance of SOLO because they contribute information on the target at different time scales and with different phase lags. For example, Ts often lags Ta by some hours and including this as a driver may help SOLO reproduce the observed diurnal cycle in Fco2. We recommend the user experiment with different drivers to find the combination that works best for their site.

Choosing the Period for Gap Filling

The MDS gap filling method operates across the entire data set but internally uses windows of varying sizes matched to the length of the gap being filled, see Reichstein et al (2005) for details. The MDS implementation in PyFluxPro does not allow the user to alter the windowing procedure used by the MDS method.

The SOLO neural network gap filling method can be applied to windows of arbitrary length or to the entire data set and we recommend the user investigates which approach gives the best performance. The reasoning behind the default choices in PyFluxPro is given below and hopefully will encourage users to experiment and find the optimal combination of methods and window lengths for their site.

The SOLO neural network method seeks to find the combination of drivers and driver weighting that explains the most variance in the target given the constraints of network architecture and training. In doing so, it defines, at least for the period over which the neural network is trained, the relationships between the drivers and the target and holds these relationships constant over the period for which the neural network is applied. With a long window, we are expecting the relationships between the drivers and the target determined by the neural network to remain the same over the entire window. With short windows, the neural network retrains frequently and this allows the relationships between the drivers and the target to change to reflect ecosystem and seasonal dynamics.

There are various compromises to be made when deciding on the optimal window length. Short window lengths allow the neural network to better track ecosystem and seasonal changes but reduce the amount of good data available for training the network and increase the execution time.

The L5 Control File

The L5 control file consists of the following sections:

  1. Files
  2. Imports (optional)
  3. Options
  4. ustar_threshold (optional)
  5. Fluxes

The contents of these sections and how to edit them are described below.

The Files Section

Description of the Files section

The Files section allows the user to specify the path to the input and output files, the names of the input and output files, the path for plots generated by the L5 processing and the full path and name of the u* threshold results file, see the screenshot below.

 Image of the Files section in an L5 control file

The entries in the Files section are as follows:

  1. file_path - the path to the data files
  2. in_filename - the input file name
  3. out_filename - the output file name
  4. plot_path - the path for plots generated by the L5 processing
  5. cpd_filename - the full path and name of the u* threshold results file

Editing the Files Section

The entries in the Files section can be edited by right clicking on the entry in the Value column or by double clicking on the entry in the Value column and manually entering the required text.

The Imports Section (optional)

Description of the Imports Section

The Imports section is optional. When present, it allows the user to import data from an external file and have this data available within PyFluxPro.  The Imports section is available via the GUI at levels L3, L4, L5 and L6.

Editing the Imports Section

Adding an Imports Section

Open the control file in the GUI (must be L3, L4, L5 or L6) and right click on the Files section.  This will display a small context-sensitive menu and one of the items in this menu will be Add Imports section.

Image of add Imports section to an L5 control file.

Select the Add Imports section menu entry and the Imports section will be added to the control file.

Image of an empty Imports section in an L5 control file.

Editing the Imports Section

Now fill in the variable name to use for the data in PyFluxPro (e.g. EVI in the example below), the file name of the external file to be read (you can right click and browse to select the file) and the name of the variable you want to import from the external file (e.g. EVI_smoothed in the example below).

Image of a completed Imports section in an L5 control file.

Adding the Imports section as above will result in PyFluxPro creating a variable called EVI using the data from the variable EVI_smoothed read from the file Calperum_250m_16_days_EVI.nc.  It will then be available to be used as a driver when filling long gaps using PyFluxPro.

The Options Section

Description of the Options section

The Options section allows the user to specify the options that control some aspects of the L5 processing. A list of the options available can be displayed by right clicking on the Options section title in the Parameter column, see the screenshot below.

Image of the Options section in an L5 control file.

The options are as follows:

  1. MaxGapInterpolate - the maximum length, in hours, of gaps to be filled by interpolation, default 3.
  2. MaxShortGapDays - the maximum length, in days, of gaps to be filled by GapFillUsingMDS and GapFillUsingSOLO, default 14. A warning is issued if longer gaps are found in the data. The user can ignore this warning and continue, in which case the gaps will be filled as described under GapFillUsingSOLO.
  3. FilterList - a comma separated list of variables to be filtered, default Fco2.
  4. TurbulenceFilter - the type of turbulence filter to be applied to the variables listed in FilterList, default ustar.
  5. DayNightFilter - variable to use when defining night time and day time, default Fsd (incoming shortwave radiation).
  6. AcceptDayTimes - accept day time data even when u* is below the threshold, default Yes.
  7. UseEveningFilter - only accept data within a specified number of hours after sunset, default No.
  8. EveningFilterLength - number of hours after sunset for which data is to be accepted, default 0 (no evening filter).
  9. Fsd_threshold - value of incoming shortwave radiation that defines day time, default 10 W/m^2.
  10. TruncateToImports - if data is imported using the Imports section, truncate the site data to the last date in the imported data, default Yes.

The default values are used if an option is not specified in the Options section.
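The fall-back to default values, and the way the day time options interact with the u* threshold, can be sketched as follows. The default values are those listed above; everything else is an assumption for illustration, including the names L5_DEFAULTS, resolve_options and keep_record and the use of a strict "greater than" comparison against Fsd_threshold to define day time. The evening filter is not shown.

```python
# Defaults as listed in the Options section above.
L5_DEFAULTS = {
    "MaxGapInterpolate": 3,
    "MaxShortGapDays": 14,
    "FilterList": "Fco2",
    "TurbulenceFilter": "ustar",
    "DayNightFilter": "Fsd",
    "AcceptDayTimes": "Yes",
    "UseEveningFilter": "No",
    "EveningFilterLength": 0,
    "Fsd_threshold": 10,
    "TruncateToImports": "Yes",
}


def resolve_options(user_options):
    """Return the defaults overridden by any options the user specified."""
    resolved = dict(L5_DEFAULTS)
    resolved.update(user_options)
    return resolved


def keep_record(ustar, fsd, ustar_threshold, options):
    """Decide whether one record passes the turbulence filter."""
    # Assumption: day time means Fsd strictly above Fsd_threshold.
    is_day = fsd > float(options["Fsd_threshold"])
    if is_day and options["AcceptDayTimes"] == "Yes":
        return True
    return ustar >= ustar_threshold


options = resolve_options({"MaxShortGapDays": 30})
print(options["MaxShortGapDays"], options["Fsd_threshold"])
print(keep_record(0.10, 500.0, 0.25, options))  # day time, low u*
print(keep_record(0.10, 0.0, 0.25, options))    # night time, low u*
```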

The ustar_threshold Section (optional)

Description of ustar_threshold Section

PyFluxPro allows the user to manually specify values for the u* threshold as an alternative to automatically reading the u* threshold values from an Excel workbook generated by the u* threshold detection routines. This is done using the ustar_threshold section.

Editing the ustar_threshold Section

Adding a ustar_threshold Section

A ustar_threshold section can be added to the control file by right clicking on the Files section title in the Parameter column and selecting the "Add u* threshold section" context menu entry, see below.

Image of adding a ustar threshold section to an L5 control file.

Note that if a ustar_threshold section is present in the L5 control file, the values specified in this section are chosen over any values read in from an Excel u* threshold results file. This means that the cpd_filename entry in the Files section is ignored if the u* threshold is manually specified in the ustar_threshold section. If you manually specify a u* threshold by adding a ustar_threshold section to the L5 control file then it is best to remove the cpd_filename entry from the Files section to avoid confusion.

Editing the Date Range

The ustar_threshold section consists of one or more numbered entries each of which specifies a start date, an end date and the u* threshold for that period, see the screenshot below.

Image of a ustar threshold section in an L5 control file.

The start date, end date and u* threshold value for the period are entered into the Value column, separated by commas (no spaces). The format of the start and end dates is YYYY-mm-dd HH:MM where YYYY is the year e.g. 2020, mm is the month e.g. 01, dd is the day e.g. 01, HH is the hour e.g. 00 and MM is the minute e.g. 30. These entries can be changed by double clicking on the template text in the Value column and editing the text as usual. The screenshot below shows a ustar_threshold section with 1 entry that specifies a u* threshold value of 0.25 m/s for the year of 2019.

Image of a completed ustar threshold section in an L5 control file.
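As an illustration of the entry format, the sketch below parses one entry and looks up the threshold for a given time. It is not taken from PyFluxPro: the function names, the example entry string and the choice to treat both ends of the date range as inclusive are assumptions.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M"  # YYYY-mm-dd HH:MM as described above


def parse_threshold_entry(entry):
    """Split 'start,end,value' into two datetimes and a float."""
    start, end, value = entry.split(",")
    return (
        datetime.strptime(start, DATE_FORMAT),
        datetime.strptime(end, DATE_FORMAT),
        float(value),
    )


def threshold_for(timestamp, entries):
    """Return the u* threshold for a timestamp, or None if no range matches."""
    for start, end, value in entries:
        if start <= timestamp <= end:  # assumption: both ends inclusive
            return value
    return None


# Hypothetical entry giving a threshold of 0.25 m/s for 2019.
entries = [parse_threshold_entry("2019-01-01 00:30,2020-01-01 00:00,0.25")]
print(threshold_for(datetime(2019, 6, 1, 12, 0), entries))
```

Because the separator is a bare comma, an entry written with spaces after the commas would fail to parse in this sketch, which is consistent with the "no spaces" requirement above.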

Adding a Date Range

More periods can be added to the ustar_threshold section by right clicking on the ustar_threshold section title in the Parameter column and selecting Add date range from the context menu, see the screenshot below.

Image of adding a date range to the ustar threshold section in an L5 control file.

The date range can be edited as described above.

The Fluxes Section

Description of the Fluxes section

The Fluxes section is where the user specifies the variables to be gap filled, the methods to be used to gap fill each variable and the gap filling method options. Each variable to be gap filled is a separate sub-section under the Fluxes section and each gap filling method to be used for the variable (MDS, SOLO and SOLO (long gaps)) is a separate sub-section under the variable sub-section. Every variable sub-section must also contain an instruction to merge the gap fill data with the original variable to produce the gap filled data (MergeSeries). The following sections describe each entry in the variable sub-section in detail.
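The nesting described above can be pictured as a nested dictionary. The sketch below mirrors that structure in Python and checks the two rules stated in this page: every variable needs a MergeSeries entry, and each gap fill output should appear in its source list. The dictionary layout, the driver lists and the function name check_variable are assumptions for illustration; the sketch does not show how PyFluxPro stores or validates a control file.

```python
METHODS = ("GapFillUsingMDS", "GapFillUsingSOLO", "GapFillLongSOLO")

# Hypothetical Fluxes section with one variable and two methods.
fluxes = {
    "Fco2": {
        "GapFillUsingSOLO": {"Fco2_SOLO": {"drivers": "Fsd,VPD,Ta,Ts"}},
        "GapFillLongSOLO": {"Fco2_LONG": {"drivers": "Fsd,VPD,Ta,Ts,EVI"}},
        "MergeSeries": {"source": "Fco2,Fco2_SOLO,Fco2_LONG"},
    },
}


def check_variable(name, section):
    """Return a list of problems found in one variable sub-section."""
    problems = []
    if "MergeSeries" not in section:
        problems.append(f"{name}: no MergeSeries entry")
        return problems
    sources = section["MergeSeries"]["source"].split(",")
    for method in METHODS:
        for output in section.get(method, {}):
            if output not in sources:
                problems.append(f"{name}: {output} is not listed in source")
    return problems


for name, section in fluxes.items():
    print(name, check_variable(name, section))
```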

GapFillUsingMDS

The screenshot below shows the use of the MDS gap filling method for CO2 flux (Fco2) at L5. Note that it is not possible to gap fill u* using the MDS method at present.

Image of the GapFillUsingMDS method in an L5 control file.

The entries under the GapFillUsingMDS sub-section are as follows:

  1. Fco2_MDS - the name of the variable that will contain the data generated by the MDS gap filling method.
    1. drivers - a comma separated list of variables to be used as drivers for the gap filling method. For the MDS gap filling method, only incoming shortwave radiation (Fsd), air temperature (Ta) and vapour pressure deficit (VPD) can be used and they must be listed in that order. There must be no gaps in the drivers.
    2. tolerances - a comma separated list containing 4 values with the first 2 values enclosed in parentheses. The first 2 values (enclosed in parentheses) are the bin width for Fsd for the 3 driver and 1 driver cases. The remaining 2 values are the bin widths for Ta and VPD respectively. See Reichstein et al (2005) for details on how the MDS gap filling method uses the drivers and the bin widths.
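To show how bin widths of this kind are used, the sketch below implements only the basic look-up idea behind MDS: average the target over nearby records whose drivers are all within the tolerances of the gap record. It is a simplified illustration, not the ONEFlux C code or PyFluxPro's wrapper. The function name mds_lookup, the fixed window in records, the None missing value and the tolerance values in the example are assumptions, and the widening windows and single driver fall-back of the full method are not shown.

```python
def mds_lookup(target, fsd, ta, vpd, gap_index, window, tolerances):
    """Mean of the target under driver conditions similar to the gap record.

    window is the number of records searched either side of the gap;
    tolerances is (fsd_tol, ta_tol, vpd_tol). Returns None if no similar
    record with a valid target value is found.
    """
    fsd_tol, ta_tol, vpd_tol = tolerances
    lo = max(0, gap_index - window)
    hi = min(len(target), gap_index + window + 1)
    similar = [
        target[i]
        for i in range(lo, hi)
        if target[i] is not None
        and abs(fsd[i] - fsd[gap_index]) < fsd_tol
        and abs(ta[i] - ta[gap_index]) < ta_tol
        and abs(vpd[i] - vpd[gap_index]) < vpd_tol
    ]
    if not similar:
        return None
    return sum(similar) / len(similar)


# Hypothetical records; the second Fco2 value is missing.
fco2 = [-4.0, None, -6.0, 2.0]
fsd = [400.0, 410.0, 420.0, 0.0]
ta = [20.0, 21.0, 22.0, 10.0]
vpd = [1.0, 1.1, 1.2, 0.3]
print(mds_lookup(fco2, fsd, ta, vpd, gap_index=1, window=3,
                 tolerances=(50.0, 2.5, 0.5)))
```

In this example the night time record is excluded because its Fsd differs from the gap record by far more than the tolerance, so the fill value is the mean of the two day time records.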

GapFillUsingSOLO

The screenshot below shows the use of the SOLO neural network to gap fill the CO2 flux (Fco2). Note that any variable can be gap filled using the SOLO neural network provided the appropriate drivers are available.

Image of the GapFillUsingSOLO method in an L5 control file.

The entries under the GapFillUsingSOLO sub-section are as follows:

  1. Fco2_SOLO - the name of the variable that will contain the data generated by the SOLO gap filling method.
    1. drivers - a comma separated list of variables to be used as drivers for the gap filling method. There must be no gaps in the drivers. See the Choosing the Drivers for Gap Filling section for guidance on choosing the best set of drivers for your site.

GapFillLongSOLO

The screenshot below shows the use of the SOLO neural network to fill long (>30 days) gaps in the CO2 flux (Fco2). Note that any variable can be gap filled using the SOLO neural network provided the appropriate drivers are available.

Image of the GapFillLongSOLO method in an L5 control file.

The entries under the GapFillLongSOLO sub-section are the same as those described in the GapFillUsingSOLO section above. Note that the list of drivers now includes the Enhanced Vegetation Index (EVI) and that this variable is read from a specially prepared MODIS data file imported in the Imports section. The differences between GapFillUsingSOLO and GapFillLongSOLO and when to use both are described in the General sub-section of the Gap Filling section.

MergeSeries

The screenshots above for the GapFillUsingMDS, GapFillUsingSOLO and GapFillLongSOLO gap filling methods also show the use of MergeSeries to combine the observations e.g. Fco2 with the data generated by the gap filling method e.g. Fco2_MDS, Fco2_SOLO and Fco2_LONG. The order in which the variables are listed in source, left to right, determines their precedence in the merging operation. For the examples above, Fco2 (the observations) is used if present, then Fco2_SOLO if the observations are missing and the gap is less than 30 days in length, and finally Fco2_LONG if the observations are missing and the gap is longer than 30 days.
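The left to right precedence can be sketched as taking, for each time step, the first non-missing value among the listed sources. This is an illustration under assumptions, not the PyFluxPro routine: the function name merge_series and the None missing value are hypothetical, and the sketch assumes the short gap series is itself missing inside long gaps so that the long gap series is reached.

```python
def merge_series(*sources):
    """Merge series by precedence: the first non-missing value wins."""
    merged = []
    for values in zip(*sources):
        merged.append(next((v for v in values if v is not None), None))
    return merged


# Hypothetical series listed in the same order as in source.
fco2 = [1.0, None, None, None]
fco2_solo = [9.0, 2.0, None, None]
fco2_long = [8.0, 8.0, 3.0, None]
print(merge_series(fco2, fco2_solo, fco2_long))
```

A time step stays missing only when every source is missing, as in the last record of the example.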

Editing of the Fluxes section

Editing the contents of the Fluxes section is similar to editing other sections in the L5 control file. Items can be added to or removed from the section using a context-sensitive menu that is displayed when the user right clicks on the section or sub-section titles in the Parameter column. Entries in the Value column can be edited by double clicking on the text in the Value column and editing the text.

Removing a Variable

Variables can be removed from the Fluxes section by right clicking on the variable name and selecting Remove variable, see the screenshot below.

Image of removing a variable from an L5 control file.

Adding a New Variable

New variables can be added to this section by right clicking on the Fluxes section title in the Parameter column and selecting Add variable from the displayed context menu, see the screenshot below.

Image of adding new variable to L5 control file.

The new variable is added after the last entry in the Fluxes section and is given the name <var>. You can change the position of the new variable in the Fluxes section by selecting it and dragging to the new location. The new variable is added with all gap filling methods and MergeSeries, see the screenshot below.

Image of new variable added to L5 control file.

Editing a New Variable

Unwanted gap filling methods can be removed by right clicking on the gap filling method section (GapFillUsingMDS, GapFillUsingSOLO or GapFillLongSOLO) and selecting Remove method, see the screenshot below.

Image of remove gap fill method from L5 control file.

Note that when you remove a method, the corresponding variable name is removed from MergeSeries. Once you have removed the unwanted methods, double click on all of the entries containing <var> and replace this with the variable name. If the gap filling method is one of the SOLO options, then double click on the empty entry in the Value column to the right of drivers and type in a comma separated list of driver variable names (no spaces), see the screenshot below.

Image of edit new variable in L5 control file.

Adding a Method to an Existing Variable

A gap filling method can be added to an existing variable when required, for example, if the target contains gaps longer than the threshold specified by MaxShortGapDays in the Options section. To add a gap filling method to an existing variable, right click on the variable name in the Fluxes section and select the gap filling method from the context menu, see the screenshot below.

Image of adding gap fill method to an existing variable in L5 control file.

Note that when you add a gap filling method to an existing variable, the variable name for the method is completed, a default set of drivers is defined and the method variable name is added to MergeSeries automagically, see the screenshot below.

Image of gap fill method added to an existing variable in L5 control file.

Running L5

Once the user has finished editing the L5 control file, it can be run by using the Current option of the Run entry on the main menu. The shortcut to run the current control file is Ctrl+R (press and hold down the control key and press the R key).

The GapFillUsingSOLO and GapFillLongSOLO methods have a second stage of user input when they are run. The second stage allows the user to specify options for the neural network, these are explained in later sections.

Detection of Long Gaps

PyFluxPro checks the targets for long gaps before running the gap filling methods chosen by the user. The threshold defining long gaps is specified by the MaxShortGapDays value in the Options section. The default value, used if the option is not present in the section, is 14 days.

If 1 or more of the targets has gaps longer than MaxShortGapDays and a long gap filling method has not been specified then PyFluxPro issues the following warning.

Image of the L5 long gaps warning

The user can press Continue to ignore this warning and allow PyFluxPro to proceed without a gap filling method optimised for long gaps. Pressing Quit will abort the L5 gap filling, allowing the user to return to the L5 control file and edit it to include a gap filling method optimised for long gaps.
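The check described in this section amounts to finding the longest run of missing values in each target and comparing it with MaxShortGapDays. A minimal sketch, assuming missing values are None and using the hypothetical names longest_gap_days and needs_long_gap_warning (neither is a PyFluxPro function):

```python
def longest_gap_days(series, records_per_day):
    """Length in days of the longest run of missing (None) values."""
    longest = current = 0
    for value in series:
        current = current + 1 if value is None else 0
        longest = max(longest, current)
    return longest / records_per_day


def needs_long_gap_warning(series, records_per_day, max_short_gap_days=14,
                           has_long_method=False):
    """True if a long gap exists and no long gap method was specified."""
    if has_long_method:
        return False
    return longest_gap_days(series, records_per_day) > max_short_gap_days


# Hypothetical half-hourly target (48 records per day) with a 15 day gap.
target = [1.0] + [None] * (48 * 15) + [2.0]
print(longest_gap_days(target, 48))
print(needs_long_gap_warning(target, 48))
```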

The GapFillUsingSOLO GUI

Running the L5 control file with the GapFillUsingSOLO gap filling method will bring up a small GUI that allows the user to specify the options for the neural network, see the screenshot below. All of the neural network option fields are filled out with default values that are expected to work well in most situations but it is recommended that the user check the results of the SOLO gap filling and change the options as required.

Image of the GapFillUsingSOLO GUI

The top 2 rows show the start and end date of the data set being gap filled.

The bottom row contains 3 buttons. The Run button starts the SOLO gap fill process once the user has modified the neural network options if required. The Quit button allows the user to quit from the L5 gap filling process (before Run is pressed). Once the L5 gap filling process has completed and the user is happy with the results, pressing the Done button completes the gap filling process, merges the gap fill data with the observations and writes the L5 output file.

The Nodes, Training, Nda factor, Learning and Iterations fields specify the options used for the SOLO neural network. Nodes determines the number of clusters in data space to which the drivers will be mapped. SOLO will execute quickly for low numbers of Nodes (~5) but may not provide a good fit to the target. Higher numbers for Nodes (10 to 25) will give a better fit but SOLO will take longer to execute. The default is Auto and for this setting, SOLO will set the number of Nodes to the number of drivers plus 1. Training specifies the number of iterations over which the neural network will be trained. Increasing this number may improve the fit between the neural network predictions and the data at the cost of increased execution time. Nda factor controls the number of driver data points that a cluster (see Nodes) must contain before it will be used when predicting the target value. A low number may allow SOLO to perform better but increases the chance of over-fitting. Iterations specifies the number of iterations over which the neural network will be evaluated during validation and Learning specifies the rate at which the neural network weights can be adjusted during validation.

The Manual radio button and the Start date and End date text entry boxes are used for manual runs of the SOLO neural network where the user wishes to run SOLO for a specific period of the data set. Checking the Manual radio button and entering the start and end dates in the text entry boxes will run SOLO for that period only. This allows the user to step through the data set manually to better supervise the behaviour of the neural network or to step through a data set using windows of different lengths.

The Months and Days radio buttons and their associated text entry boxes are used for automated runs of the SOLO neural network. An automated run will step through the whole data set using the specified window size (months or days).
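Stepping through a data set in fixed windows can be sketched as below for the Months case. This is only an illustration of the idea: the function names are hypothetical, the sketch assumes the data set starts on the first day of a month, and how PyFluxPro handles a shorter final window is not described here, so the sketch simply truncates it at the end of the data.

```python
from datetime import datetime


def add_months(date, months):
    """Shift a date by whole months (assumes the day exists in the result)."""
    total = date.year * 12 + (date.month - 1) + months
    return date.replace(year=total // 12, month=total % 12 + 1)


def month_windows(start, end, months=2):
    """Split [start, end) into consecutive windows of the given length."""
    windows = []
    window_start = start
    while window_start < end:
        window_end = min(add_months(window_start, months), end)
        windows.append((window_start, window_end))
        window_start = window_end
    return windows


# Hypothetical 5 month data set stepped through in 2 month windows.
for window in month_windows(datetime(2019, 1, 1), datetime(2019, 6, 1)):
    print(window[0].date(), "to", window[1].date())
```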

The Min pts (%) text entry box specifies the minimum percentage of good data that must be present in the period (manual or automatic) for the gap filling to proceed. Specifying this value ensures the neural network is trained on an adequate sample of good data.

The Auto complete check box specifies how to treat periods that do not contain sufficient good data (see Min pts (%)). When Auto complete is checked, PyFluxPro will set aside any periods with less than Min pts (%) good data and continue with the next period. Once all periods have been processed, PyFluxPro will return to those set aside and will extend these in increments of 2 days (start 1 day earlier, end 1 day later) until Min pts (%) is satisfied. This ensures that the shortest window length is used while still satisfying the Min pts (%) criterion.
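The extension step can be sketched as a loop that widens a window by 1 day at each end until it contains enough good data. The names percent_good and extend_window, the None missing value and the behaviour when the whole data set is still insufficient are assumptions for this illustration, not a description of the PyFluxPro code.

```python
def percent_good(series, start, end):
    """Percentage of non-missing values in series[start:end]."""
    window = series[start:end]
    good = sum(1 for v in window if v is not None)
    return 100.0 * good / len(window)


def extend_window(series, start, end, min_pts, records_per_day):
    """Widen a window 1 day at each end until Min pts (%) is satisfied."""
    while percent_good(series, start, end) < min_pts:
        if start == 0 and end == len(series):
            return None  # the whole data set still has too little good data
        start = max(0, start - records_per_day)
        end = min(len(series), end + records_per_day)
    return start, end


# Hypothetical series with 2 records per day and a 2 day gap in the middle.
series = [1, 1, 1, 1, None, None, None, None, 1, 1, 1, 1]
print(extend_window(series, start=4, end=8, min_pts=50, records_per_day=2))
```

In the example a single extension is enough: the widened window holds 4 good values out of 8, which meets the 50% criterion.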

The Show plots check box controls display of the plots produced by GapFillUsingSOLO. Plots are not displayed to the screen when this box is unchecked but hard copies of the plots (PNG files) are still produced. By default, PyFluxPro only displays plots when the data period being processed contains gaps. Checking the Plot all check box will cause plots to be displayed for all periods regardless of whether the target contains gaps or not.

The GapFillLongSOLO GUI

Running the L5 control file with the GapFillLongSOLO gap filling method will bring up a small GUI that allows the user to specify the options for the neural network, see the screenshot below. All of the neural network option fields are filled out with default values that are expected to work well in most situations but it is recommended that the user check the results of the SOLO gap filling and change the options as required.

Image of the GapFillLongSOLO GUI

The contents of the GUI are the same as for GapFillUsingSOLO but some of the default values are different.

For GapFillLongSOLO, the Manual radio button is checked by default and the start and end dates of the data set are entered in the Start date and End date text entry boxes. These settings mean that the SOLO neural network will be run on the whole data set at once, not in the windowed fashion used by GapFillUsingSOLO.

Output from Running L5

Plots Produced During L5

Plots of the intermediate data produced during the gap filling process are displayed on the screen and, by default, hard copies of the plots are saved as PNG files. The types of plot depend on the gap filling method and are described below.

Coverage Plot

The L5 processing always displays a coverage plot of the variables being gap filled before applying the gap fill methods, see the screenshot below.

Image of the L5 coverage plot

The coverage plot shows a time line for each variable with gaps in the data represented by gaps in the time lines. The variable names are shown on the left hand Y axis and the percentage of good data present in each variable being gap filled is shown on the right hand Y axis. The X axis is the date. The example shown above, from the Calperum OzFlux site, shows 9 years of data with a large gap (~3 months) in early 2014 when the flux tower and instruments were damaged by a wildfire. Note that the u* filter has been applied to the CO2 flux, Fco2, and this explains most of the difference between the coverage of the latent heat flux, Fe, at 92% and of Fco2 at 74%.
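
The two quantities shown on the coverage plot, the percentage of good data and the location of the gaps, can be computed along the following lines. This is a sketch only; the function name `coverage` is hypothetical and missing data are assumed to be stored as NaN, which may differ from the representation PyFluxPro uses internally.

```python
import math

def coverage(values):
    """Return (percent_good, gaps) for one variable.

    gaps is a list of (first_index, last_index) pairs, one per run of
    consecutive missing (NaN) values.
    """
    good = 0
    gaps = []
    gap_start = None
    for i, v in enumerate(values):
        if math.isnan(v):
            if gap_start is None:
                gap_start = i  # a new gap begins here
        else:
            good += 1
            if gap_start is not None:
                gaps.append((gap_start, i - 1))  # the gap ended at i - 1
                gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, len(values) - 1))  # gap runs to the end
    percent = 100.0 * good / len(values) if values else 0.0
    return percent, gaps

nan = math.nan
series = {"Fe": [1.0, 2.0, nan, 4.0], "Fco2": [nan, nan, 0.5, nan]}
summary = {name: coverage(vals) for name, vals in series.items()}
```

Here `summary["Fe"]` holds the coverage percentage and gap positions for the made-up Fe series, and likewise for Fco2.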

The progress of the L5 gap filling process is shown by thick lines plotted over the thin time lines of the original coverage plot, see the screenshot below.

Image of the coverage plot for partially complete L5 processing

Variable Plot for Each Window

PyFluxPro produces plots of the target and driver variables for each period used in the gap filling process. A single plot window is updated as the results become available for each target variable. The plot windows are the same for GapFillUsingSOLO and GapFillLongSOLO but the plot window for GapFillUsingMDS is slightly different.

GapFillUsingSOLO and GapFillLongSOLO

The screenshot below shows a typical plot for these methods, in this case for Fco2.

Image of plot produced for Fco2 during the L5 GapFillUsingSOLO process

The plot consists of several elements:

  1. The top section consists of time series plots for each driver and the target. The uppermost time series are the drivers with blue lines representing observations and red lines representing gap filled data. The bottom time series is the target with blue points representing observations and the red line representing the gap filled data.
  2. The bottom left plot is the diel variation of the target for the period shown in the time series. The blue line is the observations, the red line is the prediction from the gap filling method and the green line is the gap fill data for those times when the observations are present. Comparing the blue and green lines gives the best indication of how well the gap filling data match the observations.
  3. The centre plot in the bottom row is a scatter plot of the observations (Y axis) versus the gap filling data (X axis). The line of best fit is shown as a dashed, red line and the equation of best fit is given at the top of the scatter plot.
  4. The bottom right section of the plot gives details of the gap filling parameters used for this variable and period, and statistics for the agreement between the gap fill method predictions and the observations. Because of the way in which SOLO works, Slope and Offset are expected to be 1.0 and 0 respectively. r is the Pearson product-moment correlation coefficient.
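
The diel averages and fit statistics described in the list above can be sketched as follows. This is an illustration only, not the code PyFluxPro uses: the function names `diel_mean` and `fit_stats` are hypothetical, missing observations are assumed to be `None`, and an ordinary least-squares fit is assumed for the line of best fit.

```python
from math import sqrt

def diel_mean(hours, values):
    """Mean of values grouped by time of day, skipping missing values."""
    sums, counts = {}, {}
    for h, v in zip(hours, values):
        if v is None:
            continue
        sums[h] = sums.get(h, 0.0) + v
        counts[h] = counts.get(h, 0) + 1
    return {h: sums[h] / counts[h] for h in sorted(sums)}

def fit_stats(obs, pred):
    """Fit obs = slope * pred + offset and compute Pearson r.

    Only times when the observation is present are used, matching the
    scatter plot (observations on Y, gap filling data on X).
    """
    pairs = [(p, o) for o, p in zip(obs, pred) if o is not None]
    n = len(pairs)
    mean_x = sum(p for p, _ in pairs) / n
    mean_y = sum(o for _, o in pairs) / n
    sxx = sum((p - mean_x) ** 2 for p, _ in pairs)
    syy = sum((o - mean_y) ** 2 for _, o in pairs)
    sxy = sum((p - mean_x) * (o - mean_y) for p, o in pairs)
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, offset, r

# Made-up data: the prediction matches the observations where they exist
obs = [1.0, None, 3.0, 5.0]
pred = [1.0, 2.0, 3.0, 5.0]
slope, offset, r = fit_stats(obs, pred)
diel = diel_mean([0, 12, 0, 12], [1.0, 3.0, 3.0, None])
```

A slope close to 1, an offset close to 0 and r close to 1 indicate that the gap filling data reproduce the observations well for that period.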

GapFillUsingMDS

The screenshot below shows a typical plot for this method, in this case for Fco2.

Image of plot produced for Fco2 during the L5 GapFillUsingMDS process
