Running list of TODOs #9

@orianac

For MVP for Washington:

  • basic form of training dataset:
    • All GLAS shots translated to biomass using one allometric equation (Cindy) [done]
    • Look up sampling strategy of GLAS and allometric equation assumptions wrt leaf conditions (Cindy/Ori) [done]
    • Calculate a seasonal average for each year from Landsat as a spatially continuous map for WA (Ori + Joe). Relatedly, decide on the Landsat data structure (snap to a uniform Hansen 30 m grid, one layer per year)
    • Extract the Landsat variables to use into a tabular format (all raw bands)
  • set up ML model for training (Cindy)
    • random forest + XGBoost!
    • set up inference function
  • Set up inference inputs
    • extract the same Landsat variables into tabular format for all of Washington
  • Plotting function for ML model output (Altair) (Ori), over lat/lon/time
    • spatial maps
    • time series
  • Set up validation dataset
    • Find 4 well-respected datasets
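
A minimal sketch of the seasonal-average and tabular-extraction steps above, using only the standard library. The input layout (one dict per observation with `pixel`, `year`, `month`, `bands`) and the June to September season are assumptions for illustration; the issue does not fix either.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical growing-season months; not specified in the issue.
SEASON_MONTHS = frozenset({6, 7, 8, 9})


def seasonal_means(observations, season_months=SEASON_MONTHS):
    """Average every band per (pixel, year) over the chosen season.

    observations: iterable of dicts with keys 'pixel' (e.g. a (row, col)
    tuple on the 30 m grid), 'year', 'month' and 'bands' (band name -> value).
    Returns one flat row per (pixel, year), ready to load into a table.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for obs in observations:
        if obs["month"] not in season_months:
            continue  # skip observations outside the season
        key = (obs["pixel"], obs["year"])
        for band, value in obs["bands"].items():
            grouped[key][band].append(value)

    rows = []
    for (pixel, year), bands in sorted(grouped.items()):
        row = {"pixel": pixel, "year": year}
        row.update({band: mean(values) for band, values in bands.items()})
        rows.append(row)
    return rows
```

The same rows could feed both training (joined to GLAS shots) and inference (all of Washington), which keeps the feature definitions identical in both places.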

To expand to global:

  • Translating the Harris et al. spreadsheet into Python
    • Mask of column 2 (ecoregion + NLCD) -> allometric equation
    • allometric equation = dictionary of functions
    • height metrics = another dictionary of functions [done]
    • parameter to indicate whether to preprocess (whether input is smooth or raw)
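
A rough sketch of the "dictionary of functions" idea above. The equation forms, coefficients, ecoregion name and metric names (`h50`, `h90`) are placeholders invented for illustration and are not taken from the Harris et al. spreadsheet; only the lookup pattern is the point.

```python
# Placeholder equations: forms and coefficients are made up for illustration
# and are NOT values from the Harris et al. spreadsheet.
def _evergreen_placeholder(metrics):
    return 1.5 * metrics["h50"] ** 2


def _deciduous_placeholder(metrics):
    return 10.0 + 4.0 * metrics["h90"]


# Key: (ecoregion, NLCD class). The ecoregion name is hypothetical;
# 41 and 42 are the NLCD deciduous and evergreen forest classes.
ALLOMETRIC_EQUATIONS = {
    ("ecoregion_a", 41): _deciduous_placeholder,
    ("ecoregion_a", 42): _evergreen_placeholder,
}


def biomass(ecoregion, nlcd_class, metrics):
    """Look up the equation for this mask value and apply it to the metrics."""
    key = (ecoregion, nlcd_class)
    if key not in ALLOMETRIC_EQUATIONS:
        raise KeyError(f"no allometric equation registered for {key}")
    return ALLOMETRIC_EQUATIONS[key](metrics)
```

Keeping the mask value as the dictionary key means that adding a row from the spreadsheet is one new function plus one new entry, and a missing combination fails loudly instead of silently falling back to a default equation.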

Improvements by April:
GLAS/biomass:

  • apply GLAS filtering based on Harris et al. (Cindy) [done]
  • double check how GLAS elevation should be calculated from GLAH14 data
  • decide whether we should use the smoothed or raw waveform for height metric calculations
  • Double check terrain calculations by reading Duncanson et al. more closely
  • potentially rename the raw extracted GLAS data back to the original variable names
  • interpolate between bins (currently at 15cm intervals)
  • double check that compression ratio does not change during the valid signal part (between sig beg and sig end)
  • Figure out which allometric equations can be used for leaf-off conditions. Allometric equations are trained predominantly on leaf-on conditions, so we should determine whether estimates for leaf-off conditions are valid. This is relevant for our reporting/updating interval. Proposal: update twice a year, after the end of the growing season in each hemisphere (September and March(?)).
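
For the "interpolate between bins" item, a minimal sketch of one way to get sub-bin heights: find where cumulative waveform energy crosses a target fraction and interpolate linearly inside that bin. The bin ordering (ground upward), the assumption that energy is spread evenly within a bin, and the function name are all illustrative choices, not the current implementation.

```python
BIN_SIZE_M = 0.15  # bin spacing from the list above (15 cm)


def height_at_energy_fraction(waveform, fraction, bin_size=BIN_SIZE_M):
    """Height (m above the first bin) where cumulative energy hits `fraction`.

    waveform: non-negative energy per bin, assumed ordered from the ground up.
    Each bin i is treated as spanning [i * bin_size, (i + 1) * bin_size].
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    total = sum(waveform)
    if total <= 0:
        raise ValueError("waveform has no energy")

    target = fraction * total
    cumulative = 0.0
    for i, energy in enumerate(waveform):
        if energy > 0 and cumulative + energy >= target:
            # Linear interpolation inside bin i instead of snapping to its edge.
            within = (target - cumulative) / energy
            return (i + within) * bin_size
        cumulative += energy
    return len(waveform) * bin_size  # only reached through rounding error
```

Without the interpolation step every height metric would be quantized to multiples of 15 cm.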

Landsat

  • Masking clouds (potentially via https://github.com/ubarsc/python-fmask, or potentially using the *_BQA.TIF files in the Landsat archive)
  • Smoothing Landsat images using CCDC
  • Grabbing multiple Landsat pixels for each GLAS record? The GLAS footprint has a 70 m diameter and Landsat pixels are 30 m, so we could use 4 Landsat pixels? Or the bounding box of all overlapping Landsat pixels?
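
To make the footprint question concrete, here is a small sketch that lists every 30 m pixel touched by a 70 m circular footprint. It assumes projected coordinates in metres, a grid whose origin is at (0, 0), and a perfectly circular footprint; all three are simplifications for illustration.

```python
import math


def pixels_in_footprint(x, y, diameter=70.0, pixel=30.0):
    """Return (col, row) indices of grid pixels that intersect the footprint.

    (x, y): footprint centre in metres on a grid with origin (0, 0).
    A pixel counts as a hit if any part of its square lies within the circle.
    """
    radius = diameter / 2.0
    col_min = math.floor((x - radius) / pixel)
    col_max = math.floor((x + radius) / pixel)
    row_min = math.floor((y - radius) / pixel)
    row_max = math.floor((y + radius) / pixel)

    hits = []
    for row in range(row_min, row_max + 1):
        for col in range(col_min, col_max + 1):
            # Nearest point of this pixel's square to the footprint centre.
            nearest_x = min(max(x, col * pixel), (col + 1) * pixel)
            nearest_y = min(max(y, row * pixel), (row + 1) * pixel)
            if (nearest_x - x) ** 2 + (nearest_y - y) ** 2 <= radius ** 2:
                hits.append((col, row))
    return hits
```

Under these assumptions a footprint centred on a pixel centre touches a 3 x 3 block, so 4 pixels would cover only part of the footprint; weighting pixels by overlap area would be a natural refinement.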

ML model

  • Training a different model for each ecoregion
  • Incorporating a climate dataset into the training of the model (others have used WorldClim, though we could use TerraClimate)
  • out of sample validation
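
One possible shape for the out-of-sample validation item, sketched with the standard library only: hold out one group at a time and score on it. Grouping by ecoregion is an assumption here (it pairs naturally with the per-ecoregion idea above); spatial blocks or years would work the same way.

```python
import math


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) for each distinct group label."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```

Each split would train the model on `train_idx`, predict on `test_idx`, and report `rmse` per held-out group, so weak regions show up individually rather than being averaged away.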
