Add NISAR loading and auxiliary stack support#1487
Add NISAR loading and auxiliary stack support#1487ehavazli wants to merge 17 commits intoinsarlab:mainfrom
Conversation
Reviewer's GuideRefactors and extends NISAR ingestion to require a real DEM, validate inputs earlier, align DEM/mask warping precisely to the NISAR grid, and add support for geometry enhancements plus new ionosphere, troposphere, SET, and standalone water mask stacks while tightening subset and metadata handling. Sequence diagram for the updated NISAR loading and stack generation workflowsequenceDiagram
actor User
participant MintpyCLI as mintpy.load (CLI)
participant LoadData as load_data.prepare_metadata
participant PrepNisar as prep_nisar.load_nisar
participant PrepGeom as prepare_geometry
participant PrepWater as prepare_water_mask
participant PrepStackIFG as prepare_stack_ifgram
participant PrepStackION as prepare_stack_ion
participant PrepStackTRO as prepare_stack_tropo
participant PrepStackSET as prepare_stack_set
User->>MintpyCLI: run with processor=nisar
MintpyCLI->>LoadData: prepare_metadata(iDict)
Note over LoadData: Resolve DEM path
LoadData->>LoadData: validate demFile not in auto/none/no
LoadData->>LoadData: glob demFile
LoadData-->>User: raise FileNotFoundError if no DEM
Note over LoadData: Validate GUNW inputs
LoadData->>LoadData: glob GUNW unwFile pattern
LoadData-->>User: raise FileNotFoundError if no GUNW
Note over LoadData: Optional water mask
LoadData->>LoadData: only add --mask if waterMaskFile exists and not auto/none/no
LoadData->>PrepNisar: call prep_nisar.py -i GUNW -d DEM [--mask MASK]
PrepNisar->>PrepNisar: load_nisar(inps)
PrepNisar->>PrepNisar: glob input_glob for GUNW
PrepNisar-->>User: raise FileNotFoundError if no GUNW
PrepNisar->>PrepNisar: validate dem_file string and file existence
PrepNisar-->>User: raise ValueError/FileNotFoundError on invalid DEM
PrepNisar->>PrepNisar: extract_metadata(input_files, bbox, polarization)
PrepNisar->>PrepGeom: prepare_geometry(geometryGeo.h5, metaFile, bbox, demFile, maskFile, polarization)
PrepGeom->>PrepGeom: read_and_interpolate_geometry
PrepGeom-->>PrepNisar: geometryGeo.h5 metadata
alt mask_file provided
PrepNisar->>PrepWater: prepare_water_mask(waterMask.h5, metaFile, metadata, bbox, maskFile, polarization)
PrepWater->>PrepWater: warp mask to NISAR grid and apply valid_unw mask
PrepWater-->>PrepNisar: waterMask.h5
end
PrepNisar->>PrepStackIFG: prepare_stack(ifgramStack.h5, inp_files, metadata, demFile, bbox, date12_list, polarization, ifgram)
PrepStackIFG->>PrepStackIFG: _read_stack_observation(stack_type=ifgram)
PrepStackIFG-->>PrepNisar: ifgramStack.h5
PrepNisar->>PrepStackION: prepare_stack(ionStack.h5, ..., stack_type=ion)
PrepStackION->>PrepStackION: _read_stack_observation(stack_type=ion)
PrepStackION-->>PrepNisar: ionStack.h5
PrepNisar->>PrepStackTRO: prepare_stack(tropoStack.h5, ..., stack_type=tropo)
PrepStackTRO->>PrepStackTRO: read_and_interpolate_troposphere
PrepStackTRO-->>PrepNisar: tropoStack.h5
PrepNisar->>PrepStackSET: prepare_stack(setStack.h5, ..., stack_type=set)
PrepStackSET->>PrepStackSET: read_and_interpolate_SET
PrepStackSET-->>PrepNisar: setStack.h5
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The logic in
prepare_stackandload_nisarbranches on stack type by matching substrings of the output path (e.g.,'inputs/ifgramStack.h5' in outfile), which is brittle; consider passing an explicitstack_typeenum/flag to make the behavior independent of filename conventions. - DEM warping and radar-grid interpolation are implemented separately in
read_and_interpolate_geometry,read_and_interpolate_troposphere, andread_and_interpolate_SET; you could factor out the common warp and interpolation scaffolding to reduce duplication and keep the behavior consistent across geometry/tropo/SET products. - In
_read_raster_epsg, the code assumes a projection with an EPSGAUTHORITY; for robustness, consider handling cases where the projection is missing or uses a different authority (e.g., derive CRS via WKT parsing or raise a clearer error that includes the raw projection string).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The logic in `prepare_stack` and `load_nisar` branches on stack type by matching substrings of the output path (e.g., `'inputs/ifgramStack.h5' in outfile`), which is brittle; consider passing an explicit `stack_type` enum/flag to make the behavior independent of filename conventions.
- DEM warping and radar-grid interpolation are implemented separately in `read_and_interpolate_geometry`, `read_and_interpolate_troposphere`, and `read_and_interpolate_SET`; you could factor out the common warp and interpolation scaffolding to reduce duplication and keep the behavior consistent across geometry/tropo/SET products.
- In `_read_raster_epsg`, the code assumes a projection with an EPSG `AUTHORITY`; for robustness, consider handling cases where the projection is missing or uses a different authority (e.g., derive CRS via WKT parsing or raise a clearer error that includes the raw projection string).
## Individual Comments
### Comment 1
<location path="src/mintpy/prep_nisar.py" line_range="281-266" />
<code_context>
+ polarization=pol,
)
+ # ifgram stack
prepare_stack(
outfile=stack_file,
inp_files=input_files,
metadata=metadata,
+ demFile=inps.dem_file,
+ bbox=bounds,
+ date12_list=date12_list,
+ polarization=pol,
+ )
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The same `prepare_stack` helper is used for multiple stack types but relies on filename substrings, which is brittle and hard to maintain.
Inside `prepare_stack`, behavior is selected by checking substrings in `outfile` (e.g. `"inputs/ifgramStack.h5"`, `"inputs/ionStack.h5"`). This couples logic to file naming and will silently break if filenames or directory layout change.
Instead, pass an explicit `stack_type` argument (e.g. `"ifgram" | "ion" | "tropo" | "set"`) and branch on that. This makes the function more robust, readable, and easier to extend.
Suggested implementation:
```python
prepare_stack(
outfile=stack_file,
inp_files=input_files,
metadata=metadata,
demFile=inps.dem_file,
bbox=bounds,
date12_list=date12_list,
polarization=pol,
stack_type="ifgram",
)
```
```python
def prepare_stack(
outfile,
inp_files,
metadata,
demFile=None,
bbox=None,
date12_list=None,
polarization=None,
stack_type=None,
):
```
```python
# Determine stack type: prefer explicit stack_type, fall back to legacy
# filename-based inference for backward compatibility.
effective_stack_type = stack_type
if effective_stack_type is None:
if "inputs/ifgramStack.h5" in outfile:
effective_stack_type = "ifgram"
elif "inputs/ionStack.h5" in outfile:
effective_stack_type = "ion"
elif "inputs/tropoStack.h5" in outfile:
effective_stack_type = "tropo"
elif "inputs/setStack.h5" in outfile:
effective_stack_type = "set"
else:
raise ValueError(
f"Unable to infer stack_type from outfile '{outfile}'. "
"Please pass stack_type explicitly (e.g. 'ifgram', 'ion', 'tropo', 'set')."
)
if effective_stack_type == "ifgram":
```
```python
elif effective_stack_type == "ion":
```
```python
elif effective_stack_type == "tropo":
```
```python
elif effective_stack_type == "set":
```
There are likely additional call sites to `prepare_stack` in `src/mintpy/prep_nisar.py` (for ion, tropo, and set stacks) that are not visible in the snippet. To fully adopt the new API and avoid relying on filename inference:
1. Find all remaining `prepare_stack(` calls and add an explicit `stack_type` keyword argument:
- For ionosphere stacks, pass `stack_type="ion"`.
- For troposphere stacks, pass `stack_type="tropo"`.
- For set stacks, pass `stack_type="set"`.
2. If any code outside this file calls `prepare_stack`, consider updating those callers to pass an explicit `stack_type` as well. The current implementation keeps backward compatibility, but explicit types are preferred going forward.
3. If there are additional stack types beyond `ifgram`, `ion`, `tropo`, and `set`, extend the `effective_stack_type` inference block and the branching logic accordingly, and use consistent string identifiers when calling `prepare_stack`.
</issue_to_address>
### Comment 2
<location path="src/mintpy/prep_nisar.py" line_range="718-727" />
<code_context>
+def read_and_interpolate_troposphere(
</code_context>
<issue_to_address>
**suggestion:** The `mask_file` parameter in `read_and_interpolate_troposphere` is unused, which suggests either incomplete masking logic or an unnecessary argument.
`mask_file` is accepted but never used, unlike in `read_and_interpolate_geometry`, so callers may assume masking is happening when it is not.
Please either remove `mask_file` from this function and its callers, or implement masking here in line with `read_and_interpolate_geometry` so the helpers behave consistently.
Suggested implementation:
```python
def read_and_interpolate_troposphere(
gunw_file, dem_file, xybbox, polarization="HH"
):
```
You will need to:
1. Find all call sites of `read_and_interpolate_troposphere` in the codebase.
2. Remove any `mask_file` argument passed to this function, so that calls follow the updated signature:
- Before: `read_and_interpolate_troposphere(gunw_file, dem_file, xybbox, polarization, mask_file)`
- After: `read_and_interpolate_troposphere(gunw_file, dem_file, xybbox, polarization)`
If any callers relied on the presence of `mask_file` for planned behavior, you may instead decide to implement masking (mirroring `read_and_interpolate_geometry`) and reintroduce it with concrete usage.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Removed redundant return statement from the function.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
prepare_stack,meta['FILE_TYPE']is hard-coded to'ifgramStack'even when generating ionosphere, troposphere, or SET stacks; consider setting this dynamically perstack_typeso downstream consumers can distinguish stack types correctly. _read_raster_epsgusesgdal.osr.SpatialReference(), butosris not imported anywhere; switch tofrom osgeo import osrand useosr.SpatialReference()(or otherwise ensure the OSR module is correctly imported) to avoid runtime errors.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `prepare_stack`, `meta['FILE_TYPE']` is hard-coded to `'ifgramStack'` even when generating ionosphere, troposphere, or SET stacks; consider setting this dynamically per `stack_type` so downstream consumers can distinguish stack types correctly.
- `_read_raster_epsg` uses `gdal.osr.SpatialReference()`, but `osr` is not imported anywhere; switch to `from osgeo import osr` and use `osr.SpatialReference()` (or otherwise ensure the OSR module is correctly imported) to avoid runtime errors.
## Individual Comments
### Comment 1
<location path="src/mintpy/prep_nisar.py" line_range="150" />
<code_context>
+ if not projection:
+ raise ValueError(f"Raster has no projection metadata: {path}")
+
+ srs = gdal.osr.SpatialReference()
+ if srs.ImportFromWkt(projection) != 0:
+ raise ValueError(
</code_context>
<issue_to_address>
**issue (bug_risk):** Using gdal.osr.SpatialReference may fail if osr is not exposed under gdal in the installed GDAL bindings.
The Python bindings usually expose `osr` via `from osgeo import osr`, not `gdal.osr`. On many setups `gdal.osr` won’t exist and this will raise an AttributeError at runtime. Import `osr` explicitly (e.g. at the module top) and use `osr.SpatialReference()` here to keep `_read_raster_epsg` working across environments.
</issue_to_address>
### Comment 2
<location path="src/mintpy/prep_nisar.py" line_range="1063" />
<code_context>
}
- # initiate HDF5 file
meta["FILE_TYPE"] = "ifgramStack"
+
writefile.layout_hdf5(outfile, ds_name_dict, metadata=meta)
</code_context>
<issue_to_address>
**issue (bug_risk):** FILE_TYPE is hard-coded to ifgramStack even when preparing ion/tropo/set stacks.
Since `meta["FILE_TYPE"]` is always set to `"ifgramStack"`, stacks written as `ionStack.h5`, `tropoStack.h5`, or `setStack.h5` will carry incorrect metadata, which can break any logic branching on `FILE_TYPE`.
Instead, derive `FILE_TYPE` from `effective_stack_type`, for example:
```pythonile_type_map = {
"ifgram": "ifgramStack",
"ion": "ionStack",
"tropo": "tropoStack",
"set": "setStack",
}
meta["FILE_TYPE"] = file_type_map[effective_stack_type]
```
so the metadata consistently reflects the actual stack type.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
I ran this on a non-redundant stack of 4 nearest-neighbor GUNWs and confirmed that the dataset was loaded. It produced the expected ifgramStack.h5 and geometryGeo.h5, as well as auxiliary outputs including ionStack.h5, setStack.h5, and tropoStack.h5. |
src/mintpy/load_data.py
Outdated
| dem_files = glob.glob(str(dem_file)) | ||
| if len(dem_files) == 0: | ||
| raise FileNotFoundError( | ||
| f'No DEM file found for mintpy.load.demFile: {dem_file}' | ||
| ) | ||
| dem_file = dem_files[0] |
There was a problem hiding this comment.
If the user has provided a dem_file, why would you need to still look for the dem_files, and then choose the first in the list. What is the reason the first one is the correct one? This logic does not make sense to me.
I suggest to trust the user input and avoid looking further
| # --------------------------------------------------------------------- | ||
| # Constants / HDF5 paths (GUNW frequencyA, unwrappedInterferogram) | ||
| # --------------------------------------------------------------------- | ||
| DATASET_ROOT_UNW = "/science/LSAR/GUNW/grids/frequencyA/unwrappedInterferogram" |
There was a problem hiding this comment.
I understand that we usually have frequencyA. But it would have been nice to not limit the user to frequesnyA. In theory one can run GUNW for frequencyB only and have a stack of frequency B interferograms and they should be still able to run mintpy.
| # standalone water mask (MintPy format) | ||
| if getattr(inps, "mask_file", None) not in [None, "None", "auto"]: | ||
| water_mask_file = os.path.join(inps.out_dir, "waterMask.h5") | ||
| prepare_water_mask( |
There was a problem hiding this comment.
I don't see the mask dataset in NISAR GUNW been used. I was wondering if there is a reason to not use this dataset which exists in this path:
"science/LSAR/GUNW/grids/frequencyA/unwrappedInterferogram/mask"
The description of this mask based on GUNW spec is the following:
Combination of water mask and a mask of subswaths of valid samples in the reference RSLC and geometrically-coregistered secondary RSLC. Each pixel value is a three-digit number: the most significant digit represents the water flag of that pixel the reference RSLC, where 1 is water and 0 is non-water; the second digit represents the subswath number of that pixel in the reference RSLC; the least significant digit represents the subswath number of that pixel in the secondary RSLC. A value of 0 in either subswath digit indicates an invalid sample in the corresponding RSLC
Can you write a python script to interpret the mask
As you see the mask is not only water mask but a combination of water mask and subswath mask for reference and secondary RSLCs. The subswath mask is needed to mask out edges of the swath. while for dithered PRF we have one subswath, the mask nicely identifies the valid region of multiple subswaths in constant PRF.
assuming the mask dataset in GUNW, you can extract different binary masks as
water_mask = (mask // 100) == 1
ref_subswath = (mask // 10) % 10
sec_subswath = mask % 10
is_valid = (ref_subswath > 0) & (sec_subswath > 0)
is_land_and_valid = (is_valid) & ~water_mask
Here is a plot of the mask dataset for a frame at coastal region (LA):
And here is a plot of valid region over land (i.e., is_land_and_valid in the code snippet above)
| ) | ||
|
|
||
| y_2d, x_2d = np.meshgrid(ycoord, xcoord, indexing="ij") | ||
| valid_mask = _read_valid_unw_mask(gunw_file, xybbox, polarization) |
There was a problem hiding this comment.
See my comment below about mask dataset in GUNW
Summary
This PR improves the NISAR data-loading workflow in MintPy by tightening input validation in
load_data.pyand expandingprep_nisar.pyto generate additional derived products.What changed
processor=nisarWhy
These changes make the NISAR ingestion path more robust and more complete. They reduce configuration errors, improve failure messages, and produce a richer set of MintPy-ready outputs for downstream analysis.
Testing
prep_nisar.pywrites geometry, interferogram, ionosphere, troposphere, SET, and water mask outputs with the updated workflowSummary by Sourcery
Tighten NISAR ingestion by validating DEM/GUNW inputs, aligning DEM/mask warping to the NISAR grid, and generating additional NISAR-derived MintPy stacks and water masks.
New Features:
Bug Fixes:
Enhancements: