Streamlining process mth5 by kkappler · Pull Request #226 · simpeg/aurora

kkappler · 2022-09-29T16:20:43Z

test on v3.9
Review the populating of dataset_df
- [] Loop over survey, station, run rather than just station_id
  NO: because we only have two stations, this is not needed until we have multiple-station processing.
- Move the code to do this into kernel_dataset (if possible)
- Make the "run" column use the hdf5 reference rather than run_object
[] Make a version of decimation that doesn't drop out and back into xarray
NO: see issue xr-scipy #227; xarray does not seem to support this properly yet.
However, I did create two pure-xarray implementations of the decimation, one using coarsen() and one using resample(), if for no other reason than to show the syntax. Running these on the synthetic data produced reasonable results, but each decimated sample is essentially the mean of the decimation_factor samples around the time point. This is a very weak form of anti-alias filtering (AAF), and I don't trust it yet. resample() is generally very slow, whereas coarsen() seems fairly fast, and both do essentially the same thing.
These functions are called prototype_decimate_2 and prototype_decimate_3, and they live in time_series_helpers.py.
Move sample_rate updating into the run_ts xarray object metadata dictionary (rather than modifying in-place the run_object)
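
For reference, the coarsen() approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in time_series_helpers.py: the function name `decimate_by_block_mean`, the `sample_rate` attribute key, and the synthetic input are all assumptions made for the example. It averages non-overlapping blocks of `decimation_factor` samples and records the new sample rate in the xarray attrs rather than on a run object.

```python
import numpy as np
import xarray as xr


def decimate_by_block_mean(da, decimation_factor, dim="time"):
    """Hypothetical helper: average non-overlapping blocks of samples.

    Each output sample is the mean of `decimation_factor` input samples,
    i.e. a boxcar filter followed by downsampling (weak anti-aliasing).
    """
    decimated = da.coarsen({dim: decimation_factor}, boundary="trim").mean()
    # Carry the metadata forward explicitly and update the sample rate
    # on the xarray object itself (the attribute key is an assumption).
    decimated.attrs = dict(da.attrs)
    decimated.attrs["sample_rate"] = da.attrs["sample_rate"] / decimation_factor
    return decimated


# Synthetic input: 64 samples at 8 Hz, time coordinate in seconds.
sample_rate = 8.0
n_samples = 64
data = np.arange(n_samples, dtype=float)
time = np.arange(n_samples) / sample_rate
run_ts = xr.DataArray(
    data,
    coords={"time": time},
    dims="time",
    attrs={"sample_rate": sample_rate},
)

decimation_factor = 4
decimated = decimate_by_block_mean(run_ts, decimation_factor)
```

A block mean only mildly attenuates energy above the new Nyquist frequency, which is why a dedicated low-pass filter before downsampling is usually preferred for production use.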

Towards issue #223, Move the method that fills out the Kernel Dataset df into Kernel Dataset. [Issue(s): #223]

codecov · 2022-09-29T16:37:26Z

Codecov Report

Merging #226 (18755f9) into main (3a45e3a) will decrease coverage by 0.27%.
The diff coverage is 80.48%.

@@            Coverage Diff             @@
##             main     #226      +/-   ##
==========================================
- Coverage   78.03%   77.75%   -0.28%     
==========================================
  Files         101      101              
  Lines        5432     5454      +22     
==========================================
+ Hits         4239     4241       +2     
- Misses       1193     1213      +20

| Impacted Files | Coverage Δ | |
|---|---|---|
| aurora/config/metadata/decimation.py | `100.00% <ø> (+8.33%)` | ⬆️ |
| aurora/pipelines/time_series_helpers.py | `73.88% <48.00%> (-13.01%)` | ⬇️ |
| tests/time_series/test_windowing_scheme.py | `90.17% <75.00%> (-1.17%)` | ⬇️ |
| aurora/pipelines/process_mth5.py | `97.60% <91.66%> (-0.36%)` | ⬇️ |
| aurora/pipelines/run_summary.py | `88.88% <100.00%> (+0.31%)` | ⬆️ |
| aurora/transfer_function/kernel_dataset.py | `80.00% <100.00%> (+3.12%)` | ⬆️ |
| tests/parkfield/test_process_parkfield_run_rr.py | `94.44% <100.00%> (ø)` | |
| tests/synthetic/test_stft_methods_agree.py | `95.23% <100.00%> (+0.11%)` | ⬆️ |


- Modify populate_dataset_df so that get_run_run_ts_from_mth5 is deprecated there.
- Pack a "run_reference" column in the kernel_dataset dataframe, rather than putting the run_object directly in the dataframe.
- Remove the "run" column from the kernel_dataset dataframe and replace it with the method get_run_object(index_or_row).
- Deprecate the get_run_run_ts method in time_series_helpers.
[Issue(s): #223]
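
The idea of keeping a lightweight reference in the dataframe and resolving it on demand might look roughly like the sketch below. Everything here is hypothetical: `KernelDatasetSketch`, the dict standing in for an open MTH5 file, and the path-like reference strings are for illustration only, and the actual `get_run_object` in aurora may differ.

```python
import pandas as pd


class KernelDatasetSketch:
    """Hypothetical stand-in for a kernel dataset; not aurora's class."""

    def __init__(self, df, store):
        self.df = df
        # `store` plays the role of an open HDF5 file: it maps a
        # reference to the run object it points at (an assumption).
        self._store = store

    def get_run_object(self, index_or_row):
        """Resolve the run for a dataframe index label or a row."""
        if isinstance(index_or_row, pd.Series):
            row = index_or_row
        else:
            row = self.df.loc[index_or_row]
        return self._store[row["run_reference"]]


# Stand-in "file" holding two runs, keyed by a path-like reference.
store = {
    "/test1/001": {"station": "test1", "run": "001"},
    "/test2/001": {"station": "test2", "run": "001"},
}

# The dataframe holds only references, not the run objects themselves.
df = pd.DataFrame(
    {
        "station_id": ["test1", "test2"],
        "run_id": ["001", "001"],
        "run_reference": ["/test1/001", "/test2/001"],
    }
)

dataset = KernelDatasetSketch(df, store)
run_by_index = dataset.get_run_object(0)
run_by_row = dataset.get_run_object(df.iloc[1])
```

Storing only the reference keeps the dataframe cheap to copy and print, and defers touching the file until a run is actually needed.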


kkappler · 2022-10-07T23:53:27Z

The minor decrease in code coverage is due to new resampling methods that have been developed and bench-tested but are not yet in the testing framework.

Because the IRIS MT Short Course is approaching, this PR merge, apart from any bug fixes still needed, will serve as the workshop branch and will be either released or at least tagged next Friday, 14 October.

kkappler added 2 commits September 29, 2022 08:20

test on v3.9

06a3ab7

Add add_columns_for_processing to kernel_dataset

e84acab

Towards issue #223, Move the method that fills out the Kernel Dataset df into Kernel Dataset. [Issue(s): #223]

kkappler added 12 commits September 30, 2022 09:25

Fix bug

8a26cdc

There was a leftover dependency on the "run" column of tkf_dataset dataframe. Replaced with a call to self.get_run_object(). Also removed emtfxml_test.xml [Issue(s):

add a note about test to be deprecated

9878399

minor changes

38dffc4

minor changes

a15930a

update description, and remove incorrectly named, and unused function…

21e3784

… decimated_sample_rate

minor doc changes

b24195e

update docstrings, change of variable name

1ebefa9

Move populating dataset_df into KernelDataset

e333713

[Issue(s): #223]

add examples of pure xarray decimation, #223, #227

97477b9

add all py versions into test yml, and push to test on updated mt_met…

1034165

…adata and mth5

add a fix to run summary so it ignores rows of channel_summary associ…

18755f9

…ated with TFs

kkappler merged commit eaa17cf into main Oct 7, 2022

kkappler deleted the fix_issue_223 branch April 1, 2023 21:38

kkappler merged 14 commits into main from fix_issue_223
