Streamlining process mth5#226

Merged
kkappler merged 14 commits into main from fix_issue_223 on Oct 7, 2022

@kkappler kkappler commented Sep 29, 2022

  • test on v3.9

  • Review the populating of dataset_df

    • [] Loop over survey, station, run rather than just station_id
      NO: with only two stations this is not needed until multiple-station processing is supported.
    • Move the code to do this into kernel_dataset (if possible)
    • Make the "run" column use the hdf5 reference rather than run_object
  • [] Make a version of decimation that doesn't drop out of xarray and back in
    NO: see issue xr-scipy #227; xarray does not seem to support this properly yet.
    However, I did create two pure-xarray implementations of the decimation, one using coarsen() and one using resample(), if for no other reason than to show the syntax. Running these on the synthetic data produced reasonable results, but each decimated sample is essentially the mean of the decimation_factor samples around the time point. This is a very weak form of anti-alias filtering (AAF), and I don't trust it yet. resample() is in general very slow, whereas coarsen() seems fairly fast, and both do essentially the same thing.
    These functions are called prototype_decimate_2 and prototype_decimate_3, and are in time_series_helpers.py.

  • Move sample_rate updating into the run_ts xarray object's metadata dictionary (rather than modifying the run_object in place)
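
To make the two decimation points above concrete, here is a minimal stdlib-only sketch, not the code in time_series_helpers.py. It assumes that a block mean is what a call such as `da.coarsen(time=q, boundary="trim").mean()` effectively computes; the function names `block_mean_decimate` and `boxcar_gain` and the `attrs` dict are hypothetical. It returns an updated copy of the metadata instead of mutating anything in place, and evaluates the standard moving-average amplitude response at the post-decimation Nyquist frequency to show how little a block mean attenuates there.

```python
import math


def block_mean_decimate(samples, attrs, decimation_factor):
    """Block-average `samples` and return (decimated, new_attrs).

    Neither the input list nor the input metadata dict is modified.
    """
    n_blocks = len(samples) // decimation_factor  # a trailing partial block is dropped
    decimated = [
        sum(samples[i * decimation_factor:(i + 1) * decimation_factor]) / decimation_factor
        for i in range(n_blocks)
    ]
    # Record the new sample rate in a copy of the metadata, not on the source object.
    new_attrs = {**attrs, "sample_rate": attrs["sample_rate"] / decimation_factor}
    return decimated, new_attrs


def boxcar_gain(frequency, sample_rate, n_points):
    """Amplitude response of an n_points moving average at `frequency` (Hz)."""
    x = math.pi * frequency / sample_rate
    if math.isclose(math.sin(x), 0.0, abs_tol=1e-12):
        return 1.0  # limit of the ratio at DC and at multiples of the sample rate
    return abs(math.sin(n_points * x) / (n_points * math.sin(x)))


attrs = {"sample_rate": 1.0}  # hypothetical metadata dict, in Hz
decimation_factor = 4

decimated, new_attrs = block_mean_decimate(list(range(10)), attrs, decimation_factor)
new_nyquist = new_attrs["sample_rate"] / 2
gain_at_new_nyquist = boxcar_gain(new_nyquist, attrs["sample_rate"], decimation_factor)

print(decimated)                      # [1.5, 5.5]
print(new_attrs["sample_rate"])       # 0.25
print(round(gain_at_new_nyquist, 3))  # 0.653
```

A gain of about 0.65 (roughly -3.7 dB) at the new Nyquist frequency means energy just above it is only mildly attenuated before it aliases, which is consistent with treating the block mean as a weak anti-alias filter.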

Towards issue #223:
move the method that fills out the Kernel Dataset df
into Kernel Dataset.

[Issue(s): #223]
codecov bot commented Sep 29, 2022

Codecov Report

Merging #226 (18755f9) into main (3a45e3a) will decrease coverage by 0.27%.
The diff coverage is 80.48%.

@@            Coverage Diff             @@
##             main     #226      +/-   ##
==========================================
- Coverage   78.03%   77.75%   -0.28%     
==========================================
  Files         101      101              
  Lines        5432     5454      +22     
==========================================
+ Hits         4239     4241       +2     
- Misses       1193     1213      +20     
Impacted Files Coverage Δ
aurora/config/metadata/decimation.py 100.00% <ø> (+8.33%) ⬆️
aurora/pipelines/time_series_helpers.py 73.88% <48.00%> (-13.01%) ⬇️
tests/time_series/test_windowing_scheme.py 90.17% <75.00%> (-1.17%) ⬇️
aurora/pipelines/process_mth5.py 97.60% <91.66%> (-0.36%) ⬇️
aurora/pipelines/run_summary.py 88.88% <100.00%> (+0.31%) ⬆️
aurora/transfer_function/kernel_dataset.py 80.00% <100.00%> (+3.12%) ⬆️
tests/parkfield/test_process_parkfield_run_rr.py 94.44% <100.00%> (ø)
tests/synthetic/test_stft_methods_agree.py 95.23% <100.00%> (+0.11%) ⬆️


- Modify populate_dataset_df so that get_run_run_ts_from_mth5 is
deprecated there.
- pack "run_reference" column of kernel_dataset dataframe,
rather than putting the run_object directly in the dataframe
- remove "run" column from kernel_dataset dataframe
- replace with method get_run_object(index_or_row)
- deprecate get_run_run_ts method in time_series_helpers

[Issue(s): #223]
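
The "run_reference" change in this commit can be pictured with the following stdlib-only sketch. It is an assumption-laden illustration, not aurora's implementation: `Run`, `KernelDatasetSketch`, the plain dict standing in for the open HDF5 file, and the list of dicts standing in for the dataframe are all hypothetical. Only the column name "run_reference" and the method name get_run_object(index_or_row) come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical stand-in for a heavyweight run object held by an open file."""
    station_id: str
    run_id: str


class KernelDatasetSketch:
    """Illustrative only: rows carry a 'run_reference', never the run object itself."""

    def __init__(self, store, rows):
        self.store = store  # maps reference -> Run (stands in for the HDF5 file)
        self.rows = rows    # list of dicts (stands in for the dataframe)

    def get_run_object(self, index_or_row):
        """Resolve a row, or an integer row index, to its run object."""
        row = self.rows[index_or_row] if isinstance(index_or_row, int) else index_or_row
        return self.store[row["run_reference"]]


store = {"ref-0": Run("test1", "001"), "ref-1": Run("test2", "001")}
rows = [
    {"station_id": "test1", "run_id": "001", "run_reference": "ref-0"},
    {"station_id": "test2", "run_id": "001", "run_reference": "ref-1"},
]
dataset = KernelDatasetSketch(store, rows)

by_index = dataset.get_run_object(1)
by_row = dataset.get_run_object(rows[0])
print(by_index.station_id, by_row.station_id)  # test2 test1
```

Keeping only a lightweight reference in each row means the table can be copied or filtered without holding on to file-backed objects, and the run object is looked up only when it is needed.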
There was a leftover dependency on the "run" column of tkf_dataset
dataframe.  Replaced with a call to self.get_run_object()

Also removed emtfxml_test.xml

[Issue(s):

kkappler commented Oct 7, 2022

The minor decrease in code coverage is due to new resampling methods that have been developed and bench-tested but are not yet in the testing framework.

Because of the approaching IRIS MT Short Course, this PR merge will be the workshop branch, apart from any bugs that still need to be fixed, and it will be either released or at least tagged next Friday, 14 October.

@kkappler kkappler merged commit eaa17cf into main Oct 7, 2022
@kkappler kkappler deleted the fix_issue_223 branch April 1, 2023 21:38