Processing Dataset Definition #118

@kkappler

Description

We have a processing config. The other part of the TFKernel is the dataset that gets fed into the pipeline along with the config.

We need a standard for dataset-specification.

I suggest a table (dataframe) as the container, with a CSV file as a first-cut user interface.

Dataset Specification can be one of two flavors (with others added in future):

  1. Single Station
  2. Remote Reference

In both cases we need to know:

  • Local Station ID: the location at which we are going to estimate the EMTF (sample the earth's conductivity)
  • Local station time intervals of data to be provided for analysis

For Remote Reference we also need:

  • Reference Station ID (can be None, in which case the two flavors collapse into a single definition?)
  • Reference station time intervals of data to be provided for analysis
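As a sketch of the suggested dataframe/CSV container — the column names and station IDs here are purely illustrative, not a settled schema:

```python
import pandas as pd

# Hypothetical rows for dataset_definition.csv. An empty remote_station_id
# marks a single-station row; a non-empty one marks remote reference.
rows = [
    {"station_id": "CAS04", "remote_station_id": "NVR08",
     "start": "2020-06-02T19:00:00", "end": "2020-06-12T00:00:00"},
    {"station_id": "CAS04", "remote_station_id": "",
     "start": "2020-06-13T00:00:00", "end": "2020-06-20T00:00:00"},
]
df = pd.DataFrame(rows)
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# The CSV is the first-cut user interface; the dataframe is the container.
df.to_csv("dataset_definition.csv", index=False)
print(df)
```

One row per (station, interval) pair keeps the container flat, so both flavors fit in one table.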

Specifications:

  • The time intervals for any given station must be disjoint.
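A minimal check for the disjointness constraint could look like this — a sketch, not the proposed implementation:

```python
from typing import List, Tuple
import pandas as pd

def intervals_are_disjoint(intervals: List[Tuple[pd.Timestamp, pd.Timestamp]]) -> bool:
    """Return True if no two (start, end) intervals overlap."""
    ordered = sorted(intervals)  # sort by start time
    for (s0, e0), (s1, e1) in zip(ordered, ordered[1:]):
        if s1 < e0:  # next interval starts before the previous one ends
            return False
    return True

a = (pd.Timestamp("2020-06-02"), pd.Timestamp("2020-06-12"))
b = (pd.Timestamp("2020-06-13"), pd.Timestamp("2020-06-20"))
c = (pd.Timestamp("2020-06-10"), pd.Timestamp("2020-06-15"))
print(intervals_are_disjoint([a, b]))  # True
print(intervals_are_disjoint([a, c]))  # False
```

Sorting first makes the check O(n log n): after the sort, only adjacent pairs can overlap.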

I would specifically like to push the logic that validates the time intervals:

  • data exists,
  • data location is known,
  • RR data is available for all intervals in dataset_definition.csv,
  • etc.

out of the first cut of this class. Those tools can be built separately, and indeed Tim is already making good headway on these validations.
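When those tools do get built, they can live as free functions outside the class. A hypothetical sketch of the third check (RR data available for an interval), assuming an `availability` lookup from station ID to its known data coverage — both the lookup and the station IDs are invented for illustration:

```python
import pandas as pd

# Hypothetical availability table: station ID -> (earliest, latest) data on disk.
availability = {
    "CAS04": (pd.Timestamp("2020-06-01"), pd.Timestamp("2020-07-01")),
    "NVR08": (pd.Timestamp("2020-06-01"), pd.Timestamp("2020-06-10")),
}

def rr_data_available(station_id: str, start: pd.Timestamp, end: pd.Timestamp) -> bool:
    """True if the named reference station has data covering [start, end]."""
    if station_id not in availability:
        return False
    lo, hi = availability[station_id]
    return lo <= start and end <= hi

print(rr_data_available("NVR08", pd.Timestamp("2020-06-02"), pd.Timestamp("2020-06-08")))  # True
print(rr_data_available("NVR08", pd.Timestamp("2020-06-02"), pd.Timestamp("2020-06-12")))  # False
```

Keeping the dataset-definition class as a dumb container and validators as separate functions lets the two evolve independently, as suggested above.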
