Enable bandwidths for scalars and timeseries #52

Open
henhuy opened this issue May 5, 2022 · 8 comments

henhuy commented May 5, 2022

A new OEDatamodel version shall be developed, which allows bandwidths for scalars and timeseries per scenario.
IMO this needs three implementation changes:

  1. value in scalars must be changed to hold bandwidths (allowing fixed, discrete and continuous values)
  2. multiple timeseries for the same region/tech/etc. are allowed, but must be distinguishable by the user
  3. a concrete instance of the scenario, using fixed values and fixed timeseries taken from the scenario with bandwidths, must be built

Possible implementations can be discussed here.

Author

henhuy commented May 5, 2022

My suggestion for task one (bandwidths in scalars) is to change the type of column value (now float) to list-of-floats and to add a column bandwidth_type (or a different name) of type string, containing one of fixed/discrete/continuous.
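
A minimal sketch of how such a row could be validated, assuming the column is spelled bandwidth_type and assuming a continuous bandwidth is stored as [lower, upper]. The class `Scalar`, its field `name` and the method `contains` are hypothetical and not part of the OEDatamodel.

```python
from dataclasses import dataclass
from typing import List

# Allowed values of the proposed bandwidth_type column.
BANDWIDTH_TYPES = {"fixed", "discrete", "continuous"}


@dataclass
class Scalar:
    """Hypothetical in-memory view of one row of the scalars table."""

    name: str
    value: List[float]
    bandwidth_type: str

    def __post_init__(self) -> None:
        if self.bandwidth_type not in BANDWIDTH_TYPES:
            raise ValueError(f"unknown bandwidth_type: {self.bandwidth_type!r}")
        if not self.value:
            raise ValueError("value must contain at least one float")
        if self.bandwidth_type == "fixed" and len(self.value) != 1:
            raise ValueError("a fixed scalar holds exactly one value")
        # Assumption: a continuous bandwidth is given as [lower, upper].
        if self.bandwidth_type == "continuous":
            if len(self.value) != 2 or self.value[0] > self.value[1]:
                raise ValueError("continuous expects [lower, upper]")

    def contains(self, candidate: float) -> bool:
        """Check whether candidate is a valid value within this bandwidth."""
        if self.bandwidth_type == "continuous":
            lower, upper = self.value
            return lower <= candidate <= upper
        # fixed and discrete: the candidate must be one of the listed values.
        return candidate in self.value
```

For example, `Scalar("capex", [800.0, 1200.0], "continuous").contains(1000.0)` is true, while a fixed scalar with two values is rejected on construction.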

Author

henhuy commented May 5, 2022

Regarding task two, we could use the tag column to describe each timeseries.
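
A small sketch of a check that tags actually make timeseries distinguishable. The column names `region` and `technology`, the sample rows and the helper `ambiguous_keys` are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical timeseries rows: same region and technology, different tags.
timeseries_rows: List[Dict[str, str]] = [
    {"region": "DE", "technology": "wind_onshore", "tag": "weather year 2010"},
    {"region": "DE", "technology": "wind_onshore", "tag": "weather year 2012"},
]


def ambiguous_keys(rows: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (region, technology, tag) combinations that occur more than once."""
    counts = Counter((r["region"], r["technology"], r["tag"]) for r in rows)
    return [key for key, n in counts.items() if n > 1]
```

An empty result means every timeseries can be told apart by its tag; any returned key points to rows a user could not distinguish.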

Author

henhuy commented May 5, 2022

Task three could be implemented by adding three new tables, scenario_instance, scalar_instances and timeseries_instances (search for better names!), with the following functions:

  • scenario_instance simply contains a PK and an FK to the instantiated scenario
  • scalar_instances contains the columns PK, FK to scenario_instance, FK to scalar (with bandwidths) and value (float), which holds a valid value within the bandwidth of the related scalar entry
  • timeseries_instances contains a PK, an FK to scenario_instance and an FK to timeseries

From those tables all needed information for one reference scenario can be gathered...
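
The three tables could be sketched as follows, using an in-memory SQLite database purely for illustration. The parent tables `scenario`, `scalar` and `timeseries` are reduced to hypothetical minimal columns (`parameter`, `lower_bound`, `upper_bound`, `tag`), and the schema does not itself enforce that `value` lies within the bandwidth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Simplified stand-ins for the existing tables (assumed columns).
CREATE TABLE scenario (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE scalar (
    id INTEGER PRIMARY KEY,
    scenario_id INTEGER NOT NULL REFERENCES scenario(id),
    parameter TEXT NOT NULL,
    lower_bound REAL NOT NULL,
    upper_bound REAL NOT NULL
);
CREATE TABLE timeseries (
    id INTEGER PRIMARY KEY,
    scenario_id INTEGER NOT NULL REFERENCES scenario(id),
    tag TEXT
);
-- The three proposed instance tables.
CREATE TABLE scenario_instance (
    id INTEGER PRIMARY KEY,
    scenario_id INTEGER NOT NULL REFERENCES scenario(id)
);
CREATE TABLE scalar_instances (
    id INTEGER PRIMARY KEY,
    scenario_instance_id INTEGER NOT NULL REFERENCES scenario_instance(id),
    scalar_id INTEGER NOT NULL REFERENCES scalar(id),
    value REAL NOT NULL
);
CREATE TABLE timeseries_instances (
    id INTEGER PRIMARY KEY,
    scenario_instance_id INTEGER NOT NULL REFERENCES scenario_instance(id),
    timeseries_id INTEGER NOT NULL REFERENCES timeseries(id)
);
""")

# One scenario with a bandwidth, and one concrete instance of it.
conn.execute("INSERT INTO scenario VALUES (1, 'base')")
conn.execute("INSERT INTO scalar VALUES (1, 1, 'capex', 800.0, 1200.0)")
conn.execute("INSERT INTO timeseries VALUES (1, 1, 'weather year 2012')")
conn.execute("INSERT INTO scenario_instance VALUES (1, 1)")
conn.execute("INSERT INTO scalar_instances VALUES (1, 1, 1, 1000.0)")
conn.execute("INSERT INTO timeseries_instances VALUES (1, 1, 1)")

# Gather the fixed scalar values of instance 1.
rows = conn.execute(
    """
    SELECT s.parameter, si.value
    FROM scalar_instances AS si
    JOIN scalar AS s ON s.id = si.scalar_id
    WHERE si.scenario_instance_id = ?
    """,
    (1,),
).fetchall()
```

A similar join from timeseries_instances to timeseries would return the timeseries selected for the same instance.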

henhuy self-assigned this May 5, 2022
henhuy added the enhancement (New feature or request) label May 5, 2022

jnnr commented May 23, 2022

Great initiative!
Regarding 1., I suggest allowing fixed scalars, lists of scalars and distributions. Distributions can be specified by a finite number of parameters, e.g. a uniform distribution between a lower and an upper bound, a Gaussian characterized by mean and variance, etc. Together with a sampling method, this is enough information to produce a concrete scenario instance.

Similarly for 2., but for timeseries you would in most cases have a fixed timeseries or a list of timeseries to sample from.

Member

chrwm commented May 30, 2022

Hi @jnnr
Regarding 1., @srhbrnds, @henhuy and I discussed your suggestion to "specify distributions by a finite number of parameters" and were wondering what the use case could be. Do I understand you correctly that, in essence, it wouldn't change henhuy's suggestion other than allowing more values of type string besides fixed/discrete/continuous, for example Gaussian? I suppose in that case it would need to be defined somewhere in which order mean, variance, etc. are given.

jnnr commented May 30, 2022

Hi @chrwm!
The use case would be similar to the use case of this feature in general: support the creation of several concrete scenarios by specifying some higher-level parameters. This can be applied in scenario comparison and sensitivity analysis.

You are right, this is an extension of @henhuy's proposal. In fact, I wonder what "discrete" or "continuous" should mean without defining some distribution to sample from.

To make it a bit more concrete:

  • fixed: pass a single value
  • list: pass a list of values
  • continuous: pass parameters of a distribution (in the correct order), e.g.
    • uniform: lower_bound, upper_bound
    • gaussian: mean, variance
    • etc.

Update: replaced "discrete" with "list", as you could think of discrete distributions as well. "list" implies that all values are equally likely to be drawn.
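
A minimal sketch of how a concrete value could be drawn for each of the three kinds, using only the Python standard library. The function `sample`, its signature and the string keys are hypothetical; the parameter order (lower_bound, upper_bound and mean, variance) follows the list above.

```python
import math
import random
from typing import Optional, Sequence


def sample(
    kind: str,
    params: Sequence[float],
    rng: random.Random,
    distribution: Optional[str] = None,
) -> float:
    """Draw one concrete value for a scalar of the given kind."""
    if kind == "fixed":
        (value,) = params  # exactly one value expected
        return value
    if kind == "list":
        # All listed values are equally likely to be drawn.
        return rng.choice(list(params))
    if kind == "continuous":
        if distribution == "uniform":
            lower_bound, upper_bound = params
            return rng.uniform(lower_bound, upper_bound)
        if distribution == "gaussian":
            mean, variance = params
            # random.gauss expects the standard deviation, not the variance.
            return rng.gauss(mean, math.sqrt(variance))
        raise ValueError(f"unknown distribution: {distribution!r}")
    raise ValueError(f"unknown kind: {kind!r}")
```

Passing a seeded `random.Random` instance, e.g. `sample("continuous", [800.0, 1200.0], random.Random(42), "uniform")`, keeps the generated scenario instances reproducible.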

Member

chrwm commented Jul 28, 2022

The bandwidth_types will be based on NetCDF conventions and extended with custom conventions from the SEDOS project. Your proposals and definitions can be included as the latter.

Member

chrwm commented Jul 28, 2022

@chrwm from 2022-07-28 AP1 meeting:

  • if many timeseries per CSV table cause performance issues, consider ANSI encoding instead of UTF-8
  • in that case, make sure encoding errors are handled by a script
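
A possible shape for such a script-based check, assuming "ANSI" refers to Windows-1252 (Python codec `cp1252`). The helpers `decode_csv` and `read_rows` are hypothetical; they try strict UTF-8 first and fall back to cp1252, so undecodable bytes surface as an error instead of silently corrupting values.

```python
import csv
import io
from typing import List, Tuple


def decode_csv(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding), trying strict UTF-8 before cp1252."""
    for encoding in ("utf-8", "cp1252"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("file is neither valid UTF-8 nor valid cp1252")


def read_rows(raw: bytes) -> List[List[str]]:
    """Decode the raw CSV bytes and parse them into rows of strings."""
    text, _ = decode_csv(raw)
    return list(csv.reader(io.StringIO(text)))
```

Trying UTF-8 first matters: UTF-8 bytes usually also decode under cp1252 without an error, but produce garbled characters.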

chrwm self-assigned this Jul 28, 2022