Option for running workflow in test mode #112

Closed
8 tasks done
gplssm opened this issue Feb 10, 2021 · 14 comments · Fixed by #118 or #159
Labels: 🚀 feature New feature or feature request · 🧪 testing Tests and CI

gplssm (Contributor) commented Feb 10, 2021

Running the workflow with the dataset for all of Germany takes too long for effective testing, so a test mode that uses a subset of this data (e.g. a single federal state) is needed. Activating this test mode instructs the workflow to run only on this subset of the data.

Tasks

  • Implement a switch for test mode that can be toggled from the UI and CLI
  • Download and import the respective subset of OSM data
  • Download and import the respective subset of VG250 data (for federal states and below)
  • Select subset from DemandRegio
  • Download and import the respective subset of Zensus data
  • Add to docs: authors of new datasets are responsible for providing a cutting routine for test mode
  • Update README accordingly https://github.com/openego/eGon-data/blob/e53c7c292ef39dc81133321f5679527f23fcd81e/README.rst#test-mode
  • Add a note in CHANGELOG
gplssm added the 🚀 feature and 🧪 testing labels on Feb 10, 2021
gplssm (Contributor, Author) commented Feb 10, 2021

The switch for choosing the test mode can best be implemented using parameters passed to the DAG, as described on StackOverflow; a sketch is below.
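
For illustration, a minimal sketch of that approach, assuming Airflow 1.10.x as used at the time; the DAG id, task id, and the "testmode" key are made up, not taken from eGon-data:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

dag = DAG("testmode_example", start_date=datetime(2021, 2, 1), schedule_interval=None)


def run_task(**context):
    # Read a flag passed at trigger time, e.g.:
    #   airflow trigger_dag testmode_example -c '{"testmode": true}'
    conf = context["dag_run"].conf or {}
    testmode = conf.get("testmode", False)
    print(f"Running in test mode: {testmode}")


task = PythonOperator(
    task_id="example_task",
    python_callable=run_task,
    provide_context=True,  # Airflow 1.x: hand the context (incl. dag_run) to the callable
    dag=dag,
)
```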

gplssm (Contributor, Author) commented Feb 12, 2021

In the last eGon-data web conference we decided to use a separate database for running the workflow on test data. This decision was prepared by a discussion between @IlkaCu and @gnn.
As this issue is intended to provide a solution for using the test mode, it overlaps heavily with this decision/discussion.

@ClaraBuettner and @gplssm decided to wait here until we have moved forward with setting up this separate database with test data.
@IlkaCu can we maybe help in setting up this second database?

gplssm (Contributor, Author) commented Feb 22, 2021

Since we decided today that choosing a database and choosing the test mode are linked, default_args seemed like a good way to go to me. But then I read that kwargs in default_args are kind of deprecated and won't be supported in the future.

I would like to avoid using parameters as proposed in #112 (comment). As I explained above, we already chose to enable the test mode via a CLI flag, so we don't need to state this twice by additionally passing Airflow parameters.

I simply suggest introducing an additional argument for each function that needs to know whether a run was triggered in test mode. It might make sense to use a global variable for that in pipeline.py; see the sketch below.
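
A minimal sketch of that suggestion; the function name, URLs, and the TESTMODE global are illustrative, not the actual eGon-data code:

```python
# pipeline.py (sketch)
TESTMODE = False  # set once, e.g. from the CLI flag, before the DAG is built


def import_vg250(testmode=None):
    """Download and import VG250 data, cut to a subset in test mode."""
    if testmode is None:
        testmode = TESTMODE  # fall back to the global switch
    url = (
        "https://example.org/vg250_schleswig_holstein.zip"  # hypothetical subset URL
        if testmode
        else "https://example.org/vg250_germany.zip"  # hypothetical full URL
    )
    print(f"Importing from {url}")
```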

ClaraBuettner (Contributor) commented

I think that's fine. We need to update every function to use the test case, so adding an argument doesn't seem to be much extra effort.

And I wrote some lines that create a subset of vg250 which I would like to push. Can I simply create a branch for this issue which we use to implement the test case for all datasets?

gplssm (Contributor, Author) commented Feb 23, 2021

I think that's fine. We need to update every function to use the test case, so adding an argument doesn't seem to be much extra effort.

After my last comment I thought about making use of the new, decentralized dataset.ymls. We could specify a url_testmode in addition to the normal url; I sketched the idea below. But I need to know some more details about the decentralized dataset.yml.
Meanwhile, I'll use a temporary approach, which I'll show by pushing it.
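
A minimal sketch of that url_testmode idea, assuming a YAML file with both keys; the file layout, key names, and URLs are just my assumption:

```python
# Assumed dataset.yml contents:
#   vg250:
#     url: https://example.org/vg250_germany.zip
#     url_testmode: https://example.org/vg250_schleswig_holstein.zip

import yaml  # provided by the PyYAML package


def source_url(dataset, testmode=False, path="dataset.yml"):
    """Return the download URL for a dataset, honouring test mode."""
    with open(path) as f:
        config = yaml.safe_load(f)
    key = "url_testmode" if testmode else "url"
    return config[dataset][key]


# Usage: source_url("vg250", testmode=True)
```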

And I wrote some lines that create a subset of vg250 which I would like to push. Can I simply create a branch for this issue which we use to implement the test case for all datasets?

Please feel free to create a branch and push your code 🚀

ClaraBuettner (Contributor) commented

Here is the branch with my commits: https://github.com/openego/eGon-data/tree/features/%23112-option-testmode
The switch for the test case is currently an argument of the 'to_postgis' function. I wasn't able to set it in pipeline.py; that resulted in many errors when running the DAG.
On Monday we discussed the decentralization of datasets.yml; I think it will be possible to add another entry called url_testmode.

gplssm (Contributor, Author) commented Feb 23, 2021

The switch for the test case is currently an argument of the 'to_postgis' function. I wasn't able to set it in pipeline.py; that resulted in many errors when running the DAG.

You mean passing the parameter testcase to the to_postgres() function when it is called while running the DAG in pipeline.py?
Then use op_args of https://airflow.apache.org/docs/apache-airflow/1.10.14/_api/airflow/operators/python_operator/index.html#airflow.operators.python_operator.PythonOperator, for example as sketched below.
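
A minimal sketch of the op_args route, with a stand-in callable and DAG rather than the actual eGon-data ones:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def import_data(testcase):
    """Stand-in for the import function discussed above."""
    print(f"Importing data (test case: {testcase})")


dag = DAG("op_args_example", start_date=datetime(2021, 2, 1), schedule_interval=None)

import_task = PythonOperator(
    task_id="import_data",
    python_callable=import_data,
    op_args=[True],  # forwards the test-case flag positionally to the callable
    dag=dag,
)
```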

On Monday we discussed the decentralization of datasets.yml; I think it will be possible to add another entry called url_testmode.

👍

gplssm (Contributor, Author) commented Feb 23, 2021

We should discuss conceptually whether we want testmode to be a bool. At first I definitely thought so. Now I'm considering a smaller variant of the SH test case for CI, and it would be nice to use the same code for it as here...

gnn (Collaborator) commented Feb 23, 2021

I'm more than a bit against sprinkling such an option everywhere because it looks like noise instead of useful information. As I said on Monday, this rather looks like something which should be placed in a configuration file, or in a table storing configuration information in the database. Maybe we can talk about this tomorrow.
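
For illustration, a minimal sketch of the database-backed variant; the table name, columns, and connection string are made up, not an existing eGon-data schema:

```python
import sqlalchemy as sa

engine = sa.create_engine("postgresql://user:password@localhost/egon")  # placeholder DSN


def get_setting(key):
    """Read a single configuration value from a hypothetical settings table."""
    with engine.connect() as connection:
        row = connection.execute(
            sa.text("SELECT value FROM egon_settings WHERE key = :key"),
            {"key": key},
        ).fetchone()
    return row[0] if row else None


# Usage: tasks query the switch instead of receiving it as an argument.
testmode = get_setting("testmode") == "true"
```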

ClaraBuettner (Contributor) commented

The script of @gnn which cuts the zensus data is integrated. I decided not to delete the reduced files, to save the computation time of re-cutting on subsequent runs. If you don't like this, I can also change it again. The overall execution time (including cutting the zensus data) is ~25 min.

I also replaced the testmode variable with a string variable called dataset. To run the workflow on the whole dataset, it has to be set to dataset = 'main'. I can easily change this name; I just didn't have a better idea.
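
A minimal sketch of how such a string-valued switch could be used; the non-'main' values and the filter column are assumptions:

```python
DATASET = "main"  # full run for Germany; e.g. "SH" could select a test subset


def boundary_filter(dataset=None):
    """Return an SQL WHERE clause restricting the run to the chosen subset."""
    if dataset is None:
        dataset = DATASET
    if dataset == "main":
        return None  # no filter: process all of Germany
    return f"federal_state = '{dataset}'"  # e.g. restrict to Schleswig-Holstein
```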

ClaraBuettner (Contributor) commented

The NEP input data is now also available in test mode. I decided to scale district heating capacities by population share, because the demand is linked more to population than to area.
I also tested it and it worked. Without cutting the zensus data, the overall computation time is ~13 min.
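
For illustration, the population-share scaling boils down to something like this; all numbers are made up:

```python
population_subset = 2_900_000    # e.g. inhabitants of the test region
population_germany = 83_200_000  # total population of Germany
capacity_germany_mw = 1_000.0    # hypothetical national district heating capacity

share = population_subset / population_germany
capacity_subset_mw = capacity_germany_mw * share
print(f"Scaled capacity: {capacity_subset_mw:.1f} MW")  # -> 34.9 MW
```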

gplssm (Contributor, Author) commented Mar 5, 2021

I'll double-check that it runs successfully on my side. Afterwards, I'll do the remaining open tasks above and create another issue for the second-database stuff that @gnn is working on.

gplssm (Contributor, Author) commented Mar 5, 2021

The test mode, including all cutting, took 29 minutes for me. But I think I already had most of the data on my disk, so the download tasks were mostly skipped.

gplssm added a commit that referenced this issue Mar 5, 2021
gplssm (Contributor, Author) commented Mar 5, 2021

I'm gonna reopen this for @gnn...
