Enable batch run of trim_results option #505
Addresses issue #504.
I have a few comments centered around two suggestions: 1) splitting trim_results into two arguments, a bool and an int, and 2) updating the logic for applying year intervals.
docs/config_readable.yml
Outdated
trim_results: (string) Report limited results variables to reduce
  results file size. String indicates whether longer-than-annual
  reporting interval should also be used ('all_yrs' reports
  annually). Allowed values are {all_yrs, every_five_yrs, every_other_yr,
  every_ten_yrs, null}. Default null
Seems like turning on this argument does two things: 1) trims out variables, and 2) optionally reduces the number of years reported. It might be better to break it into two arguments to align with this, but also to enable a more flexible way of trimming the reported years:
* A new argument trim_results_interval (int) that takes the interval on which to report. 1 would not do anything, 2 --> every_other_yr, 5 --> every_five_yrs, etc. This would make the implementation in gen_trim_yrs cleaner too.
* Keep trim_results (bool) but change it to a boolean that either trims those variables or not. This also gives more control for someone who may want to limit the number of years but still wants all the output variables. Default it to False.
Nice idea. Here is how I've organized it in the YAML:
change_yr_interval:
  type: ["integer", "null"]
  default: null
  description: Change reporting to every N years, where N is the integer specified in this argument.
trim_vars:
  type: boolean
  default: false
  description: If true, a reduced set of only essential variables is reported.
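As a rough illustration of what the two settings above imply once a config has been loaded, the sketch below checks them in plain Python. The function name check_trim_opts and the plain-dict input are hypothetical, and the "at least 1" rule is an assumption that the YAML above does not state; this is not the project's actual validation code.

```python
def check_trim_opts(opts):
    """Validate the two reporting options in an already-loaded config dict.

    Hypothetical helper: the name, the dict input, and the lower bound of 1
    are assumptions made for illustration only.
    """
    trim_vars = opts.get("trim_vars", False)
    interval = opts.get("change_yr_interval")

    if not isinstance(trim_vars, bool):
        raise TypeError("trim_vars must be a boolean")

    if interval is not None:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(interval, bool) or not isinstance(interval, int):
            raise TypeError("change_yr_interval must be an integer or null")
        if interval < 1:
            raise ValueError("change_yr_interval must be at least 1")

    return trim_vars, interval


if __name__ == "__main__":
    print(check_trim_opts({"trim_vars": True, "change_yr_interval": 5}))
```

Keeping the two checks separate mirrors the point of the split: either option can be set without the other.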
scout/run.py
Outdated
if "every_other" in yr_interval:
    n_yrs = 2
elif "every_five" in yr_interval:
    n_yrs = 5
elif "every_ten" in yr_interval:
    n_yrs = 10
else:
    n_yrs = 1
If following the above suggestion, this could be removed, and n_yrs is simply the value of trim_results_interval that was input.
You will need to add some logic to avoid trimming with an interval > len(years).
Thanks, modified as suggested.
for year in range(start_yr + 1, end_yr + 1):
    # Year must be exactly divisible by desired year interval
    if year % n_yrs == 0:
        years.append(year)
This somewhat arbitrarily decides the first year reported after the baseline: if n_yrs is 10, it outputs (2025, 2030, 2040), because 2030 % 10 == 0. I would expect it to instead output (2025, 2035, 2045...). To correct this, I think the check should be based on the time elapsed since the start year, for example:
- for year in range(start_yr + 1, end_yr + 1):
-     # Year must be exactly divisible by desired year interval
-     if year % n_yrs == 0:
-         years.append(year)
+ for i, year in enumerate(range(start_yr + 1, end_yr + 1)):
+     # Year must be exactly divisible by desired year interval
+     if (i + 1) % n_yrs == 0:
+         years.append(year)
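To make the difference between the two selection rules concrete, here is a minimal side-by-side sketch. The function names are hypothetical, and it assumes integer years and that the start year is always reported (as the PR description states); it is not the actual gen_trim_yrs implementation.

```python
def years_by_calendar(start_yr, end_yr, n_yrs):
    """Start year plus every later year evenly divisible by n_yrs.

    Hypothetical helper mirroring the `year % n_yrs == 0` check in the diff.
    """
    later = [y for y in range(start_yr + 1, end_yr + 1) if y % n_yrs == 0]
    return [start_yr] + later


def years_by_offset(start_yr, end_yr, n_yrs):
    """Start year plus every n_yrs-th year counted from the start year.

    Hypothetical helper equivalent to the `(i + 1) % n_yrs == 0` suggestion.
    """
    return list(range(start_yr, end_yr + 1, n_yrs))


if __name__ == "__main__":
    print(years_by_calendar(2025, 2050, 10))  # [2025, 2030, 2040, 2050]
    print(years_by_offset(2025, 2050, 10))    # [2025, 2035, 2045]
```

The calendar rule lands on round years such as 2030 and 2050 regardless of the start year, while the offset rule keeps the spacing uniform from the first reported year.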
I think the current implementation is nice b/c the user can easily capture years that would be of interest for target setting, e.g., if we want to explore 2030 and 2050, just set an interval of every 10; if we want 2030, 2035, and 2050, do every five, etc. I worry that if we make this change, we start to break easy alignment with those potential target years of interest, and it's important that this feature supports them.
Okay, I don't think the description of change_yr_interval is entirely accurate in that case, or at least not descriptive enough. Maybe add detail that we output all years divisible by the interval.
See my comment below for a possible issue with this implementation. At the risk of adding too many related arguments, we could also have an output_years argument that takes a list of specific years. So one could output an interval which is based off of the first year, but then ensure specific years are also output through this new argument, e.g., change_yr_interval=2, output_years=[2030, 2035] results in (2025, 2027, 2029, 2030, 2031, 2033, 2035...)
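A minimal sketch of how such a combination could work, assuming integer years and an interval counted from the start year. The function name report_years and its signature are hypothetical, and output_years is only a proposed argument in this thread, not an existing option.

```python
def report_years(start_yr, end_yr, interval, output_years=()):
    """Merge an interval-based year series with explicitly requested years.

    Hypothetical helper: years are stepped from start_yr every `interval`
    years, then any requested years inside the horizon are added.
    """
    stepped = set(range(start_yr, end_yr + 1, interval))
    # Ignore requested years that fall outside the modeled horizon
    pinned = {y for y in output_years if start_yr <= y <= end_yr}
    return sorted(stepped | pinned)


if __name__ == "__main__":
    print(report_years(2025, 2035, 2, [2030, 2035]))
    # [2025, 2027, 2029, 2030, 2031, 2033, 2035]
```

Using a set union means a year that appears in both series, such as 2035 here, is reported only once.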
scout/run.py
Outdated
if opts.trim_results == "all_yrs":
    trim_yrs = False
else:
    trim_yrs = gen_trim_yrs(opts.trim_results, handyvars.aeo_years)
- if opts.trim_results == "all_yrs":
-     trim_yrs = False
- else:
-     trim_yrs = gen_trim_yrs(opts.trim_results, handyvars.aeo_years)
+ if opts.trim_results_interval:
+     trim_yrs = gen_trim_yrs(opts.trim_results_interval, handyvars.aeo_years)
+ else:
+     trim_yrs = False
Ended up in a similar place:
# User desires trimmed down variable reporting
if opts.trim_vars:
    trim_out = True
else:
    trim_out = False
# User desires trimmed down year intervals
if opts.change_yr_interval not in [None, 1]:
    # Check length of AEO years
    aeo_len = len(handyvars.aeo_years)
    # Ensure length of year reporting interval doesn't exceed the full length of time horizon
    if opts.change_yr_interval > aeo_len:
        opts.change_yr_interval = aeo_len
        # Notify user of the change
        warnings.warn(
            "'trim_yrs' user option exceeds length of time horizon. Resetting to the "
            "time horizon length of " + str(aeo_len) + " years")
    trim_yrs = gen_trim_yrs(opts.change_yr_interval, handyvars.aeo_years)
else:
    trim_yrs = False
scout/run.py
Outdated
if opts.trim_results:
    trim_out = True
- if opts.trim_results:
-     trim_out = True
+ if opts.trim_results:
+     trim_out = True
+ else:
+     trim_out = False
See previous comment. Changed trim_results to trim_vars for clarity.
scout/run.py
Outdated
else:
    trim_out, trim_yrs = (False for n in range(2))
- else:
-     trim_out, trim_yrs = (False for n in range(2))
See comment above for implementation.
* trim_results is now two arguments, one that trims down variables and another that trims down years in reporting.
aeo_len = len(handyvars.aeo_years)
# Ensure length of year reporting interval doesn't exceed the full length of time horizon
if opts.change_yr_interval > aeo_len:
    opts.change_yr_interval = aeo_len
What is the desired behavior here? If years are selected based on if year % yr_interval == 0, then setting the interval to aeo_len could output 2025 and 2028 (divisible by 26), which is probably not what anyone wants. Should you instead output just the first year and the last year? That could be done by not calling gen_trim_yrs if this is true, and instead just setting trim_yrs directly.
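One way this could look is sketched below, under stated assumptions: the function name resolve_trim_yrs is hypothetical, years are treated as integers, the warning text is illustrative, and the calendar-divisibility rule from the diff is used for the normal case. It is not the code in this PR.

```python
import warnings


def resolve_trim_yrs(interval, aeo_years):
    """Return the years to report, or False when no year trimming is requested.

    Hypothetical helper. When the interval exceeds the time horizon, it
    reports only the first and last years instead of clamping the interval.
    """
    if interval is None or interval == 1:
        return False
    if interval > len(aeo_years):
        warnings.warn(
            "Reporting interval exceeds the time horizon of "
            + str(len(aeo_years)) + " years; reporting first and last year only")
        return [aeo_years[0], aeo_years[-1]]
    # Normal case: start year plus later years divisible by the interval
    return [aeo_years[0]] + [y for y in aeo_years[1:] if y % interval == 0]


if __name__ == "__main__":
    horizon = list(range(2025, 2051))       # 26 years
    print(resolve_trim_yrs(26, horizon))    # [2025, 2028], the case flagged above
    print(resolve_trim_yrs(30, horizon))    # [2025, 2050], with a warning
```

Returning False for the untrimmed case follows the trim_yrs = False convention used in the snippets in this thread.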
When running a batch simulation with trim_results set to true, the user will be prompted to enter year ranges when executing run.py.
To avoid this, the user now specifies trim_results as a string that indicates the desired year range setting when they want a trimmed down results file. One of the options, all_yrs, just trims down the variables reported out without changing from the annual interval. The other three options allow reporting at 2-, 5-, and 10-year intervals, always including the start year in the final set of years reported.
The string entered in trim_results is post-processed to a list of focus years in run.
Addresses issue #504.