Conversation

jtlangevin
Collaborator

When running a batch simulation with trim_results set to true, the user will be prompted to enter year ranges when executing run.py.

To avoid this, the user now specifies trim_results as a string that indicates the desired year range setting for a trimmed-down results file. One option, all_yrs, trims the reported variables while keeping the annual reporting interval. The other three options report at 2-, 5-, and 10-year intervals and always include the start year in the final set of reported years.

The string entered in trim_results is converted to a list of focus years in run.py.

Addresses issue #504.

@jtlangevin jtlangevin linked an issue Jun 6, 2025 that may be closed by this pull request
@jtlangevin jtlangevin requested review from aspeake and trynthink June 6, 2025 18:16
Collaborator

@aspeake aspeake left a comment

I have a few comments centered around two suggestions: 1) splitting trim_results into two arguments, a bool and an int and 2) updates to the logic for applying year intervals.

Comment on lines 165 to 169
trim_results: (string) Report limited results variables to reduce
results file size. String indicates whether longer-than-annual
reporting interval should also be used ('all_yrs' reports
annually). Allowed values are {all_yrs, every_five_yrs, every_other_yr,
every_ten_yrs, null}. Default null
Collaborator

Seems like turning on this argument does two things: 1) trims out variables, and 2) optionally reduces the number of years reported. It might be better to break it into two arguments to align with this, but also to enable a more flexible way of trimming the reported years:

A new argument trim_results_interval (int) that takes the interval on which to report. 1 would not do anything, 2 --> every_other_yr, 5 --> every_five_yrs, etc. This would make the implementation in gen_trim_yrs cleaner too.

Keep trim_results but make it a boolean that only controls whether the reported variables are trimmed. This also gives more control to someone who wants to limit the number of years but still wants all the output variables. Default it to False.
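
A minimal sketch of how the two proposed arguments might be declared, assuming an argparse-style parser. The build_parser helper and the exact flag spellings are hypothetical and are not taken from the Scout code base.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser illustrating the proposed two-argument split."""
    parser = argparse.ArgumentParser()
    # Boolean flag: report only a reduced set of variables (defaults to False)
    parser.add_argument("--trim_results", action="store_true")
    # Integer reporting interval; 1 leaves annual reporting unchanged
    parser.add_argument("--trim_results_interval", type=int, default=1)
    return parser


# Limit the reported years without trimming the reported variables
opts = build_parser().parse_args(["--trim_results_interval", "5"])
print(opts.trim_results, opts.trim_results_interval)
```

Keeping the two settings independent is what allows the combination in the usage line: a longer reporting interval with the full set of output variables.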

Collaborator Author

@jtlangevin jtlangevin Jul 14, 2025

Nice idea. Here is how I've organized it in the YAML:


      change_yr_interval:
        type: ["integer", "null"]
        default: null
        description: Change reporting to every N years, where N is the integer specified in this argument.
      trim_vars:
        type: boolean
        default: false
        description: If true, a reduced set of only essential variables is reported.

scout/run.py Outdated
Comment on lines 5109 to 5116
    if "every_other" in yr_interval:
        n_yrs = 2
    elif "every_five" in yr_interval:
        n_yrs = 5
    elif "every_ten" in yr_interval:
        n_yrs = 10
    else:
        n_yrs = 1
Collaborator

If you follow the above suggestion, this block could be removed, and n_yrs would simply be the value passed in trim_results_interval.

Collaborator

You will need to add some logic to avoid trimming with an interval > len(years).

Collaborator Author

Thanks, modified as suggested.

Comment on lines 5122 to 5125
    for year in range(start_yr + 1, end_yr + 1):
        # Year must be exactly divisible by desired year interval
        if year % n_yrs == 0:
            years.append(year)
Collaborator

This somewhat arbitrarily decides the first year reported after the baseline. If n_yrs is 10, it outputs (2025, 2030, 2040...), because 2030 % 10 == 0. I would expect it to output (2025, 2035, 2045...) instead. To correct this, I think the check should be based on the number of years elapsed since start_yr, for example:

Suggested change
-    for year in range(start_yr + 1, end_yr + 1):
-        # Year must be exactly divisible by desired year interval
-        if year % n_yrs == 0:
-            years.append(year)
+    for i, year in enumerate(range(start_yr + 1, end_yr + 1)):
+        # Year must be exactly divisible by desired year interval
+        if (i + 1) % n_yrs == 0:
+            years.append(year)
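
To make the difference concrete, the two selection rules can be compared side by side. This is an illustrative sketch only: both helper names are hypothetical, the 2025 to 2050 horizon is an example, and it assumes the start year is always included, as the PR description states.

```python
def years_by_calendar(start_yr, end_yr, n_yrs):
    """Keep the start year plus every later year divisible by n_yrs."""
    return [start_yr] + [
        yr for yr in range(start_yr + 1, end_yr + 1) if yr % n_yrs == 0]


def years_from_start(start_yr, end_yr, n_yrs):
    """Keep the start year plus every n_yrs-th year counted from it."""
    return list(range(start_yr, end_yr + 1, n_yrs))


# Calendar-aligned rule lands on round years such as 2030 and 2050
print(years_by_calendar(2025, 2050, 10))  # [2025, 2030, 2040, 2050]
# Start-relative rule keeps a constant gap between reported years
print(years_from_start(2025, 2050, 10))   # [2025, 2035, 2045]
```

The first rule gives an uneven first gap (five years here) but hits round calendar years; the second gives evenly spaced years that depend on where the horizon starts.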

Collaborator Author

@jtlangevin jtlangevin Jul 14, 2025

I think the current implementation is nice because the user can easily capture years that would be of interest for target setting. For example, if we want to explore 2030 and 2050, just set an interval of every 10; if we want 2030, 2035, and 2050, use every five, etc. I worry that this change would break the easy alignment with those potential target years of interest, and supporting that alignment is an important part of this feature.

Collaborator

Okay, I don't think the description of change_yr_interval is entirely accurate in that case, or at least not descriptive enough. Maybe add detail that we output all years divisible by the interval.

Collaborator

See my comment below for a possible issue with this implementation. At the risk of adding too many related arguments, we could also have an output_years argument that takes a list of specific years. One could then report on an interval counted from the first year and still ensure that specific years are output through this new argument, e.g., change_yr_interval=2, output_years=[2030, 2035] results in (2025, 2027, 2029, 2030, 2031, 2033, 2035...)
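
One way the combined behavior could work is sketched below. The merge_years helper is hypothetical, output_years is only the argument proposed in this comment, and the sketch assumes the interval is counted from the start year.

```python
def merge_years(start_yr, end_yr, interval, output_years=()):
    """Combine interval-based years with explicitly requested years."""
    # Years on the reporting interval, counted from the start year
    interval_yrs = set(range(start_yr, end_yr + 1, interval))
    # Explicitly requested years, ignoring any outside the time horizon
    extra_yrs = {yr for yr in output_years if start_yr <= yr <= end_yr}
    return sorted(interval_yrs | extra_yrs)


print(merge_years(2025, 2035, 2, [2030, 2035]))
# [2025, 2027, 2029, 2030, 2031, 2033, 2035]
```

Using a set union means a requested year that already falls on the interval (2035 here) is reported once rather than duplicated.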

scout/run.py Outdated
Comment on lines 5181 to 5184
    if opts.trim_results == "all_yrs":
        trim_yrs = False
    else:
        trim_yrs = gen_trim_yrs(opts.trim_results, handyvars.aeo_years)
Collaborator

Suggested change
-    if opts.trim_results == "all_yrs":
-        trim_yrs = False
-    else:
-        trim_yrs = gen_trim_yrs(opts.trim_results, handyvars.aeo_years)
+    if opts.trim_results_interval:
+        trim_yrs = gen_trim_yrs(opts.trim_results_interval, handyvars.aeo_years)
+    else:
+        trim_yrs = False

Collaborator Author

@jtlangevin jtlangevin Jul 14, 2025

Ended up in a similar place:


    # User desires trimmed down variable reporting
    if opts.trim_vars:
        trim_out = True
    else:
        trim_out = False
    # User desires trimmed down year intervals
    if opts.change_yr_interval not in [None, 1]:
        # Check length of AEO years
        aeo_len = len(handyvars.aeo_years)
        # Ensure length of year reporting interval doesn't exceed the full length of time horizon
        if opts.change_yr_interval > aeo_len:
            opts.change_yr_interval = aeo_len
            # Notify user of the change
            warnings.warn(
                "'trim_yrs' user option exceeds length of time horizon. Resetting to the "
                "time horizon length of " + str(aeo_len) + " years")
        trim_yrs = gen_trim_yrs(opts.change_yr_interval, handyvars.aeo_years)
    else:
        trim_yrs = False

scout/run.py Outdated
Comment on lines 5179 to 5180
    if opts.trim_results:
        trim_out = True
Collaborator

Suggested change
-    if opts.trim_results:
-        trim_out = True
+    if opts.trim_results:
+        trim_out = True
+    else:
+        trim_out = False

Collaborator Author

See previous comment. Changed trim_results to trim_vars for clarity.

scout/run.py Outdated
Comment on lines 5185 to 5186
    else:
        trim_out, trim_yrs = (False for n in range(2))
Collaborator

Suggested change
-    else:
-        trim_out, trim_yrs = (False for n in range(2))

Collaborator Author

See comment above for implementation.

* trim_results is now two arguments, one that trims down variables and another that trims down years in reporting.
@jtlangevin jtlangevin requested a review from aspeake July 14, 2025 18:34
    aeo_len = len(handyvars.aeo_years)
    # Ensure length of year reporting interval doesn't exceed the full length of time horizon
    if opts.change_yr_interval > aeo_len:
        opts.change_yr_interval = aeo_len
Collaborator

What is the desired behavior here? If years are selected with if year % yr_interval == 0:, then resetting the interval to aeo_len could output 2025 and 2028 (2028 is divisible by 26), which is probably not what anyone wants. Should you instead output just the first year and the last year? That could be done by skipping the call to gen_trim_yrs in this case and setting trim_yrs directly.
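
The edge case and the proposed alternative can be sketched as follows. Both helper names are hypothetical, the years are treated as plain integers for a 2025 to 2050 horizon, and divisible_years only approximates the divisibility check discussed in this thread rather than reproducing gen_trim_yrs.

```python
def divisible_years(aeo_years, interval):
    """Start year plus every later year exactly divisible by the interval."""
    return [aeo_years[0]] + [
        yr for yr in aeo_years[1:] if yr % interval == 0]


def trim_years(aeo_years, interval):
    """Report only the first and last year when the interval spans the horizon."""
    if interval >= len(aeo_years):
        # Skip the divisibility check entirely; sorted(set(...)) avoids a
        # duplicate entry when the horizon holds a single year
        return sorted({aeo_years[0], aeo_years[-1]})
    return divisible_years(aeo_years, interval)


horizon = list(range(2025, 2051))  # 26 years
# Clamping the interval to the horizon length still applies the modulo check
print(divisible_years(horizon, len(horizon)))  # [2025, 2028]
# Guarding the call returns the endpoints of the horizon instead
print(trim_years(horizon, 40))  # [2025, 2050]
```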

Development

Successfully merging this pull request may close these issues.

trim_out option can't be used with reduced year set in batch

2 participants