Skip to content

Conversation

@rburghol
Copy link
Contributor

@rburghol rburghol commented Jan 9, 2026

NOT READY TO MERGE (Reason: failed tests)

This PR includes an overhaul to the model state system, leveraging a new object-oriented version of state, which accomplishes the following:

  • It streamlines the number of arguments passed to functions, so now, the arg state takes the place of state_ix, dict_ix, ts_ix, ...
  • Declarations of the base containers for state attributes is moved outside of the function, living directly in the state.py file, which allows these to be cached, resulting in a huge improvement in performance (recent branches had seen start-up time double due to less optimal numba caching).
  • An example of using dynamic equations has been added in tests/testcbp/HSP2results/ to test a 500 equation addon simulation (Use: cp PL3_5250_0001.json.manyeq PL3_5250_0001.json to test).
  • Some cleanup of arguments in function like hydr() and sedtrn() to eliminate the argument io_manager as this is no longer used (I think this has been due for a while, I don't think it is due to any changes that I recall making to the code but it seems like a cleaner approach)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's revert this and have cmd_regression do it's own overrides on a subclass of the .convert.regression_base

this file and test class is meant exclusively to work with the test directory, and doesn't need to handle things like tcodes, activities, or operations on init. it also does not need to print that it's running, since this is meant to only be run by pytest with pre-fixed input directories.

import numpy as np
from convert.regression_base import RegressTest

case = "test10specl"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is meant to be an example, can we wrap this logic into a function so that people can use it on their own directories? note that the test suite already handles these two cases, and this is purely for debugging. It doesn't test the library in any new way that is different from the test suite.

case = "test10specl"
base_case = "test10"
#case = "testcbp"
tdir = "/opt/model/HSPsquared/tests"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my test directory is different that this, can we avoid hard coding these paths?

print("hsp2", base_case, np.quantile(rchres_hydr_hsp2_base, [0,0.25,0.5,0.75,1.0]))
print("hspf", base_case, np.quantile(rchres_hydr_hspf_base, [0,0.25,0.5,0.75,1.0]))
# Monthly mean value comparisons
rchres_hydr_hsp2_test_mo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do lines 40-43 do? can these be removed?

# Example: (True, 8.7855425)

# Compare inflows and outflows
np.mean(perlnd_pwat_hsp2_test_table['PERO']) * 6000 * 0.0833
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines 52-57 do a lot of work that is never stored to a variable or checked against a ground truth. Is this deliberate or can these lines be removed?


# now do a comparison
# HYDR diff should be almost nonexistent
test.check_con(params = ('RCHRES', 'HYDR', '001', 'ROVOL', 2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this check already covered by the test suite? I think the suite already checks all of the params by default. Also, since you're not storing the return value, this file doesn't know if the comparison actually worked. each line 61-67 doesn't do any actual checking, since the return value from check_con is never stored/checked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend against merging this file into the test suite. It looks like it was useful/necessary to support a now-compete debugging exercise but it doesn't do any new tests and it's structure is confusing since everything is a global variable that executes 'on import' rather than just when run as main. I think this is intended to be invoked via python cmd_regression.py but the typical means of making a portable python script is to have a main function that is called with a script guard like if __name__ == "__main__"
that pattern would let folks run these checks either from within the running interpreter OR as a python script -- either way the contents are redundant with a single call to pytest since these test cases are already completely covered.

If this functionality needs to be preserved to help developers in the future, perhaps we can repurpose it as an example (with a lot more documentation about what this is and how to use it to help with other future work) or a helpful python notebook tutorial.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this file do or demonstrate? It doesn't perform any checks, is it purely to isolate runtime issues when trying pandas >3.0? if so, let's write an actual test that includes checks that can pass/fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it has been added to the examples by mistake.

@austinorr
Copy link
Contributor

@rburghol this PR is very impressive -- is there any way to make it more reviewable? I think it would be a mistake for this project to merge > 362 commits in a single PR, especially when many are just commented as 'debug'. One path forward is to split this up into multiple PR's and squash the commits on your side into manageable and describable units of work.

If I were tasked with this, I'd likely try this workflow:

  • create a test env with python 3.12 and pandas <3.0
  • create a new clean local branch based on the repo's current 'develop'
  • checkout all the changes to the src/hsp2/hsp2 & src/hsp2/state directories from this PR (not the commits, just the changes)
  • run pytest -k test10 on your computer to see if anything else is needed. pull in the minimal necessary changes from this PR to get those tests to pass.
  • then commit 1 time with a 'develop state class' commit, and push it back up as a new/different PR. Try not to modify the test directory with the 'state class' PR. Let's modify the test directory with a separate PR with a separate review.
  • same logic for enhancements to the examples directory, tests directory, and the top level project directory. Let's hit each of those important features and issues with minimal clean PRs.

We should be reviewing PRs that have very few commits, each with meaningful comments, and each handling only one issue/feature. This looks like it's the full raw history of the @rburghol development exercise that has spanned many months, and has covered a ton of important and useful ground, but it's hard to untangle. Can you keep all this work in your own local separate branch (so you don't lose code) and re-submit another PR with targeted changes that address single features (like the updates to state class) or single issues (like prep for pandas 3.0)?

working this way makes it easier for us to review/admit some changes while holding back others for discussions -- currently all your progress might be blocked if we wanted to discuss some of the choices for handling certain cases with the larger group. If these were 4-5 separate PRs then we could merge in your state class changes while not needing to wait to discuss edits to the test harness.

@rburghol
Copy link
Contributor Author

rburghol commented Jan 31, 2026

Hey @austinorr this is super useful feedback/concepts. I am all for making a more readable sequence of commits.

So, to "squash" are you saying that I:

  1. Clone a fresh repo in a new dir -- let's call it newpo with a fresh branch based off develop
  2. Use cp to copy relevant files from develop-state-class to newpo (as opposed to using checkout src/hap2/... )
  3. Make a new PR from newpo.

And that by doing so all the intermediate commits will nit appear?

This sounds ideal if I understand correctly.

@austinorr
Copy link
Contributor

@rburghol you would not need to re-clone the repo, simply ensure your current branch workspace is clean (i.e,. no un-committed changes) and then do the following:

checkout the upstream develop branch. find it with git branch -r it should look something like 'upstream/develop' or something depending on what you named the upstream remote repo.

switch to the upstream develop branch, like so git checkout -b your-new-branch-name upstream/develop

then checkout the updates from your develop-state-class branch like so:
git checkout develop-state-class -- src/hsp2/hsp2
git checkout develop-state-class -- src/hsp2/state

the above works for whole directories and files, repeat as needed, but keep each 'chunk' of code you pull in to one topic so that the chunks can be committed together.

when you've collected a unit of work in the new branch, commit all the changes in one descriptive commit and open a new single-topic PR.

this process is totally non-destructive, and all your work on the develop-state-class (including the significant commit history) will live on forever (or until you can part with it) in your local git repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants