-
Notifications
You must be signed in to change notification settings - Fork 20
Develop state class #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Develop state class #202
Conversation
…ey should handle it themselves
… into develop-state-class
… into develop-state-class
…ile analysis reveals identical results vs hspf but RegressTest fails in pandas 3.0.0 with pandas.errors.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
Develop hdf5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's revert this and have cmd_regression do it's own overrides on a subclass of the .convert.regression_base
this file and test class is meant exclusively to work with the test directory, and doesn't need to handle things like tcodes, activities, or operations on init. it also does not need to print that it's running, since this is meant to only be run by pytest with pre-fixed input directories.
| import numpy as np | ||
| from convert.regression_base import RegressTest | ||
|
|
||
| case = "test10specl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is meant to be an example, can we wrap this logic into a function so that people can use it on their own directories? note that the test suite already handles these two cases, and this is purely for debugging. It doesn't test the library in any new way that is different from the test suite.
| case = "test10specl" | ||
| base_case = "test10" | ||
| #case = "testcbp" | ||
| tdir = "/opt/model/HSPsquared/tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my test directory is different that this, can we avoid hard coding these paths?
| print("hsp2", base_case, np.quantile(rchres_hydr_hsp2_base, [0,0.25,0.5,0.75,1.0])) | ||
| print("hspf", base_case, np.quantile(rchres_hydr_hspf_base, [0,0.25,0.5,0.75,1.0])) | ||
| # Monthly mean value comparisons | ||
| rchres_hydr_hsp2_test_mo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do lines 40-43 do? can these be removed?
| # Example: (True, 8.7855425) | ||
|
|
||
| # Compare inflows and outflows | ||
| np.mean(perlnd_pwat_hsp2_test_table['PERO']) * 6000 * 0.0833 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lines 52-57 do a lot of work that is never stored to a variable or checked against a ground truth. Is this deliberate or can these lines be removed?
|
|
||
| # now do a comparison | ||
| # HYDR diff should be almost nonexistent | ||
| test.check_con(params = ('RCHRES', 'HYDR', '001', 'ROVOL', 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this check already covered by the test suite? I think the suite already checks all of the params by default. Also, since you're not storing the return value, this file doesn't know if the comparison actually worked. each line 61-67 doesn't do any actual checking, since the return value from check_con is never stored/checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend against merging this file into the test suite. It looks like it was useful/necessary to support a now-compete debugging exercise but it doesn't do any new tests and it's structure is confusing since everything is a global variable that executes 'on import' rather than just when run as main. I think this is intended to be invoked via python cmd_regression.py but the typical means of making a portable python script is to have a main function that is called with a script guard like if __name__ == "__main__"
that pattern would let folks run these checks either from within the running interpreter OR as a python script -- either way the contents are redundant with a single call to pytest since these test cases are already completely covered.
If this functionality needs to be preserved to help developers in the future, perhaps we can repurpose it as an example (with a lot more documentation about what this is and how to use it to help with other future work) or a helpful python notebook tutorial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this file do or demonstrate? It doesn't perform any checks, is it purely to isolate runtime issues when trying pandas >3.0? if so, let's write an actual test that includes checks that can pass/fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it has been added to the examples by mistake.
|
@rburghol this PR is very impressive -- is there any way to make it more reviewable? I think it would be a mistake for this project to merge > 362 commits in a single PR, especially when many are just commented as 'debug'. One path forward is to split this up into multiple PR's and squash the commits on your side into manageable and describable units of work. If I were tasked with this, I'd likely try this workflow:
We should be reviewing PRs that have very few commits, each with meaningful comments, and each handling only one issue/feature. This looks like it's the full raw history of the @rburghol development exercise that has spanned many months, and has covered a ton of important and useful ground, but it's hard to untangle. Can you keep all this work in your own local separate branch (so you don't lose code) and re-submit another PR with targeted changes that address single features (like the updates to state class) or single issues (like prep for pandas 3.0)? working this way makes it easier for us to review/admit some changes while holding back others for discussions -- currently all your progress might be blocked if we wanted to discuss some of the choices for handling certain cases with the larger group. If these were 4-5 separate PRs then we could merge in your |
|
Hey @austinorr this is super useful feedback/concepts. I am all for making a more readable sequence of commits. So, to "squash" are you saying that I:
And that by doing so all the intermediate commits will nit appear? This sounds ideal if I understand correctly. |
|
@rburghol you would not need to re-clone the repo, simply ensure your current branch workspace is clean (i.e,. no un-committed changes) and then do the following: checkout the upstream develop branch. find it with switch to the upstream develop branch, like so then checkout the updates from your develop-state-class branch like so: the above works for whole directories and files, repeat as needed, but keep each 'chunk' of code you pull in to one topic so that the chunks can be committed together. when you've collected a unit of work in the new branch, commit all the changes in one descriptive commit and open a new single-topic PR. this process is totally non-destructive, and all your work on the develop-state-class (including the significant commit history) will live on forever (or until you can part with it) in your local git repo. |
NOT READY TO MERGE (Reason: failed tests)
This PR includes an overhaul to the model
statesystem, leveraging a new object-oriented version ofstate, which accomplishes the following:statetakes the place ofstate_ix,dict_ix,ts_ix, ...tests/testcbp/HSP2results/to test a 500 equation addon simulation (Use:cp PL3_5250_0001.json.manyeq PL3_5250_0001.jsonto test).hydr()andsedtrn()to eliminate the argumentio_manageras this is no longer used (I think this has been due for a while, I don't think it is due to any changes that I recall making to the code but it seems like a cleaner approach)