-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move CLMBuildNamelist.pm perl code into the python buildnml #585
Comments
I'm a fan of doing as much as possible incrementally, rather than via a giant big bang rewrite. To this end: I wonder if it would be possible to incrementally migrate the perl to python. I'm imagining:
In addition to letting us make the changes and test them incrementally, this strategy would also make it possible for new namelist groups to be added directly in the python even before the entire conversion is done, thus making for less overall work involved with the conversion. |
@ekluzek this might be a good thing for us to pair program, at least for portions of it. So I'm also assigning myself. |
This approach does make sense to me as well, as a way to make the conversion incremental, which is probably the only way we can see it happen. We originally talked about doing this months ago, and we haven't really been able to look at it (at least I haven't). |
Thinking more about the incremental strategy I laid out: One potential problem is this: It would be really good if we can leverage the NLCOMPs in the test suite to determine if anything is unexpected (while most namelist differences would cause baseline comparison failures, some, like differences in hist field lists, would not). One possible way around this would be: after concatenating the perl and python namelists, sort the namelist groups alphabetically. That way, the resulting merged list will be independent of which pieces are done in the perl vs. the python. (After we are done the conversion, we could get rid of this sort, so that we go back to the old order.) |
I was thinking a bit about the build-namelist unit tester ( I find it easier to work with tests when each test is self-contained (e.g., in its own function, as is the recommended practice for unit tests). There can (and should) still be shared functions that are called by multiple individual tests, but you should be able to read through a single (short) test function and see everything that's done to set up that test. ("Oh, this test uses the standard arguments and then adds the one argument |
I was thinking a bit about how to proceed here, given that (1) @ekluzek 's plate keeps getting filled up with short-term needs, which doesn't let him get to long-term projects like this, and (2) we're waiting to extract many parameters until this pythonization is done (and in the meantime, the parameter netcdf file is in a non-ideal state in terms of both usability and maintainability). A strategy I thought of, tied in with the idea of doing this incrementally (having both a perl and python namelist side-by-side for a while is): What if we start by tackling the parameters that are currently in the netcdf file? So the first step would be to make the netcdf input file generated on the fly, allowing users to set values there via the user_nl mechanism. To do this, we would need to do the following:
Once the existing parameter file has been migrated over in this way, we can start extracting new parameters using the new python-based infrastructure. |
From today's CSEG meeting:
The above benefit could be very helpful for us in CTSM's namelist generation, too. We plan to have some discussion in CSEG about which way we want to go long-term - yaml or xml - and then we should make CTSM's python-based namelist generation consistent with that decision. If we do stick with xml, I wonder if we can come up with a way to accomplish the same arbitrary-python-expression functionality in xml. I'm thinking we could have an attribute with a special name like "python_condition"; when this is encountered, it is evaluated, and we determine if it evaluates to true. From looking briefly at cime's entry_id.py:_get_value_match function, I think we could accomplish this via modifying or writing a completely different version of that function (though I'm not sure off-hand how to feed in information on all of the appropriate variables that should provide context for the evaluation). |
Blocks #454 |
One thing I'd like us to be careful about: We have a number of namelist variables whose defaults depend on the values of other namelist variables - e.g., fire parameters whose defaults depend on the fire method being used. It would be easy to accidentally have these parsed in the wrong order; we should have some error checking in place to ensure that doesn't happen. Alternatively, we could have a rule that namelist defaults should only depend on xml variables, not on other namelist variables. This would require changing some namelist variables into xml variables. I used to argue that we should do this for conceptual clarity (for both users and developers), but I'm starting to feel like (1) that may involve too significant of a change, and (2) it could be harder to maintain with lilac, since lilac doesn't use the same xml variable mechanism. A compromise could be to move these namelist variables to a special namelist group, like "high_level_settings". Those would be parsed first, and we could have a rule, somehow enforced by the code, that other variables' defaults can only depend on namelist variables in that group. |
Is this also worth discussing in the context of the PPE? Maybe in the
short term we don't need to worry about this, but the dual nature of
namelist vs. parameter file parameters is confusing for users (including
me).
…On Tue, Sep 29, 2020 at 8:49 PM Bill Sacks ***@***.***> wrote:
One thing I'd like us to be careful about: We have a number of namelist
variables whose defaults depend on the values of other namelist variables -
e.g., fire parameters whose defaults depend on the fire method being used.
It would be easy to accidentally have these parsed in the wrong order; we
should have some error checking in place to ensure that doesn't happen.
Alternatively, we could have a rule that namelist defaults should only
depend on xml variables, not on other namelist variables. This would
require changing some namelist variables into xml variables. I used to
argue that we should do this for conceptual clarity (for both users and
developers), but I'm starting to feel like (1) that may involve too
significant of a change, and (2) it could be harder to maintain with lilac,
since lilac doesn't use the same xml variable mechanism.
A compromise could be to move these namelist variables to a special
namelist group, like "high_level_settings". Those would be parsed first,
and we could have a rule, somehow enforced by the code, that other
variables' defaults can only depend on namelist variables in that group.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#585 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB5IWJFM2M72NA4PCXPF4PDSIKMD7ANCNFSM4GH56ZLA>
.
|
One of the important reasons for rewriting build-namelist in python is to support our plan for generating the netcdf params file on the fly, and moving all numeric parameters to the params file. Under this plan, all numeric parameters will be modifiable in two ways: (1) using user_nl_clm, or (2) after the params file is generated, using tools to manipulate the netcdf file and pointing to your new file. Does that address your question? |
Yep, thanks Bill
…On Wed, Sep 30, 2020 at 5:32 AM Bill Sacks ***@***.***> wrote:
Is this also worth discussing in the context of the PPE? Maybe in the
short term we don't need to worry about this, but the dual nature of
namelist vs. parameter file parameters is confusing for users (including
me).
One of the important reasons for rewriting build-namelist in python is to
support our plan for generating the netcdf params file on the fly, and
moving all numeric parameters to the params file. Under this plan, all
numeric parameters will be modifiable in two ways: (1) using user_nl_clm,
or (2) after the params file is generated, using tools to manipulate the
netcdf file and pointing to your new file.
Does that address your question?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#585 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB5IWJESJ22ZFXPI7ZDLSZTSIMJLPANCNFSM4GH56ZLA>
.
|
In terms of the PPE work. I think this project is too big to get done before simulations need to be started. And once you have them started I don't think you want to change mid-stream to a different mechanism. But, yes one of the end goals of this project is to remove the divide between namelist and params file settings for users. so it should be easier to work with. |
Has there been any activity on this task? |
No, there hasn't :-( I'll open an issue for glob vs. bsd_glob - thanks for bringing that to our attention. |
@jedwards4b I updated #1550 to point out that this issue was fixed previously. I think the version in /work/02503/edwardsj/cesm2_x.ihesp/components/clm must be an older clm before this was fixed. The fix came in Feb/2020. Note, there are some uses of glob in cime under utils/perl5lib though. |
Some recent developments make me want to bring this up again. Two things to note is param-dev and recent work on SLIM. Alper's param-dev work has gone into the latest CAM, which means it's available in cime and we could take advantage of it as well. In SLIM I have a PR that replaces the perl based build-namelist for SLIM with a simple python based one. That work would make it easier for me to do this replacement. |
#2126 and some work I and @adrifoster was doing in the perl build-namelist "unit-tester" (it's not really a unit-tester, as it's not testing methods inside the perl, it's really a system tester of build-namelist) got me thinking about this again. I was originally thinking that we should create a python script that just replicates the behavior of the perl build-namelist and then the perl "unit-tester" could still be used. However, I'm realizing that the perl "unit-tester" should also be replaced. And we probably don't want the python version to act the same way as the perl version. But, an incremental way to go about this would be to do the following:
It's steps 5 and 6 that I'm not sure about. Because doing it incrementally might maximize confusion on how to handle namelist work. But, doing it in one big go, is what prevents us from actually getting to this. |
Perhaps we could move incrementally and have a script that calls both?
In particular it would be great to be able to call individual tests - which
I think is what you are planning?
…On Wed, Sep 13, 2023 at 7:49 AM Erik Kluzek ***@***.***> wrote:
#2126 <#2126> and some work I and
@adrifoster <https://github.com/adrifoster> was doing in the perl
build-namelist "unit-tester" (it's not really a unit-tester, as it's not
testing methods inside the perl, it's really a system tester of
build-namelist) got me thinking about this again.
I was originally thinking that we should create a python script that just
replicates the behavior of the perl build-namelist and then the perl
"unit-tester" could still be used. However, I'm realizing that the perl
"unit-tester" should also be replaced. And we probably don't want the
python version to act the same way as the perl version.
But, an incremental way to go about this would be to do the following:
1. Move the buildnml into a module that can be unit and system tested
under the cime_config directory (this would be similar to what I did in
SLIM)
2. Have the python buildnml be able to be tested with the mechanisms
under the python directory (this again is what I did in SLIM). This means
we also have it tested with pylint (we already have it tested with black)
3. New functionality (such as in #2126
<#2126>) would be added into the
buildnml python portion rather than the perl build-namelist part.
4. Some targeted functionality from the perl could be moved over into
the python buildnml part. (we would examine what could be done here without
making the division between the perl and python parts too confusing)
5. Move more parts incrementally? Or should it be the next option?
6. Move everything in one big go?
It's steps 5 and 6 that I'm not sure about. Because doing it incrementally
might maximize confusion on how to handle namelist work. But, doing it in
one big go, is what prevents us from actually getting to this.
—
Reply to this email directly, view it on GitHub
<#585 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADE42IXX42HEMURL3ZYNOADX2G2WHANCNFSM4GH56ZLA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I feel we should do this incrementally; otherwise I fear it will never happen, and this is a big enough effort that trying to do it all at once will lead to the pain of trying to rewrite a moving target. I realize there will be some short-term confusion, but I think it's worth it. See #585 (comment) and other notes above referencing "incremental". |
@adrifoster that is pretty much what @bilsacks proposed above in his comment (that now he responded to). I'm not sure what you mean about "individual tests"? Are you saying the ability to ask for which specific tests to run -- similar to what we do in create_test? I was really thinking of doing this the same way that we do our unit and system tests in the python directory. But, for the buildnml system tests I think you are right we should develop a system where you specify XML settings for the system tests, so you could cook up any kind of test that is desired. Maybe we then do something like testmods to bundle different types of system tests? I think I like that. That was definitely a problem with the perl testing it really needed a way to have potential tests done in a general way, or have a list of tests that you could do.... I thought I recalled that @billsacks decided later that would be bad to do as it would maximize confusion on how to handle namelist things. But, from his response above it sounds like he agrees since he's right that it won't happen unless we do it incrementally. Although maybe that's a feature rather than a bug? Because if we get far enough in the "we have this both in python and perl world" -- the pain of being there might incentivize us to move solely into the python space? And we would pay the cost of getting it done in one big effort? |
I guess I am thinking about yesterday when we had to run the whole test script over and over again (which takes a while) just to investigate what was going on with two very small tests. So if there was a way you could only call those tests that would be really helpful. Even if it was in some kind of interactive use of python? |
We want to rewrite CLMBuildNamelist.pm in python. This is partly necessary in order to support our long-term goal of having parameters read from a netcdf file rather than a fortran namelist. But it's also important in order to take advantage of new features in the cime python namelist generation utilities.
The text was updated successfully, but these errors were encountered: