Move CLMBuildNamelist.pm perl code into the python buildnml #585

billsacks · 2018-12-03T21:54:38Z

We want to rewrite CLMBuildNamelist.pm in python. This is partly necessary in order to support our long-term goal of having parameters read from a netcdf file rather than a fortran namelist. But it's also important in order to take advantage of new features in the cime python namelist generation utilities.

billsacks · 2018-12-03T21:59:18Z

I'm a fan of doing as much as possible incrementally, rather than via a giant big bang rewrite. To this end: I wonder if it would be possible to incrementally migrate the perl to python. I'm imagining:

The top-level build-namelist script would call the perl CLMBuildNamelist.pm and then the python CLMBuildNamelist.py. Each would create its own subset of the namelist. Then the top-level script would concatenate the two together. (For this to work easily, I think we'd need to migrate entire namelist groups all at once.)
As we incrementally move things from the perl to the python, we would also move the relevant variables out of the old namelist_definition / namelist_defaults files into a new file (which would follow the new convention of having the definition and defaults together in a single file). In the end, we could delete the now-empty old files, leaving just the new file.

In addition to letting us make the changes and test them incrementally, this strategy would also make it possible for new namelist groups to be added directly in the python even before the entire conversion is done, thus making for less overall work involved with the conversion.

billsacks · 2018-12-04T20:53:28Z

@ekluzek this might be a good thing for us to pair program, at least for portions of it. So I'm also assigning myself.

ekluzek · 2018-12-04T20:55:53Z

This approach does make sense to me as well, as a way to make the conversion incremental, which is probably the only way we can see it happen. We originally talked about doing this months ago, and we haven't really been able to look at it (at least I haven't).

billsacks · 2018-12-04T20:58:10Z

Thinking more about the incremental strategy I laid out: One potential problem is this: It would be really good if we can leverage the NLCOMPs in the test suite to determine if anything is unexpected (while most namelist differences would cause baseline comparison failures, some, like differences in hist field lists, would not). One possible way around this would be: after concatenating the perl and python namelists, sort the namelist groups alphabetically. That way, the resulting merged list will be independent of which pieces are done in the perl vs. the python. (After we are done the conversion, we could get rid of this sort, so that we go back to the old order.)

billsacks · 2019-04-16T16:02:37Z

I was thinking a bit about the build-namelist unit tester (build-namelist_test.pl), and how we can make this easier to work with when we move to python. I think that part of what makes the current script hard to work with is that it's mostly one long function with a bunch of different tests. This makes it hard to tell which parts of the script apply to all tests, which apply to some tests, and which just apply to a single test.

I find it easier to work with tests when each test is self-contained (e.g., in its own function, as is the recommended practice for unit tests). There can (and should) still be shared functions that are called by multiple individual tests, but you should be able to read through a single (short) test function and see everything that's done to set up that test. ("Oh, this test uses the standard arguments and then adds the one argument --foo.")

billsacks · 2019-07-08T16:58:50Z

I was thinking a bit about how to proceed here, given that (1) @ekluzek 's plate keeps getting filled up with short-term needs, which doesn't let him get to long-term projects like this, and (2) we're waiting to extract many parameters until this pythonization is done (and in the meantime, the parameter netcdf file is in a non-ideal state in terms of both usability and maintainability).

A strategy I thought of, tied in with the idea of doing this incrementally (having both a perl and python namelist side-by-side for a while is): What if we start by tackling the parameters that are currently in the netcdf file? So the first step would be to make the netcdf input file generated on the fly, allowing users to set values there via the user_nl mechanism. To do this, we would need to do the following:

Add python infrastructure in cime that allows generating a netcdf file rather than a Fortran namelist
Introduce a new xml file that gives default values of these parameters, in the new style (where the definition and defaults are combined in a single file)
Add some python code in ctsm that leverages this new cime infrastructure, generating a netcdf file with all of the parameters given in the new xml file, plus a separate namelist file that for now just contains the file name of the netcdf parameter file (eventually, once we have an all-python build-namelist, this will be one entry in the single namelist file). (Hopefully the ctsm-specific python code will be minimal for these numeric parameters: most or all required information should be in the xml file. Once we start extracting flags for which we need consistency checks, special handling, etc., we'll need more logic in the python, but for these numeric parameters, I don't imagine we'll need much special handling.)
Ensure that both the perl and python build-namelists are called. If possible, the python should be called as a subroutine, not as a subprocess. So cime's preview_namelists call would call ctsm's python build-namelist, which would (a) do the python-based namelist / netcdf generation, and (b) invoke the legacy perl-based namelist generation as a subprocess.
Change the Fortran code a bit to read the parameter file name out of this new (temporary) namelist file. (Other than that, few if any changes should be needed in the Fortran code.)

Once the existing parameter file has been migrated over in this way, we can start extracting new parameters using the new python-based infrastructure.

billsacks · 2019-10-08T21:21:50Z

From today's CSEG meeting:

Alper's presentation: Staging MOM6 runtime parameter input files for CESM cases

One interesting point from this is that he uses yaml rather than xml to store namelist defaults (currently, the developer runs a script to convert this yaml to json; this won't be needed once cime allows yaml). A huge benefit of what Alper has done is that he supports arbitrary python expressions as conditionals (which he then evaluates).

We should have some more discussions about whether we'd like to migrate to something like this vs. keeping with the current xml-based namelist generation. Mariana says that she and Jim aren't very happy with the current implementation (namelist.py and nmlgen.py), so she could see migrating to something new / better.

The above benefit could be very helpful for us in CTSM's namelist generation, too. We plan to have some discussion in CSEG about which way we want to go long-term - yaml or xml - and then we should make CTSM's python-based namelist generation consistent with that decision.

If we do stick with xml, I wonder if we can come up with a way to accomplish the same arbitrary-python-expression functionality in xml. I'm thinking we could have an attribute with a special name like "python_condition"; when this is encountered, it is evaluated, and we determine if it evaluates to true. From looking briefly at cime's entry_id.py:_get_value_match function, I think we could accomplish this via modifying or writing a completely different version of that function (though I'm not sure off-hand how to feed in information on all of the appropriate variables that should provide context for the evaluation).

billsacks · 2020-01-14T16:50:33Z

Blocks #454

billsacks · 2020-09-30T02:49:22Z

One thing I'd like us to be careful about: We have a number of namelist variables whose defaults depend on the values of other namelist variables - e.g., fire parameters whose defaults depend on the fire method being used. It would be easy to accidentally have these parsed in the wrong order; we should have some error checking in place to ensure that doesn't happen.

Alternatively, we could have a rule that namelist defaults should only depend on xml variables, not on other namelist variables. This would require changing some namelist variables into xml variables. I used to argue that we should do this for conceptual clarity (for both users and developers), but I'm starting to feel like (1) that may involve too significant of a change, and (2) it could be harder to maintain with lilac, since lilac doesn't use the same xml variable mechanism.

A compromise could be to move these namelist variables to a special namelist group, like "high_level_settings". Those would be parsed first, and we could have a rule, somehow enforced by the code, that other variables' defaults can only depend on namelist variables in that group.

wwieder · 2020-09-30T03:47:49Z

Is this also worth discussing in the context of the PPE? Maybe in the short term we don't need to worry about this, but the dual nature of namelist vs. parameter file parameters is confusing for users (including me).

…

On Tue, Sep 29, 2020 at 8:49 PM Bill Sacks ***@***.***> wrote: One thing I'd like us to be careful about: We have a number of namelist variables whose defaults depend on the values of other namelist variables - e.g., fire parameters whose defaults depend on the fire method being used. It would be easy to accidentally have these parsed in the wrong order; we should have some error checking in place to ensure that doesn't happen. Alternatively, we could have a rule that namelist defaults should only depend on xml variables, not on other namelist variables. This would require changing some namelist variables into xml variables. I used to argue that we should do this for conceptual clarity (for both users and developers), but I'm starting to feel like (1) that may involve too significant of a change, and (2) it could be harder to maintain with lilac, since lilac doesn't use the same xml variable mechanism. A compromise could be to move these namelist variables to a special namelist group, like "high_level_settings". Those would be parsed first, and we could have a rule, somehow enforced by the code, that other variables' defaults can only depend on namelist variables in that group. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#585 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AB5IWJFM2M72NA4PCXPF4PDSIKMD7ANCNFSM4GH56ZLA> .

billsacks · 2020-09-30T11:31:52Z

Is this also worth discussing in the context of the PPE? Maybe in the short term we don't need to worry about this, but the dual nature of namelist vs. parameter file parameters is confusing for users (including me).

One of the important reasons for rewriting build-namelist in python is to support our plan for generating the netcdf params file on the fly, and moving all numeric parameters to the params file. Under this plan, all numeric parameters will be modifiable in two ways: (1) using user_nl_clm, or (2) after the params file is generated, using tools to manipulate the netcdf file and pointing to your new file.

Does that address your question?

wwieder · 2020-09-30T12:23:29Z

Yep, thanks Bill

…

On Wed, Sep 30, 2020 at 5:32 AM Bill Sacks ***@***.***> wrote: Is this also worth discussing in the context of the PPE? Maybe in the short term we don't need to worry about this, but the dual nature of namelist vs. parameter file parameters is confusing for users (including me). One of the important reasons for rewriting build-namelist in python is to support our plan for generating the netcdf params file on the fly, and moving all numeric parameters to the params file. Under this plan, all numeric parameters will be modifiable in two ways: (1) using user_nl_clm, or (2) after the params file is generated, using tools to manipulate the netcdf file and pointing to your new file. Does that address your question? — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#585 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AB5IWJESJ22ZFXPI7ZDLSZTSIMJLPANCNFSM4GH56ZLA> .

ekluzek · 2020-09-30T17:34:38Z

In terms of the PPE work. I think this project is too big to get done before simulations need to be started. And once you have them started I don't think you want to change mid-stream to a different mechanism. But, yes one of the end goals of this project is to remove the divide between namelist and params file settings for users. so it should be easier to work with.

jedwards4b · 2021-11-15T15:07:38Z

Has there been any activity on this task?
errput: File::Glob::glob() will disappear in perl 5.30. Use File::Glob::bsd_glob() instead. at /work/02503/edwardsj/cesm2_x.ihesp/components/clm/bld/CLMBuildNamelist.pm line 4193.

billsacks · 2021-11-15T17:37:01Z

No, there hasn't :-( I'll open an issue for glob vs. bsd_glob - thanks for bringing that to our attention.

ekluzek · 2021-12-07T22:07:55Z

@jedwards4b I updated #1550 to point out that this issue was fixed previously. I think the version in /work/02503/edwardsj/cesm2_x.ihesp/components/clm must be an older clm before this was fixed. The fix came in Feb/2020.

Note, there are some uses of glob in cime under utils/perl5lib though.

ekluzek · 2023-01-09T03:43:07Z

Some recent developments make me want to bring this up again. Two things to note is param-dev and recent work on SLIM. Alper's param-dev work has gone into the latest CAM, which means it's available in cime and we could take advantage of it as well. In SLIM I have a PR that replaces the perl based build-namelist for SLIM with a simple python based one. That work would make it easier for me to do this replacement.

ekluzek · 2023-09-13T13:49:11Z

#2126 and some work I and @adrifoster was doing in the perl build-namelist "unit-tester" (it's not really a unit-tester, as it's not testing methods inside the perl, it's really a system tester of build-namelist) got me thinking about this again.

I was originally thinking that we should create a python script that just replicates the behavior of the perl build-namelist and then the perl "unit-tester" could still be used. However, I'm realizing that the perl "unit-tester" should also be replaced. And we probably don't want the python version to act the same way as the perl version.

But, an incremental way to go about this would be to do the following:

Move the buildnml into a module that can be unit and system tested under the cime_config directory (this would be similar to what I did in SLIM)
Have the python buildnml be able to be tested with the mechanisms under the python directory (this again is what I did in SLIM). This means we also have it tested with pylint (we already have it tested with black)
New functionality (such as in Have the FATES parameter file created at runtime in the buildnml #2126) would be added into the buildnml python portion rather than the perl build-namelist part.
Some targeted functionality from the perl could be moved over into the python buildnml part. (we would examine what could be done here without making the division between the perl and python parts too confusing)
Move more parts incrementally? Or should it be the next option?
Move everything in one big go?

It's steps 5 and 6 that I'm not sure about. Because doing it incrementally might maximize confusion on how to handle namelist work. But, doing it in one big go, is what prevents us from actually getting to this.

adrifoster · 2023-09-13T13:52:08Z

Perhaps we could move incrementally and have a script that calls both? In particular it would be great to be able to call individual tests - which I think is what you are planning?

…

On Wed, Sep 13, 2023 at 7:49 AM Erik Kluzek ***@***.***> wrote: #2126 <#2126> and some work I and @adrifoster <https://github.com/adrifoster> was doing in the perl build-namelist "unit-tester" (it's not really a unit-tester, as it's not testing methods inside the perl, it's really a system tester of build-namelist) got me thinking about this again. I was originally thinking that we should create a python script that just replicates the behavior of the perl build-namelist and then the perl "unit-tester" could still be used. However, I'm realizing that the perl "unit-tester" should also be replaced. And we probably don't want the python version to act the same way as the perl version. But, an incremental way to go about this would be to do the following: 1. Move the buildnml into a module that can be unit and system tested under the cime_config directory (this would be similar to what I did in SLIM) 2. Have the python buildnml be able to be tested with the mechanisms under the python directory (this again is what I did in SLIM). This means we also have it tested with pylint (we already have it tested with black) 3. New functionality (such as in #2126 <#2126>) would be added into the buildnml python portion rather than the perl build-namelist part. 4. Some targeted functionality from the perl could be moved over into the python buildnml part. (we would examine what could be done here without making the division between the perl and python parts too confusing) 5. Move more parts incrementally? Or should it be the next option? 6. Move everything in one big go? It's steps 5 and 6 that I'm not sure about. Because doing it incrementally might maximize confusion on how to handle namelist work. But, doing it in one big go, is what prevents us from actually getting to this. — Reply to this email directly, view it on GitHub <#585 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADE42IXX42HEMURL3ZYNOADX2G2WHANCNFSM4GH56ZLA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

billsacks · 2023-09-13T14:26:03Z

I feel we should do this incrementally; otherwise I fear it will never happen, and this is a big enough effort that trying to do it all at once will lead to the pain of trying to rewrite a moving target. I realize there will be some short-term confusion, but I think it's worth it. See #585 (comment) and other notes above referencing "incremental".

ekluzek · 2023-09-13T14:40:40Z

@adrifoster that is pretty much what @bilsacks proposed above in his comment (that now he responded to). I'm not sure what you mean about "individual tests"? Are you saying the ability to ask for which specific tests to run -- similar to what we do in create_test? I was really thinking of doing this the same way that we do our unit and system tests in the python directory. But, for the buildnml system tests I think you are right we should develop a system where you specify XML settings for the system tests, so you could cook up any kind of test that is desired. Maybe we then do something like testmods to bundle different types of system tests? I think I like that. That was definitely a problem with the perl testing it really needed a way to have potential tests done in a general way, or have a list of tests that you could do....

I thought I recalled that @billsacks decided later that would be bad to do as it would maximize confusion on how to handle namelist things. But, from his response above it sounds like he agrees since he's right that it won't happen unless we do it incrementally.

Although maybe that's a feature rather than a bug? Because if we get far enough in the "we have this both in python and perl world" -- the pain of being there might incentivize us to move solely into the python space? And we would pay the cost of getting it done in one big effort?

adrifoster · 2023-09-13T14:43:56Z

@adrifoster that is pretty much what @bilsacks proposed above in his comment (that now he responded to). I'm not sure what you mean about "individual tests"?

I guess I am thinking about yesterday when we had to run the whole test script over and over again (which takes a while) just to investigate what was going on with two very small tests. So if there was a way you could only call those tests that would be really helpful. Even if it was in some kind of interactive use of python?

ekluzek · 2024-04-30T23:16:04Z

This should likely be done based on ParamGen which would help with issues: #26, #569 and #454

billsacks added the code health improving internal code structure to make easier to maintain (sustainability) label Dec 3, 2018

billsacks assigned ekluzek Dec 3, 2018

billsacks self-assigned this Dec 4, 2018

billsacks removed their assignment Jan 9, 2019

billsacks assigned negin513 Mar 4, 2019

billsacks mentioned this issue Jan 14, 2020

Allow namelist defaults to apply for phys versions greater than or less than a given version #454

Open

billsacks added the blocker another issue/PR depends on this one label Jan 14, 2020

billsacks added the size: large Large project that will take a few weeks or more label Mar 9, 2020

billsacks mentioned this issue Nov 15, 2021

In perl code replace glob with bsd_glob #1550

Closed

ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Jan 9, 2023

ekluzek unassigned negin513 Jan 9, 2023

ekluzek mentioned this issue Jan 9, 2023

stream data input now set in new config_component.xml variables #1929

Closed

billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Mar 2, 2023

ekluzek mentioned this issue Sep 13, 2023

Have the FATES parameter file created at runtime in the buildnml #2126

Open

ekluzek mentioned this issue Sep 22, 2023

Build-namelist logic to find the finidat file is fragile and needs to be redone #2169

Open

6 tasks

ekluzek mentioned this issue Mar 14, 2024

perl script configure needs to be replaced ESCOMP/CAM#993

Open

This was referenced Apr 30, 2024

Organize non-default options into for-testing physics options #569

Open

add ability to specify namelist requirements via xml #26

Open

ekluzek changed the title ~~Rewrite CLMBuildNamelist.pm in python~~ Move CLMBuildNamelist.pm perl code into the python buildnml Apr 30, 2024

ekluzek mentioned this issue Apr 30, 2024

Fix defaults in clm6_0 physics #2492

Closed

ekluzek mentioned this issue Jul 31, 2024

Catch use of namelist attribute in failure testing in perl build-namelist tester #2673

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move CLMBuildNamelist.pm perl code into the python buildnml #585

Move CLMBuildNamelist.pm perl code into the python buildnml #585

billsacks commented Dec 3, 2018

billsacks commented Dec 3, 2018

billsacks commented Dec 4, 2018

ekluzek commented Dec 4, 2018

billsacks commented Dec 4, 2018

billsacks commented Apr 16, 2019

billsacks commented Jul 8, 2019

billsacks commented Oct 8, 2019

billsacks commented Jan 14, 2020

billsacks commented Sep 30, 2020

wwieder commented Sep 30, 2020 via email

billsacks commented Sep 30, 2020

wwieder commented Sep 30, 2020 via email

ekluzek commented Sep 30, 2020

jedwards4b commented Nov 15, 2021

billsacks commented Nov 15, 2021

ekluzek commented Dec 7, 2021

ekluzek commented Jan 9, 2023

ekluzek commented Sep 13, 2023

adrifoster commented Sep 13, 2023 via email

billsacks commented Sep 13, 2023

ekluzek commented Sep 13, 2023

adrifoster commented Sep 13, 2023

ekluzek commented Apr 30, 2024

Move CLMBuildNamelist.pm perl code into the python buildnml #585

Move CLMBuildNamelist.pm perl code into the python buildnml #585

Comments

billsacks commented Dec 3, 2018

billsacks commented Dec 3, 2018

billsacks commented Dec 4, 2018

ekluzek commented Dec 4, 2018

billsacks commented Dec 4, 2018

billsacks commented Apr 16, 2019

billsacks commented Jul 8, 2019

billsacks commented Oct 8, 2019

billsacks commented Jan 14, 2020

billsacks commented Sep 30, 2020

wwieder commented Sep 30, 2020 via email

billsacks commented Sep 30, 2020

wwieder commented Sep 30, 2020 via email

ekluzek commented Sep 30, 2020

jedwards4b commented Nov 15, 2021

billsacks commented Nov 15, 2021

ekluzek commented Dec 7, 2021

ekluzek commented Jan 9, 2023

ekluzek commented Sep 13, 2023

adrifoster commented Sep 13, 2023 via email

billsacks commented Sep 13, 2023

ekluzek commented Sep 13, 2023

adrifoster commented Sep 13, 2023

ekluzek commented Apr 30, 2024