Crash when running cam_dev with SILHS #844

Open
bstephens82 opened this issue Jun 21, 2023 · 5 comments
Labels
bug Something isn't working correctly CoupledEval4 -wish list

Comments

@bstephens82
Collaborator

bstephens82 commented Jun 21, 2023

What happened?

CAM crashes when run with -silhs and -phys cam_dev. The immediate error comes from the subcol_field_avg_2dr subroutine, reached through the subcol_field_avg interface in subcol.F90. The traceback indicates that the problem originates in cam_dev/micro_pumas_cam.F90, specifically where subcol_field_avg is called on fields of the proc_rates derived type. This derived type is not present in cam/micro_pumas_cam.F90, which I think explains why -phys cam6 runs are unaffected. I think the problem is that proc_rates is allocated with ncol as its first dimension, whereas the averaging subroutine requires the first dimension to be pcols*psubcols. Not being familiar with the code, I don't see an obvious right solution. I tried simply allocating proc_rates with psetcols instead, but this produced a bunch of NaNs in CLUBB and the model crashed.

What are the steps to reproduce the bug?

Set up a new case and set CAM_CONFIG_OPTS="-phys cam_dev -silhs"; I think that is all it should take.

What CAM tag were you using?

cam6_3_114

What machine were you running CAM on?

CISL machine (e.g. cheyenne)

What compiler were you using?

Intel

Path to a case directory, if applicable

/glade/scratch/stepheba/silhstest_camdev

Will you be addressing this bug yourself?

Any CAM SE can do this

Extra info

No response

@bstephens82 bstephens82 added the bug Something isn't working correctly label Jun 21, 2023
@cacraigucar
Collaborator

@bstephens82 - Do you know if this is still a problem?

@bstephens82
Collaborator Author

> @bstephens82 - Do you know if this is still a problem?

I would be surprised if it were not. It seemed like something that would take some effort to resolve, and I have not worked on it myself. Should I check whether it's still an issue?

@bstephens82
Collaborator Author

bstephens82 commented Apr 19, 2024

@cacraigucar I know it's been a while, haha, but I was thinking about this issue recently and gave it a shot today. Just FYI, I find the same problem when running SILHS with cam_dev on the cam6_3_155 tag. I'd be happy to take another look and see if I can think of a fix when I have some time here and there.

@bstephens82
Collaborator Author

bstephens82 commented Apr 22, 2024

I think I understand a bit more about the problem now. The averaging subroutine linked in the first post of this issue requires the incoming array to have a first dimension equal to pcols*psubcols, and CAM crashes if it does not. At the top of the micro_pumas_cam_tend subroutine in cam_dev/micro_pumas_cam.F90, the proc_rates derived type (which I am guessing was introduced with cam_dev, since it doesn't appear in the cam6 version, though it may have been introduced later) is allocated as

   call proc_rates%allocate(ncol, nlev, ncd, micro_mg_warm_rain, errstring)

where ncol = state%ncol. Now, ncol is usually equal to pcols*psubcols, at least in this subroutine, but because ncol is not fixed over the course of a CAM test, it is sometimes not equal to pcols*psubcols. The fields contained in the proc_rates derived type are therefore not always compatible with the subcolumn averaging subroutines. I will continue looking at the code to see if there's a neat way to resolve this.
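
To make the mismatch concrete, here is a toy model of it in Python rather than CAM's Fortran. It is only a sketch: `average_subcolumns`, `allocate_rate`, the sizes, and the fixed-stride subcolumn layout are all hypothetical stand-ins, not the actual subcol.F90 or proc_rates code.

```python
# Toy model of the dimension check; all names and sizes here are hypothetical.
PCOLS = 4      # assumed maximum number of grid columns in a chunk
PSUBCOLS = 3   # assumed maximum number of subcolumns per grid column


def average_subcolumns(field, ngrdcol):
    """Average each grid column's subcolumns for a single model level.

    Assumes a fixed layout in which grid column i owns slots
    i*PSUBCOLS .. i*PSUBCOLS + PSUBCOLS - 1 of `field`.
    """
    expected = PCOLS * PSUBCOLS
    if len(field) != expected:
        # Stands in for the abort triggered by the averaging routine.
        raise ValueError(
            f"first dimension is {len(field)}, expected {expected}"
        )
    return [
        sum(field[i * PSUBCOLS:(i + 1) * PSUBCOLS]) / PSUBCOLS
        for i in range(ngrdcol)
    ]


def allocate_rate(ncol):
    """Stand-in for allocating one proc_rates field with ncol entries."""
    return [0.0] * ncol


if __name__ == "__main__":
    # Accepted when ncol happens to equal PCOLS * PSUBCOLS.
    print(average_subcolumns(allocate_rate(PCOLS * PSUBCOLS), PCOLS))
    # Rejected when ncol is smaller (10 instead of 12 here).
    try:
        average_subcolumns(allocate_rate(10), 3)
    except ValueError as err:
        print("rejected:", err)
```

The point is only that a field sized by ncol passes the check on some calls and fails on others, which matches the intermittent behavior described above.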

@cacraigucar
Collaborator

cacraigucar commented Apr 22, 2024

> I think I understand a bit more about the problem now. The averaging subroutine linked in the first post of this issue requires that the incoming array have first dimension equal to pcols*psubcols, and CAM crashes if it's not equal. At the top of the micro_pumas_cam_tend subroutine in cam_dev/micro_pumas_cam.F90, we find that the proc_rates derived type (which I am guessing may have been introduced with cam_dev since it doesn't appear in the cam6 version, but maybe it was introduced later) is allocated as
>
>    call proc_rates%allocate(ncol, nlev, ncd, micro_mg_warm_rain, errstring)
>
> where ncol = state%ncol. Now, ncol is usually equal to pcols*psubcols, at least in this subroutine, but because ncol is not fixed over the course of a CAM test, there can be times when it is not equal to pcols*psubcols, hence the fields contained within the proc_rates derived type are not always compatible with the subcolumn averaging subroutines. I will continue looking at the code to see if there's a nice and neat way to resolve this.

I can say with absolute certainty that the allocation step is correct, and that the problem is in how the SILHS code handles this new structure. I believe SILHS was never tested with this new DDT structure and hence is not enabled for it.

The PUMAS DDT was introduced in #632. It was added only for mg3 and not for earlier versions, which is why /cam/micro_pumas_cam.F90 was not updated. There is extensive discussion of this new feature in that PR and its associated issues.

The difficulty with the DDT was that its elements need to have a first dimension of exactly state%ncol when they are used in PUMAS, but in some cases they must map into the CAM model's psubcols*pcols structure. This took careful bookkeeping in the micro_pumas_cam.F90 code.

It is quite possible that one or more SILHS routines need their own version in /cam_dev, and that the changes made to micro_pumas_cam.F90 also need to be made in the SILHS code. I would suggest that you look at the changes made to cam_dev/micro_pumas_cam.F90 in PR #632 and determine what corresponding changes need to be made to the cam_dev/SILHS code.
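
As a rough picture of the kind of bookkeeping described above, the following Python sketch copies an ncol-sized field into a buffer with the full pcols*psubcols first dimension before it is handed to a routine that expects that size. The function name, sizes, and fill value are assumptions made for illustration; this is not the code from PR #632, and whether a padded copy is the right approach for SILHS is not established in this thread.

```python
# Hypothetical illustration of mapping an ncol-sized field into a
# pcols*psubcols-sized buffer; not taken from CAM.
PCOLS = 4      # assumed maximum number of grid columns in a chunk
PSUBCOLS = 3   # assumed maximum number of subcolumns per grid column


def pad_to_subcol_layout(field, ncol, fill=0.0):
    """Return a copy of `field` padded to PCOLS * PSUBCOLS entries.

    The first ncol entries are the original values; the remainder is `fill`.
    """
    total = PCOLS * PSUBCOLS
    if len(field) != ncol:
        raise ValueError(f"field has {len(field)} entries, expected ncol={ncol}")
    if ncol > total:
        raise ValueError(f"ncol={ncol} exceeds PCOLS*PSUBCOLS={total}")
    padded = [fill] * total
    padded[:ncol] = field   # active columns keep their values
    return padded


if __name__ == "__main__":
    rates = [0.5] * 10                  # field sized by ncol = 10
    buffer = pad_to_subcol_layout(rates, ncol=10)
    print(len(rates), len(buffer))      # the original stays ncol-sized
```

In this sketch the ncol-sized field stays untouched for the microphysics side, and only the padded copy would be passed to a routine with the fixed-size requirement.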

Projects
Status: To Do
Development

No branches or pull requests

3 participants