Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to subset multiple files in a single call #40

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

MiCurry
Copy link
Contributor

@MiCurry MiCurry commented Mar 15, 2020

These commits contain a number of changes that allow the MPAS-Limited-Area program to subset multiple mesh files in a single call. The usage of create_region remains the same, except that any number of mesh files can be given.

If multiple files are given, then the first mesh file must contain the needed mesh connectivity fields. The first mesh file's connectivity fields will be used to subset the remaining files. If the first file does not contain mesh connectivity fields, then an error will occur.

This commit adds a new method to the MeshHandler that checks if a loaded mesh
contains mesh connectivity dimensions or variables. This will enable other
commits to subset multiple MPAS files from a single grid.
@MiCurry MiCurry requested a review from mgduda March 15, 2020 16:49
@MiCurry
Copy link
Contributor Author

MiCurry commented Mar 15, 2020

@mgduda, here are the changes that allow create_region to subset multiple mesh files, with the first containing mesh connectivity information. Would you be able to give it a review and run it through some tests when its convenient for you?

There are a number of commits, but most are all well self contained and small and straightforward. So hopefully you don't have much trouble understanding whats being changed, but if you do please feel free to ask.

I think the only change here that I have questions on is the file name prepending. Since I've updated the create_region_name to just prepend and not figure out the filename, if one creates a subset of an MPAS file that contains nCells (i.e. x1.10242.grid.nc, x1.10242.static.nc, etc.) the region name would become: conus.x1.10242.grid.nc, which gives the impression that conus.x1.10242.grid.nc has 10242 cells.

@ldfowler58, here is a primarily set up changes that allow MPAS-Limited-Area program to subset multiple grids and those without mesh connectivity information. If you'd like, it would great if you can give it a test and report back with any problems or suggestions, but I'd hold off on using it for any hard science before it gets fully reviewed and released.

@ldfowler58
Copy link

ldfowler58 commented Mar 15, 2020 via email

@MiCurry
Copy link
Contributor Author

MiCurry commented Mar 31, 2020

@ldfowler58 I have just pushed an updated to this branch with changes that fixed the memory problem that you were having with your meshes when trying to subset more than one mesh at a single time.

I have tested created a conus.custom.pts region with your mesh files on Cheyenne and had success. If you have the time and if its convenient, please feel free to give this branch another test. I'd greatly appreciate it!

And @mgduda did you want to give this PR a review and a test? If not, I'm happy to review it and merge it myself.

@MiCurry MiCurry marked this pull request as ready for review March 31, 2020 20:43
MiCurry added 9 commits July 30, 2020 16:13
This commit updates the argparser of the create_region script to take in
multiple files rather than just one.
Before this commit, when a new MeshHandler instance was initialized, and was
successfully loaded, MeshHandler._load_vars() was called automatically. This
automatic call needs to be removed so that meshes that do not contain
connectivity information do not try to load variables they do not contain.
Update _load_vars to be public, so it can be called from other instances.
This commit updates the LimitedArea class to receive multiple mesh files;
however, it only will subset the first mesh passed to it (assuming it has mesh
connectivity fields). This commit does not enable the ability to subset
multiple grids, but a future commit can later implement that feature.
With this commit, LimitedArea no longer tries to determine the 'type' of a mesh
file. It simply prepends the specified region name to the mesh filename.
This commit adds the ability to MeshHandler.subset_mesh to use fields of
another MeshHandler instance. This allows a mesh, that perhaps does not have
mesh connectivity information, to be subsetted by another.
This change allows the limitedArea.gen_region to subset multiple files.
The code removed in this commit was in place to try to achieve efficiency by
pre-loading the memory of the file before loading it with netCDF4 Dataset. With
the ability to subset multiple meshes in one call, this would cause a bus error
if the meshes were significantly large, as each mesh would be loaded at once,
rather than when it was needed.

At the end of `LimitedArea.gen_region()`, after the call to
`Mesh.subset_fields()`, the created mesh region file AND the mesh that was used
to created are both closed. This ensures that only one mesh (and its regional
subset) are loaded at once.
print("ERROR: Mesh file was not found", mesh_file)
sys.exit(-1)

if len(files) == 1:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add a bit more comments here, just short one liners.

sys.exit(-1)
self.meshes.append(self.mesh)

else:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new comment here as well

else:
print("ERROR:", self.mesh.fname, "did not contain needed mesh connectivity information")
print("ERROR: The first file must contain mesh connectivity information")
sys.exit(-1)
Copy link
Contributor Author

@MiCurry MiCurry Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this error worthwhile? And is the error message correct? This if statement doesn't seem correct.

sys.exit(-1)

for mesh in files:
self.meshes.append(MeshHandler(mesh, 'r', *args, **kwargs))
Copy link
Contributor Author

@MiCurry MiCurry Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved to replace line 89?

@@ -80,6 +110,7 @@ def gen_region(self, *args, **kwargs):

# For each mesh, create a regional mesh and save it
print('\n')
# TODO: Update this print statement to be more consistent to what is happening
print('Creating a regional mesh of ', self.mesh.fname)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO should be taken care of. Should this print statement be moved?

# Create subsets of all the meshes in self.meshes
print('Subsetting meshes...')
for mesh in self.meshes:
print("\nSubsetting:", mesh.fname)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change this print message to say print("Subsetting:", mesh.fname, "using", mesh.fname).

self.mesh.copy_global_attributes(regionalMesh)
# Create subsets of all the meshes in self.meshes
print('Subsetting meshes...')
for mesh in self.meshes:
Copy link
Contributor Author

@MiCurry MiCurry Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop is a bit confusing. mesh from the loop is confusing my brain with self.mesh (the file used to subset self.meshes).

bdyMaskVertex,
self.INSIDE,
self.UNMARKED,
self.mesh,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add a note that self.mesh is being used to subset mesh.

**kwargs)
print("Copying global attributes... ")
regionalMesh.copy_global_attributes(self.mesh)
print("Create a regional mesh:", regionFname)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create -> Created

glbBdyVertexIDs = self.indexToVertexIDs[np.where(bdyMaskVertex != unmarked)] - 1
glbBdyCellIDs = mesh.indexToCellIDs[np.where(bdyMaskCell != unmarked)] - 1
glbBdyEdgeIDs = mesh.indexToEdgeIDs[np.where(bdyMaskEdge != unmarked)] - 1
glbBdyVertexIDs = mesh.indexToVertexIDs[np.where(bdyMaskVertex != unmarked)] - 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix this white space, they aren't necessary

region.mesh.createVariable('bdyMaskCell', 'i4', ('nCells',))
region.mesh.createVariable('bdyMaskEdge', 'i4', ('nEdges',))
region.mesh.createVariable('bdyMaskVertex', 'i4', ('nVertices'))
if self.check_grid():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check necessary? Hasn't the MeshHandler instance in self already been checked by check_grid() before subset_grid is called?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants