CI failures due to missing MPI compilers? #419

Closed
edwardhartnett opened this issue Mar 17, 2021 · 24 comments · Fixed by #427

@edwardhartnett
Collaborator

@kgerheiser any idea what is going on here? CI seems to be messed up.

-- Could NOT find MPI_Fortran (missing: MPI_Fortran_WORKS)
CMake Error at /usr/local/Cellar/cmake/3.19.6/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:218 (message):
  Could NOT find MPI (missing: MPI_C_FOUND MPI_Fortran_FOUND)

      Reason given by package: MPI component 'CXX' was requested, but language CXX is not enabled.

-- Configuring incomplete, errors occurred!

@kgerheiser
Contributor

kgerheiser commented Mar 17, 2021

Since we do find_package(MPI REQUIRED) without specifying the language, I think it's looking for MPI C++ which isn't being found because we don't have that language enabled in the project(). Specifying find_package(MPI REQUIRED C Fortran) might fix it. I wonder if CMake was updated on the runner and this is a new error message?

That didn't work.
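
For anyone following along, the change described above might look roughly like the sketch below. The project name `example_project` and the minimum CMake version are placeholders, not taken from the UFS_UTILS build files. As noted, this on its own did not fix the CI failure here.

```cmake
# Hypothetical minimal CMakeLists.txt; the name and version are placeholders.
cmake_minimum_required(VERSION 3.15)

# Enable only the languages the project actually compiles.
project(example_project LANGUAGES C Fortran)

# Request the C and Fortran components explicitly instead of
# letting FindMPI consider every language it knows about.
find_package(MPI REQUIRED COMPONENTS C Fortran)

# Report which compiler wrappers FindMPI picked up.
message(STATUS "MPI C wrapper:       ${MPI_C_COMPILER}")
message(STATUS "MPI Fortran wrapper: ${MPI_Fortran_COMPILER}")
```

If the configure step still fails with `MPI_Fortran_WORKS` missing, the problem is more likely the MPI installation itself than the component list.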

@edwardhartnett
Collaborator Author

Well this is preventing any merges, so what shall we do here?

@edwardhartnett
Collaborator Author

@kgerheiser any ideas how to proceed?

@kgerheiser
Contributor

I tried enabling C++ in the project to see if that has any effect. It still fails. I guess it really can't find MPI for some reason. I'm not sure what the problem is.

@kgerheiser
Contributor

I see this warning from Homebrew when installing MPICH:

https://github.com/kgerheiser/UFS_UTILS/runs/2148896474?check_suite_focus=true#step:2:33

@edwardhartnett
Collaborator Author

@aerorahul any thoughts as to why our CI suddenly stopped working on macs?

We are getting this error from the UFS_UTILS CMake build:

CMake Error at /usr/local/Cellar/cmake/3.19.6/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:218 (message):
  Could NOT find MPI (missing: MPI_C_FOUND MPI_Fortran_FOUND)

      Reason given by package: MPI component 'CXX' was requested, but language CXX is not enabled.

Call Stack (most recent call first):
  /usr/local/Cellar/cmake/3.19.6/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:582 (_FPHSA_FAILURE_MESSAGE)
  /usr/local/Cellar/cmake/3.19.6/share/cmake/Modules/FindMPI.cmake:1722 (find_package_handle_standard_args)
  CMakeLists.txt:49 (find_package)


-- Configuring incomplete, errors occurred!
See also "/Users/runner/work/UFS_UTILS/UFS_UTILS/ufs_utils/build/CMakeFiles/CMakeOutput.log".
See also "/Users/runner/work/UFS_UTILS/UFS_UTILS/ufs_utils/build/CMakeFiles/CMakeError.log".
Error: Process completed with exit code 1.

@edwardhartnett
Collaborator Author

I added mpicc --version and mpifort --version commands before we attempt to build, and both are present on the machine. So what is wrong with our CMake?

+ mpicc --version
+ mpifort --version
gcc-10 (Homebrew GCC 10.2.0_4) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

GNU Fortran (Homebrew GCC 10.2.0_4) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@aerorahul
Contributor

Really quickly:
find_package(MPI) searches for every language listed in the project's LANGUAGES.
It seems something is requesting the CXX MPI library, but it is not being searched for because the project does not enable CXX.
Two suggestions:

  1. Add CXX to the project LANGUAGES.
  2. Figure out which program is looking for it and whether it is really needed.
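
A minimal sketch of the first suggestion, assuming a placeholder project name and minimum CMake version (neither comes from UFS_UTILS). It enables CXX alongside C and Fortran so FindMPI can search for all three. Earlier in the thread, enabling C++ was reported not to fix the failure, so treat this as illustration only.

```cmake
# Hypothetical minimal CMakeLists.txt; the name and version are placeholders.
cmake_minimum_required(VERSION 3.15)

# Enable CXX in addition to C and Fortran.
project(example_project LANGUAGES C CXX Fortran)

# With no components listed, FindMPI considers each enabled language.
find_package(MPI REQUIRED)

# Per-language result variables set by FindMPI.
message(STATUS "MPI_C_FOUND=${MPI_C_FOUND}")
message(STATUS "MPI_CXX_FOUND=${MPI_CXX_FOUND}")
message(STATUS "MPI_Fortran_FOUND=${MPI_Fortran_FOUND}")
```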

@aerorahul
Contributor

Looks like @kgerheiser tried that.
I can try building on my mac today and let you know.

@kgerheiser
Contributor

@edwardhartnett I was going to try that, but it seems like MPI is working.

@aerorahul I tried enabling C++, but that doesn't fix it.

I'm installing OpenMPI instead of MPICH and we'll see if that works.

@kgerheiser
Contributor

Using OpenMPI instead of MPICH works.

https://github.com/kgerheiser/UFS_UTILS/actions/runs/668152558

@edwardhartnett
Collaborator Author

edwardhartnett commented Mar 19, 2021

OK, good that we got OpenMPI working. But we also need MPICH to work. ;-) (Just added an issue for this: #425.)

@kgerheiser
Contributor

Tried using --build-from-source for MPICH in Homebrew, but it still failed with the same error.

@edwardhartnett
Collaborator Author

Here's a workflow in which I build MPICH from source (without Homebrew being involved). Maybe use that? https://github.com/NCAR/ParallelIO/blob/master/.github/workflows/netcdf-4.7.4_hdf5-1.12.0_pnetcdf-12.2_ncint_mpich-3.3_asan.yml

@kgerheiser
Contributor

Trying the MPICH build now

@edwardhartnett
Collaborator Author

I note the more primitive methods of the PIO builds (which don't use matrices well within the GitHub YAML) have the advantage that even if one build fails, the others work. In the case of these builds, a failure on macOS causes all tests to fail. It's not something we should change, just interesting to note...

@kgerheiser
Contributor

Yeah, I've noticed that. I don't like how it cancels all jobs in the workflow (Ubuntu) if one of them fails.

@edwardhartnett
Collaborator Author

edwardhartnett commented Mar 22, 2021

On the other hand, it lends urgency. Nothing moves forward until everything is fixed. ;-)

@edwardhartnett
Collaborator Author

Did we ever figure out why the stock mpi didn't work?

@kgerheiser
Contributor

I'm not sure.

@edwardhartnett
Collaborator Author

What is the current state of play?

  • Did we add an OpenMPI-based test?
  • Did we add a test with MPICH built from scratch?
  • Did we remove the MPICH installed from the package manager?

Really, we need all of these to work. Also we need to make clear what version of MPICH we are building.

@kgerheiser
Contributor

MPICH is installed from source instead of Homebrew on macOS. I'm adding OpenMPI as a separate PR.

@edwardhartnett
Collaborator Author

OK, do we still have an Ubuntu build using MPICH from the package management system?

@kgerheiser
Contributor

Yes
