Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "make distcheck" to github action CI #844

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

wkliao
Copy link
Contributor

@wkliao wkliao commented Nov 2, 2022

No description provided.

@github-actions github-actions bot added the CI continuous integration label Nov 2, 2022
@wkliao
Copy link
Contributor Author

wkliao commented Nov 2, 2022

CI failed at

AC_CHECK_HEADER([papi.h],
[with_papi=-lpapi],
[AC_MSG_ERROR([Cannot find papi header required for Autoperf XC module])],
[])

which requires papi.h.

Shouldn't papi be a submodule?

@shanedsnyder
Copy link
Contributor

The APXC module is only intended for Cray XC systems for the darshan-runtime build, so we should avoid building it in CI probably.

For that reason, I think we need to keep the builds of darshan-runtime/darshan-util separate as before. Is it possible to run make distcheck independently for each darshan-runtime/darshan-util? Or does it have to be ran from the top-level of the repo as your changes have?

If make distcheck has to be run from the top-level, I think we need to invoke configure and make install twice, once for darshan-runtime (with APXC module disabled) and once for darshan-util (with all modules enabled).

@shanedsnyder
Copy link
Contributor

If make distcheck has to be run from the top-level, I think we need to invoke configure and make install twice, once for darshan-runtime (with APXC module disabled) and once for darshan-util (with all modules enabled).

Should have added more details, but you can disable portions of the build using --disable-darshan-runtime or --disable-darshan-util configure options.

@wkliao
Copy link
Contributor Author

wkliao commented Nov 2, 2022

I am seeing --enable-apxc-mod is used when configuring darshan-util.
Can this be removed from CI?

../configure --prefix=$DARSHAN_INSTALL_PATH --enable-apxc-mod --enable-apmpi-mod

../configure --prefix=$DARSHAN_INSTALL_PATH --enable-apxc-mod --enable-apmpi-mod

@tylerjereddy
Copy link
Collaborator

Can this be removed from CI?

I think the APXC stuff is used for darshan-util/pydarshan testing, so should probably remain. The diff seems pretty complicated here at the moment, I guess I was hoping we'd just be able to run that command in isolation without altering the workflow much.

@wkliao
Copy link
Contributor Author

wkliao commented Nov 2, 2022

Option --enable-apxc-mod in darshan-util/configure.ac is currently
used to check only for the file existence of darshan-apmpi-log-format.h.
It does not check papi.h.
The same option checks papi.h in darshan-runtime/configure.ac.
The current CI is inconsistent in that it does not use this option when
building darshan-runtime, but only darshan-util.

@carns
Copy link
Contributor

carns commented Nov 2, 2022

If I'm not mistaken, --enable-apxc-mod on the utils portion simply adds the ability to parse logs that have apxc data in them, so it probably just needs some definitions from the autoperf git submodule. It should be possible to configure/compile this part most anywhere, I think.

The runtime portion, on the other hand, has PAPI and likely some XC header dependencies because it has to actually be able to interact with system counters for runtime instrumentation. This should only be enabled on an actual Cray system.

@shanedsnyder
Copy link
Contributor

That's correct, Phil.

That's why the CI workflow builds darshan-runtime and darshan-util separately, since they require different options. Note that the existing CI workflow is not using the top-level configure at all, it's using configure located in darshan-runtime and darshan-util to build each component, respectively.

Easiest path forward would be to invoke make distcheck on each of those independently, if that's possible. (i.e., run make distcheck once for the "Install darshan-runtime" build and once for for the "Install darshan-util" build).

@wkliao
Copy link
Contributor Author

wkliao commented Nov 4, 2022

Can this be removed from CI?

I think the APXC stuff is used for darshan-util/pydarshan testing, so should probably remain. The diff seems pretty complicated here at the moment, I guess I was hoping we'd just be able to run that command in isolation without altering the workflow much.

Looks like papi.h is not used in darshan-util. In this case, I wonder why not making the option --enable-apxc-mod always on in darshan-util or automatically enabled/disable when the header file darshan-apxc-log-format.h from the autoperf submodule can/cannot be found.

@wkliao
Copy link
Contributor Author

wkliao commented Nov 4, 2022

Easiest path forward would be to invoke make distcheck on each of those independently, if that's possible. (i.e., run make distcheck once for the "Install darshan-runtime" build and once for for the "Install darshan-util" build).

Running make distcheck is to check the distribution, as a single package.
Running it in subdirectories if you intent to make separate distributions for darshan-runtime and darshan-util.

In the current design, if a user wants to build darshan to read autoperf data, but not write one, then 2 separate builds of runtime and util are required. Maybe this is a very rare case?

@shanedsnyder
Copy link
Contributor

Can you guys have a look and see if #853 seems okay, building off the changes and discussion here?

@wkliao wkliao force-pushed the distcheck branch 3 times, most recently from 62594bf to c23a680 Compare November 24, 2022 20:41
@wkliao
Copy link
Contributor Author

wkliao commented Dec 2, 2022

I made some changes to this PR, so we can run make dist and make distcheck at the root folder, instead of darshan-runtime and darshan_util separately. This makes sense to me if we intent to make releases containing both darshan-runtime and darshan-util, instead of two separate tar balls.

Other noticeable changes are

  1. to automatically detect the header and library files required by apmpi and apxc and thus to enable or disable those features,
  2. to add a new workflow yaml file .github/workflows/runtime_util.yml to run all tests at root folder

@shanedsnyder
Copy link
Contributor

Other noticeable changes are

1. to automatically detect the header and library files required by apmpi and apxc and thus to enable or disable those features,

This is problematic, particularly on the darshan-runtime side. We want AutoPerf instrumentation modules to be opt-in (i.e., disabled by default, optionally enabled at configure time). For releases, I have always included the AutoPerf instrumentation modules in the distribution, so checking for their existence is not indicative that users want them enabled. I'm less concerned about this issue for darshan-util, as the impact is less (for darshan-runtime, automatically enabling AutoPerf will result in a lot more information being collected by Darshan by default).

@wkliao
Copy link
Contributor Author

wkliao commented Jan 25, 2023

If AutoPerf is enabled at the configure time. can users turn it on and off at the run time (e.g. through setting an env variable)?

The reason I asked is if AutoPerf will always be included in the distribution, then I wonder why not just build it by default and let users to set the env variable to enable/disable at run time.

@shanedsnyder
Copy link
Contributor

If AutoPerf is enabled at the configure time. can users turn it on and off at the run time (e.g. through setting an env variable)?

Yes, users could turn it off at runtime if it was enabled at configure time as you mention. But, we don't want all users to have to deal with that extra explicit step -- our current stance is that AutoPerf support should be opt-in so that it isn't automatically enabled on Darshan deployments by default. APMPI module in particular will instrument basically all MPI communication calls, which we are apprehensive about being the default instrumentation method in production for overhead reasons.

@wkliao
Copy link
Contributor Author

wkliao commented Jan 27, 2023

If Darshan is always configured with AutoPerf feature, but by default does not collect the MPI communication activity unless users set an env variable to explicitly enable it, then wouldn't this be the best for the production systems?

@shanedsnyder
Copy link
Contributor

If Darshan is always configured with AutoPerf feature, but by default does not collect the MPI communication activity unless users set an env variable to explicitly enable it, then wouldn't this be the best for the production systems?

I suppose, but it would complicate our existing semantics which mostly just try to be simple. I'm sure users could be confused if they explicitly enable something at configure time, but then also have to always request it at runtime, too. The deployment scenario you describe is possible anyways, it would just require the installer to enable AutoPerf at configure time and to set some env variables (that are set when the Darshan module is loaded) that disable AutoPerf.

Ultimately, I don't think I want to change/complicate our existing semantics just to simplify these CI tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI continuous integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants