Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefix all options with NETCDF_ #2895

Merged
merged 65 commits into from
Mar 20, 2024
Merged

Prefix all options with NETCDF_ #2895

merged 65 commits into from
Mar 20, 2024

Conversation

K20shores
Copy link
Contributor

Essentially I found all options that cmake defines:

grep -r "option(" . | sed -n -e 's/.*option(\([^ ]*\).*/\1/p' > variables.txt

Also replaced

  • NC_USE_STATIC_CRT -> NETCDF_USE_STATIC_CRT
  • NC_FIND_SHARED_LIBS -> NETCDF_FIND_SHARED_LIBS

Then removed some that shouldn't be modified, or needed to be modified different from others.

Then ran this script

#!/bin/bash

while IFS= read -r -u 3 var
do
    prefixed_var="NETCDF_$var"
    echo "Replacing $var with $prefixed_var..."
    grep --exclude=variables.txt -rl "$var" . | xargs sed -i '' "s/$var/$prefixed_var/g"
    echo "Press enter to continue to the next variable..."
    read
    git add .
    git commit -m "Replace $var with $prefixed_var"
done 3< variables.txt

In general it worked, but I had to go through each commit by hand to remove the script file from git.

@LecrisUT
Copy link

@LecrisUT forgive me, but I don't quite understand the utility of that. This PR can be squashed on merge to reduce the number of commits if that's what you're after.

It's for the reviewer, because on the files changed tab, we can select specific commits to view, and basically all of the renaming changes in non-CMakeLists.txt files are trivial.

@WardF
Copy link
Member

WardF commented Mar 19, 2024

This looks good; there's a change I'm considering, although I'd be happy to make it (simple string replace, it's no problem). We typically abbreviate netCDF as 'NC'. I'm not sure if replacing NETCDF_ with NC_ would violate some best practices, or cause a known namespace collision. I'm soliciting feedback from anybody who would care to give some. Thanks!

WardF
WardF previously approved these changes Mar 19, 2024
@DennisHeimbigner
Copy link
Collaborator

BTW, remember that many of the current option names are used in the C code as well via config.h.
So renaming those options will either require config.h changes or maintaining the original name
in cmake.

@LecrisUT
Copy link

I'm not sure if replacing NETCDF_ with NC_ would violate some best practices, or cause a known namespace collision.

IMO, only 2 letters is a bit risky w.r.t. namespace collision.

I would say, that within this PR or release preparation, it is good to clean up the options, e.g. ENABLE_NETCDF4, ENABLE_NETCDF_4, ENABLE_HDF5

BTW, remember that many of the current option names are used in the C code as well via config.h.
So renaming those options will either require config.h changes or maintaining the original name
in cmake.

That is trivial. The usual design is to have a if (NETCDF_ENABLE_NCZARR) -> set(ENABLE_NCZARR) (local variable). I put those in a mock_autotools.cmake file. But I think the find-and-replace was so aggressive that it covered autotools and source files as well?

@DennisHeimbigner
Copy link
Collaborator

When you make changes to the CMAKE options, please look at the same option
in configure.ac. There is documentation in configure.ac that is not (currently)
in CMakeLists.txt.

@edwardhartnett
Copy link
Contributor

This will break hundreds of workflows around the world. Is there a really good reason to do that?

NetCDF options are complex, compared to most other science libraries. Most institutions do not have a great handle on the netCDF build system - they figured out the options that worked, and they coded them into shell scripts, etc.

Now all the options will change name? Everyone will have to drop what they want to do, and once again spend time figuring out the netCDF build. What is the benefit that will accrue from all that work?

For example, has anyone communicated with the spack maintainers about this?

@LecrisUT
Copy link

@K20shores, btw, there are also the set(CACHE) commands that need to be adapted, e.g. DEFAULT_CHUNK_SIZE.

There is documentation in configure.ac that is not (currently) in CMakeLists.txt.

Good point. I see that all NETCDF4 options should be removed during this refactor.

netcdf-c/configure.ac

Lines 137 to 145 in 7babdcb

AC_ARG_ENABLE([netcdf4], [AS_HELP_STRING([--disable-netcdf4],
[(Deprecated) Synonym for --enable-hdf5)])])
test "x$enable_netcdf4" = xno || enable_netcdf4=yes
AC_MSG_RESULT([$enable_netcdf4 (Deprecated) Please use --disable-hdf5)])
AC_MSG_CHECKING([whether we should build with netcdf-4 (alias for HDF5)])
AC_ARG_ENABLE([netcdf-4], [AS_HELP_STRING([--disable-netcdf-4],
[(synonym for --disable-netcdf4)])])
test "x$enable_netcdf_4" = xno || enable_netcdf_4=yes
AC_MSG_RESULT([$enable_netcdf_4])

if (DEFINED $CACHE{ENABLE_NETCDF4})
message(WARNING ...)

(the usage of WARNING is preferred there because of name-clashing)

This will break hundreds of workflows around the world. Is there a really good reason to do that?

Yes, because it limits the potential of the project to be used by others via add_subdirectory, FetchContent, etc., case-in-point the ENABLE_NETCDF4 above.

Another issue is many of the options were unnecessary in the first place, and going through them here will make it easier to comunicate what the options should be designed as independent of the autotools nomenclature and constraints. E.g. most of the ENABLE_* can be replaced with find_package (without REQUIRED) and the user can disable those via built-in CMAKE_DISABLE_FIND_PACKAGE_*.

Most institutions do not have a great handle on the netCDF build system

The proper way to address this is to provide presets that will handle most cases, e.g. the sets of plugin options. Than the user would just call something like

$ cmake --preset default-with-plugin -DNETCDF_ENABLE_NCZARR_FILTERS=OFF --install-prefix=...

Also note that it is possible to maintain compatibility as shown in the message(WARNING) section above, where the packagers should™️ pay attention to and adapt.

@K20shores
Copy link
Contributor Author

This will break hundreds of workflows around the world. Is there a really good reason to do that?

NetCDF options are complex, compared to most other science libraries. Most institutions do not have a great handle on the netCDF build system - they figured out the options that worked, and they coded them into shell scripts, etc.

Now all the options will change name? Everyone will have to drop what they want to do, and once again spend time figuring out the netCDF build. What is the benefit that will accrue from all that work?

For example, has anyone communicated with the spack maintainers about this?

That's an understandable concern. But, as a leading scientific package, netcdf should confirm to good best practices, and providing build options that don't conflict with downstream packages seems to be a good goal. Further, this makes it safer to include netcdf with cmake's fetch content functionality, meaning that there is yet another way to include netcdf in a package.

As far as breaking builds, that seems to be unavoidable pain. But it isn't that complicated to fix. Cmake will warn of unused options (which would be all of them). Any sort of grepish command will reveal that the option to be updated only needs NETCDF_ preppended, or to change NC_USE_STATIC_CRT to NETCDF_USE_STATIC_CRT and NC_FIND_SHARED_LIBS to NETCDF_FIND_SHARED_LIBS. Further, only one option is removed NETCDF_ENABLE_NETCDF4, which will issue an error if used and tells the user to use NETCDF_ENABLE_NETCDF_4 instead. Issues, yes, but surmountable.

A further mitigation step could be to include this PR only after the 4.9.3 release and then release 4.10.0 with the only change being what's included in this PR. Those release notes would make it clear that build options are changing for any future version of netcdf.

As for spack, if these changes are deemed acceptable, I could make a draft PR into spack that would accommodate these changes and publish it when a release of netcdf with these changes goes out.

Of course, the option to abandon this PR is on the table. The technical answer is that it's not necessary, but it is nice to have these changes, from my perspective. However, the downstream changes would definitely be annoying

@LecrisUT
Copy link

As for spack, if these changes are deemed acceptable, I could make a draft PR into spack that would accommodate these changes and publish it when a release of netcdf with these changes goes out.

Also, speaking as a Fedora packager, I am looking forward to these changes

@WardF
Copy link
Member

WardF commented Mar 19, 2024

For what it's worth, I've got a work-around that will let us deprecate existing options without otherwise breaking the build. I'll push the fix to this branch. I have no doubt that it will be a blunt hammer approach, but that's why these contributions from experts are so vital and appreciated :).

I'll push something today or tomorrow, I need to construct the list of deprecated options and do some testing, but so far it appears to be working for the handful of options I've tested.

Ignore the various 'message' lines that I've been using for debugging as I step through, but this gives a rough idea of the approach I'm taking. This is a new file, cmake/deprecated.cmake, which is included near the top of the root-level CMakeLists.txt file:

function(check_dep_arg arg)

    if(DEFINED ${arg})
        message(STATUS "arg: ${arg} - ${${arg}}")
        set(val ${${arg}})
        MESSAGE("val: ${val}")
        message(WARNING "${arg} is deprecated and will be removed. Please use NETCDF_${arg} in the future")
        set(NETCDF_${arg} ${val} CACHE INTERNAL "" FORCE)
        MESSAGE(STATUS "Setting NETCDF_${arg} to ${val}")
        MESSAGE(STATUS "Result: NETCDF_${arg} is ${NETCDF_${arg}}")
    endif() 
endfunction()

MESSAGE(STATUS "Checking for Deprecated Options")
check_dep_arg(ENABLE_DAP)
check_dep_arg(ENABLE_DAP2)
check_dep_arg(ENABLE_DAP4)

(....)

@DennisHeimbigner
Copy link
Collaborator

Good fix. We still need to deal with the config.h problem.

@WardF
Copy link
Member

WardF commented Mar 19, 2024

BTW, remember that many of the current option names are used in the C code as well via config.h. So renaming those options will either require config.h changes or maintaining the original name in cmake.

Taking a quick look, it appears that this has already been addressed. Re: parity between configure and cmake options; as long as we can keep parity via naming convention, e.g. option names are the same but the cmake build system requires NETCDF_ preface for the many reasons outlined above, that feels ok to me. The prefix is not compatible with the autotools-based build system nomenclature anyways, and it would be difficult to shoehorn it in.

Allowing netCDF to be included trivially in other projects is a fantastic strength that will provide a lot of direct and indirect benefits. It also makes dropping autotools a realistic option, down the line. Well, more realistic at any rate. It won't happen anytime soon, however.

I'll push my changes into this branch tomorrow at the latest, I need to run down the list of options that were changed, clean up debugging messages, etc. etc.

cmake/deprecated.cmake Outdated Show resolved Hide resolved
cmake/deprecated.cmake Outdated Show resolved Hide resolved
cmake/deprecated.cmake Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
cmake/deprecated.cmake Show resolved Hide resolved
@WardF
Copy link
Member

WardF commented Mar 20, 2024

Thank you for the comments @LecrisUT; you have referenced a fair bit of cmake functionality I am unfamiliar with, and I will take a look in the morning. We need to get v4.9.3 out in short order, so I will have to prioritize what time I have with an eye towards roadblocks to the release. I will double check the use of FORCE in the set; my initial testing with PARENT_SCOPE failed to work, hence the 'debug using MESSAGE' approach that crept in. I'll take a look at that tomorrow and see if perhaps I had been invoking it incorrectly. Thanks again, I'm sure the straightforward changes will find their way in shortly, with the rest to follow. I am very grateful for the help you, @K20shores and @ZedThree have been in providing these improvements!

@WardF WardF merged commit 37e6b92 into Unidata:main Mar 20, 2024
103 checks passed
@DennisHeimbigner
Copy link
Collaborator

THis PR has a big error in it.
You modified some but not all of the AC_DEFINEs in configure.ac
to change AC_DEFINE([ENABLE_XXX] ...) to AC_DEFINE([NETCDF_ENABLE_XXX]...).
THis was unnecessary since the C preprocessor flags are not visible outside
the netcdf-c code.
Ditto for AC_CONDITIONAL and AC_SUBST (and maybe others).

@DennisHeimbigner
Copy link
Collaborator

Also, the reversion needs to occur in config.h.in and config.h.cmake.in

@DennisHeimbigner
Copy link
Collaborator

Also in all the Makefile.am files.

@DennisHeimbigner
Copy link
Collaborator

Also lib_flags.am

@DennisHeimbigner
Copy link
Collaborator

I really wish you had not this. It was unnecessary.
It changes every code file we have. It completely breaks the chain
of diffs. Ugh!

@K20shores K20shores deleted the prefix_options branch March 22, 2024 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants