Skip to content

Conversation

astamm
Copy link

@astamm astamm commented Apr 17, 2025

This is useful for providing nlopt to the R community while satisfying CRAN requirements.

Specifically, CMakeLists.txt gains the option USE_R_COMPATIBILITY:

option (USE_R_COMPATIBILITY "compile with R compatibility functions only" OFF)

which, when turned ON, triggers the following definition:

if (USE_R_COMPATIBILITY)
  add_definitions(-DCRAN_COMPATIBILITY)
endif ()

The variable CRAN_COMPATIBILITY is then used in source files to prevent the use of functions forbidden by CRAN such as stderr, abort, printf.

This PR provides the necessary fixes for the C API. There are also known uses of std::cerr and std::cout notably in the STOGO implementation that could also be avoided using CRAN_COMPATIBILITY. It might later be good to make also the C++ API R-safe.

@astamm
Copy link
Author

astamm commented Apr 17, 2025

@aadler I noticed you forked nlopt as well last month. Just pinging you into this PR in case you had the same idea.

@astamm
Copy link
Author

astamm commented Apr 17, 2025

This PR should address #604.

@astamm
Copy link
Author

astamm commented Apr 22, 2025

Last commit brings a new FindR.cmake file to properly set notably R_INCLUDE_DIR so that, instead of disabling verbosity and error handling, we can replace the R-incompatible functions with their R-compatible counterparts (e.g. Rprintf(), REprintf(), Rf_error()).

@astamm
Copy link
Author

astamm commented Apr 22, 2025

So instead of a simple flag option of the type NLOPT_R = ON/OFF, the CMakeLists.txt now gains a variable NLOPT_R_BINDIR which is empty by default:

set (NLOPT_R_BINDIR "" CACHE STRING "Directory containing the R executable")

When set to the directory containing the R executable, it triggers:

if (NOT "${NLOPT_R_BINDIR}" STREQUAL "")
  find_package(R REQUIRED)
  target_include_directories(${nlopt_lib} PRIVATE ${R_INCLUDE_DIR})
  target_compile_definitions(${nlopt_lib} PRIVATE NLOPT_R)
endif ()

which makes all headers from R's C API available to nlopt code and defines NLOPT_R which can be used to switch from e.g. printf() to Rprintf() in the source code.

Copy link
Contributor

@aitap aitap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that the perfect is the enemy of the good, but since it seems that nlopt needs to go over its [f]printfs anyway, wouldn't it be better served by a layer of indirection? The optimisation methods would call an internal interface nlopt_log(opt, "format", args...), and the caller of the library would supply appropriate callbacks (such as Rprintf) to make sure that the user sees the output. The defaults for the callback (printf) could be disabled at compile time. This would avoid duplication. (Imagine if another interface needs to introduce an #ifndef NLOPT_SOME_OTHER_LANGUAGE in the same place for the same reason.)

Calling R[E]printf is not exactly safe, by the way. For example, on Rgui.exe, this checks for interrupts and would longjmp() out of the calling frame if an interrupt is found. This will at least leak memory, or wreak complete havoc if code ever becomes multi-threaded.

find_package(R REQUIRED)
target_include_directories(${nlopt_lib} PRIVATE ${R_INCLUDE_DIR})
target_compile_definitions(${nlopt_lib} PRIVATE NLOPT_R)
endif ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this link on Windows without an explicit reference to R.dll?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will double check but I am pretty sure I checked against win-builder machines. I will do it again to be sure.

else()
set(R_PATH_LOC "${R_HOME_TEXT}/lib")
MESSAGE(STATUS "R_PATH_LOC = ${R_PATH_LOC}")
find_library(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least on Linux, R defaults to --disable-R-shlib, so there won't be a libR.so in R.home('lib'). (But a distro-built R will usually be compiled with --enable-R-shlib because distros have a strong preference for dynamic linking.) This won't prevent you from linking and loading a shared library (because the dynamic loader will resolve the symbols from the ones exported by the R executable), but any example applications will fail to link.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this feedback. Do you know how I could improve the FindR.cmake file to circumvent this issue?

#ifndef NLOPT_R
#define ASRT(c) if (!(c)) { fprintf(stderr, "DIRECT assertion failure at " __FILE__ ":%d -- " #c "\n", __LINE__); exit(EXIT_FAILURE); }
#else
#define ASRT(c) if (!(c)) { Rf_error("DIRECT assertion failure at " __FILE__ ":%d -- " #c "\n", __LINE__); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rf_error would longjmp out of the call frame, bypassing any destructors owned by nlopt and your own destructor. Could there be a way to return an error code from the function instead of this? Still, leaking some memory is better than crashing the process.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I wanted to avoid the need to call for Rcpp or cpp11 functions within nlopt codebase because this would make a forced choice for R users to use one of them.

But I think it could make sense to force the use of cpp11 as I recall they carefully dealt with longjmps.

extern const char *nlopt_set_errmsg(nlopt_opt opt, const char *format, ...)
#ifdef __GNUC__
#ifndef NLOPT_R
__attribute__ ((format(printf, 2, 3)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is needed? R CMD check goes through the names of the imports of the package shared library and compares them to entries of tools:::so_symbol_names_table, it doesn't look at the attributes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not needed indeed.

@astamm
Copy link
Author

astamm commented Aug 5, 2025

I understand that the perfect is the enemy of the good, but since it seems that nlopt needs to go over its [f]printfs anyway, wouldn't it be better served by a layer of indirection? The optimisation methods would call an internal interface nlopt_log(opt, "format", args...), and the caller of the library would supply appropriate callbacks (such as Rprintf) to make sure that the user sees the output. The defaults for the callback (printf) could be disabled at compile time. This would avoid duplication. (Imagine if another interface needs to introduce an #ifndef NLOPT_SOME_OTHER_LANGUAGE in the same place for the same reason.)

I agree with you. However, there are already existing bindings for many languages. I do not know how these have been achieved in the past. A reflection on this together with a layer of indirection to accommodate any past and future languages (maybe Julia aficionados) could be a long-term project on which we could work as a small group.

On a shorter term, nloptr does not pass CRAN checks anymore because in the case of it being built from included sources (which are the latest 2.10), CRAN now checks for calls to these functions (cerr, cout and co) and issues warning if these are used. In order to keep the included sources aligned with the main branch of nlopt, could we consider acceptable the currently proposed solution since it does not alter anyone's code unless (s)he specifically turn USE_R_COMPATIBILITY on?

Of course, I will answer the questions you raised in your review first.

Calling R[E]printf is not exactly safe, by the way. For example, on Rgui.exe, this checks for interrupts and would longjmp() out of the calling frame if an interrupt is found. This will at least leak memory, or wreak complete havoc if code ever becomes multi-threaded.

You are right indeed. I think posit folks have come up in their cpp11 package with ways to deal with this. I will investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants