Description:
MADV_DOEXEC and MADV_DONTEXEC should be redefined in a separate numerical range; otherwise they break binary compatibility with mainline kernels.
Diagnostic Info:
The v5.4 kernel built from uek6/u3 contains commit a91ae4f, which defines two madvise mode numbers as follows:
#define MADV_DOEXEC 22 /* do inherit across exec */
#define MADV_DONTEXEC 23 /* don't inherit across exec */
In comparison, Linux mainline v5.4 has no such definitions; its highest mode number was 21 (see v5.4 219d54332).
Linux mainline later started to use numbers 22 and 23, around 5.14-rc1 (see torvalds/linux@4ca9b38):
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */
According to the Linux man-pages, the way to tell whether MADV_POPULATE_WRITE is supported on the running system is:
MADV_POPULATE_WRITE (since Linux 5.14)
madvise(0, 0, advice) will return zero iff advice is supported by the kernel and can be relied on to probe for support.
As a result, the syscall madvise(0, 0, 23) returns 0 on the UEKR6 v5.4.17 kernel, which means supported, while on Linux v5.4 mainline it returns -1, which means not supported. The duplicate definition breaks binary compatibility.
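The probe described above can be sketched in C as follows. This is a minimal sketch, assuming a Linux system with glibc or a similar libc; the helper name madvise_advice_supported is hypothetical, and MADV_POPULATE_WRITE is defined by hand in case the installed headers predate it.

```c
#define _DEFAULT_SOURCE /* expose madvise() under strict -std= modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Mainline value since v5.14; older libc headers may not define it. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: probe whether the running kernel accepts `advice`.
 * Returns 1 if madvise(0, 0, advice) succeeds, 0 if the kernel rejects
 * the value with EINVAL, and -1 on any other error.
 */
int madvise_advice_supported(int advice)
{
    if (madvise(NULL, 0, advice) == 0)
        return 1;
    return (errno == EINVAL) ? 0 : -1;
}
```

Given the behavior reported above, madvise_advice_supported(MADV_POPULATE_WRITE) would be expected to return 1 on the UEKR6 v5.4.17 kernel and 0 on mainline v5.4, even though neither kernel implements MADV_POPULATE_WRITE.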
This issue is currently causing a practical failure in OpenJDK. See ticket JDK-8324776 and the discussion there for details.
Other issue:
The v5.15 kernel on uek7/u2 has a similar problem. Commit 4693c5d integrated the definitions of 22 and 23 from Linux mainline, but used 24 and 25 for the two customized mode numbers:
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */
#define MADV_DOEXEC 24 /* do inherit across exec */
#define MADV_DONTEXEC 25 /* don't inherit across exec */
This created another incompatibility with Linux mainline: mode MADV_DONTNEED_LOCKED (24) was introduced by torvalds/linux@9457056 in v5.18-rc1, and mode MADV_COLLAPSE (25) was added by torvalds/linux@7d8faaf in v6.1-rc1.
See details at mainline b401b621:
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */
#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */
#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */
Proposed changes:
Redefine the customized modes MADV_DOEXEC and MADV_DONTEXEC within a separate numerical range, for example 101 and 102.
This avoids the binary compatibility breakage, lets UEK6 and UEK7 share the same definitions of these two modes, and means future UEKs do not need to move them to new numbers, which is better for maintenance.
Any other similar or workable solution is also acceptable.
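Whichever range is chosen, candidate numbers can be checked against the modes mainline already defines. The sketch below is illustrative only: the struct and the helper mainline_advice_name are hypothetical names, values 22 to 25 are the mainline definitions quoted in this report, and values 100 and 101 are an addition believed to match mainline's include/uapi/asm-generic/mman-common.h, which should be verified against the target tree.

```c
#include <stddef.h>

/* One advice number and the name mainline gives it. */
struct madv_entry {
    int value;
    const char *name;
};

/*
 * 22-25: mainline definitions quoted in this report.
 * 100-101: assumed from mainline's mman-common.h; verify before use.
 */
static const struct madv_entry mainline_advice[] = {
    { 22,  "MADV_POPULATE_READ" },
    { 23,  "MADV_POPULATE_WRITE" },
    { 24,  "MADV_DONTNEED_LOCKED" },
    { 25,  "MADV_COLLAPSE" },
    { 100, "MADV_HWPOISON" },
    { 101, "MADV_SOFT_OFFLINE" },
};

/*
 * Hypothetical helper: return the name of the mainline mode that already
 * uses `value`, or NULL if the table lists no mode with that number.
 */
const char *mainline_advice_name(int value)
{
    size_t n = sizeof(mainline_advice) / sizeof(mainline_advice[0]);
    for (size_t i = 0; i < n; i++) {
        if (mainline_advice[i].value == value)
            return mainline_advice[i].name;
    }
    return NULL;
}
```

With this table, 22 and 23 (the UEK6 values) and 24 and 25 (the UEK7 values) all map to a mainline name, which is the collision described above. If the assumed entries are right, 101 maps to MADV_SOFT_OFFLINE, so the new range may need to start higher than the example numbers.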