MADV_DOEXEC and MADV_DONTEXEC should be redefined within a separate numerical range #23

Closed
@cnqpzhang

Description:

MADV_DOEXEC and MADV_DONTEXEC should be redefined in a separate numerical range; otherwise they break binary compatibility with mainline kernels.

Diagnostic Info:

The v5.4 kernel built from uek6/u3 contains commit a91ae4f, which defines two madvise mode numbers as follows:

```c
#define MADV_DOEXEC	22		/* do inherit across exec */
#define MADV_DONTEXEC	23		/* don't inherit across exec */
```

In comparison, Linux mainline v5.4 has no such definitions; the highest number in use there was 21 (see v5.4 219d54332).

Linux mainline later started using numbers 22 and 23 in 5.14-rc1; see torvalds/linux@4ca9b38.

```c
#define MADV_POPULATE_READ	22	/* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE	23	/* populate (prefault) page tables writable */
```

According to the Linux man-pages, the way to tell whether MADV_POPULATE_WRITE is supported on a given system is:

MADV_POPULATE_WRITE (since Linux 5.14)
madvise(0, 0, advice) will return zero iff advice is supported by the kernel and can be relied on to probe for support.

As a result, the syscall madvise(0, 0, 23) returns 0 on the UEKR6 v5.4.17 kernel, which means supported, while Linux mainline v5.4 returns -1, which means not supported. The conflicting definition breaks binary compatibility.

This issue is currently causing a real failure in OpenJDK. See ticket JDK-8324776 and the discussion there for details.
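The probe described in the man-pages can be sketched in a few lines of C. This is a minimal illustration only; the helper name `madvise_advice_accepted` and the macro `ADVICE_23` are hypothetical and do not come from any kernel or JDK source.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* expose madvise() under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Raw advice value 23: MADV_POPULATE_WRITE in mainline since 5.14,
 * but MADV_DONTEXEC in the UEK6 v5.4 kernel discussed in this issue.
 * (Hypothetical name, used for illustration only.) */
#define ADVICE_23 23

/* Returns 1 if the running kernel accepts `advice`, 0 if it rejects it.
 * A zero-length range at address 0 touches no memory, so the call only
 * exercises the kernel's validation of the advice value. */
static int madvise_advice_accepted(int advice)
{
    return madvise(NULL, 0, advice) == 0;
}
```

A caller that sees `madvise_advice_accepted(ADVICE_23)` return 1 cannot tell from this probe alone whether the kernel implements MADV_POPULATE_WRITE or has assigned 23 to a different mode, which is the ambiguity reported here.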

Other issue:

The v5.15 kernel on uek7/u2 has a similar problem. Commit 4693c5d integrated the definitions of 22 and 23 from Linux mainline, but used 24 and 25 for the two customized mode numbers.

```c
#define MADV_POPULATE_READ	22	/* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE	23	/* populate (prefault) page tables writable */
#define MADV_DOEXEC	24		/* do inherit across exec */
#define MADV_DONTEXEC	25		/* don't inherit across exec */
```

This creates another incompatibility with Linux mainline: mode 24 is MADV_DONTNEED_LOCKED, introduced by torvalds/linux@9457056 in v5.18-rc1, and mode 25 is MADV_COLLAPSE, added by torvalds/linux@7d8faaf in v6.1-rc1.

See details at mainline b401b621:

```c
#define MADV_POPULATE_READ	22	/* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE	23	/* populate (prefault) page tables writable */
#define MADV_DONTNEED_LOCKED	24	/* like DONTNEED, but drop locked pages too */
#define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
```

Proposed changes:

Redefine the customized modes MADV_DOEXEC and MADV_DONTEXEC within a separate numerical range, for example 101 and 102.

This avoids breaking binary compatibility, lets UEK6 and UEK7 share the same definitions of these two modes, and means future UEK releases do not need to move them to new numbers, which simplifies maintenance.
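When choosing the new range, each candidate number can be checked against the values mainline already uses. Below is a minimal sketch that assumes a hand-maintained table; the names `mainline_madv` and `madv_collision` are hypothetical. Besides the values 22 to 25 quoted in this issue, the table lists MADV_HWPOISON (100) and MADV_SOFT_OFFLINE (101), which mainline's asm-generic mman-common.h also defines. The table is illustrative and not exhaustive.

```c
#include <stddef.h>

/* One advice value that mainline already assigns. */
struct madv_entry {
    const char *name;
    int value;
};

/* Illustrative, non-exhaustive table of mainline advice values
 * (asm-generic numbering). */
static const struct madv_entry mainline_madv[] = {
    { "MADV_POPULATE_READ",    22 },
    { "MADV_POPULATE_WRITE",   23 },
    { "MADV_DONTNEED_LOCKED",  24 },
    { "MADV_COLLAPSE",         25 },
    { "MADV_HWPOISON",        100 },
    { "MADV_SOFT_OFFLINE",    101 },
};

/* Returns the name of the table entry that already uses `value`,
 * or NULL if no entry in the table uses it. */
static const char *madv_collision(int value)
{
    size_t n = sizeof mainline_madv / sizeof mainline_madv[0];
    for (size_t i = 0; i < n; i++) {
        if (mainline_madv[i].value == value)
            return mainline_madv[i].name;
    }
    return NULL;
}
```

With this table, `madv_collision(101)` returns "MADV_SOFT_OFFLINE", so the example range would need to start at a number that no entry in the current mainline header uses.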

Any other similar or workable solution is also acceptable.
