Skip to content

Conversation

@markalle
Copy link
Contributor

This is mostly based off recent UCX additions to their patcher:
openucx/ucx#2703

They added triggers for

  • mmap when (flags & MAP_FIXED) && (addr != NULL)
  • shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL)

Beyond that I noticed they already had a trigger for

  • madvise when (advice == MADV_FREE)
    that we didn't so I added that.

And the other main thing is we didn't really have shmat/shmdt
active for some systems because we only had a path for
syscall(SYS_shmdt, ) but we needed to also have a path for
syscall(SYS_ipc, IPCOP_shmdt, ) and same for shmat.

Signed-off-by: Mark Allen markalle@us.ibm.com

@markalle markalle requested a review from xinzhao3 March 27, 2019 19:26
This is mostly based off recent UCX additions to their patcher:
    openucx/ucx#2703

They added triggers for
* mmap when (flags & MAP_FIXED) && (addr != NULL)
* shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL)

Beyond that I noticed they already had a trigger for
* madvise when (advice == MADV_FREE)
that we didn't so I added that.

And the other main thing is we didn't really have shmat/shmdt
active for some systems because we only had a path for
syscall(SYS_shmdt, ) but we needed to also have a path for
syscall(SYS_ipc, IPCOP_shmdt, ) and same for shmat.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
@markalle markalle force-pushed the patcher_additions branch from 9e4654d to eb88811 Compare March 29, 2019 18:39
@markalle
Copy link
Contributor Author

Thanks for the review comments, repushed

void *result = 0;

if (prot == PROT_NONE) {
if ((flags & MAP_FIXED) && (start != NULL)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW before this change, mmap(PROT_NONE) was calling release hook, and after this change - it doesn't. Is this the desired behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be okay because previously that entire section was #if 0 so mmap() wasn't being intercepted at all. The PROT_NONE was old code and was #if 0 after a decision a long time ago that mmap PROT_NONE didn't need to be trigger a cache deregistration. So when I re-added mmap() for this commit I just used the UCX code to decide what should be intercepted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i guess if flags do not have MAP+FIXED, it will not overwrite an existing mapping anyway


if (shmflg & SHM_RND) {
attach_addr -= ((uintptr_t)shmaddr) % SHMLBA;
size += ((uintptr_t)shmaddr) % SHMLBA;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it should be something like this (i will probably have to fix UCX as well):

            attach_addr -= ((uintptr_t)shmaddr) % SHMLBA;
            size        += ((uintptr_t)shmaddr) % SHMLBA;
            size        += (SHMLBA - (size % SHMLBA)) % SHMLBA;  /* Round size up */

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, thinking again, we can't really change the size. we got the size from shmctl() so it's the actual segment size and should be aligned to SHMLBA So we probably should not change 'size' at all, just 'attach_addr' - since the segment attach point just moves by %SHMLBA, but the segment size cannot really change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning toward agreeing. I agree the new mapping will have the original 'size' as its size, and it'll only occupy the range (attach_addr ..+original_size).

And when I read the manpage for REMAP I'm sure they intended to write that it replaces any mapping that existed in the address range where the new mapping goes. But they wrote more explicitly that it replaces any mapping in (addr ..+size) eg where the new mapping would go w/o RND in play.

So being logical I'd say with RND any mapping that's in (attach_addr ..+original_size) gets replaced.

I'm still maybe 5% concerned that a pedantic reading of the manpage would say that they specified the exact range that gets replaced, and the above skips a little bit on the right hand side of that range.

Is there any downside other than an unnecessary cache invalidation to my current code that uses the lower of the low ends and the higher of the high ends of the two interpretations of the manpage? My code would be telling you a certain virtual address had been invalidated when it really hadn't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let's be pedantic

@markalle markalle merged commit 008ab98 into open-mpi:master May 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants