
[LibOS] Use RW locks in the VMA tree #1794

Open
dimakuv opened this issue Mar 5, 2024 · 1 comment · May be fixed by #1795
dimakuv commented Mar 5, 2024

Problem

Multi-threaded workloads with many syscalls put heavy stress on the VMA subsystem, because almost all syscalls verify their buffers for read/write access using the following functions:

  • is_user_memory_readable()
  • is_user_memory_writable()
  • is_user_string_readable()
  • is_user_memory_writable_no_skip()

All these functions call the test_user_memory() helper:

/*
 * Tests whether whole range of memory `[addr; addr+size)` is readable, or, if `writable` is true,
 * writable. The intended usage of this function is checking memory pointers passed to system calls.
 * Note that this does not check the accesses to the memory themselves and is only meant to handle
 * invalid syscall arguments (e.g. LTP test suite checks syscall arguments validation).
 */
static bool test_user_memory(const void* addr, size_t size, bool writable) {
    if (!access_ok(addr, size)) {
        return false;
    }
    return is_in_adjacent_user_vmas(addr, size, writable ? PROT_WRITE : PROT_READ);
}

This helper in turn calls the is_in_adjacent_user_vmas() function:

bool is_in_adjacent_user_vmas(const void* addr, size_t length, int prot) {
    uintptr_t begin = (uintptr_t)addr;
    uintptr_t end = begin + length;
    assert(begin <= end);

    struct adj_visitor_ctx ctx = {
        .prot = prot,
        .is_ok = true,
    };

    spinlock_lock(&vma_tree_lock);
    bool is_continuous = _traverse_vmas_in_range(begin, end, adj_visitor, &ctx);
    spinlock_unlock(&vma_tree_lock);

    return is_continuous && ctx.is_ok;
}

The important part is the spinlock_lock(&vma_tree_lock) / spinlock_unlock(&vma_tree_lock) pair: in multi-threaded apps, contention on this lock becomes the bottleneck.

Gramine introduced a workaround to sidestep this bottleneck, via the libos.check_invalid_pointers manifest option; it translates to the g_check_invalid_ptrs variable. However, this cannot be used in all cases:

  • Some runtimes, such as Java, rely on being able to check invalid pointers, so they cannot set libos.check_invalid_pointers = false; doing so would lead to Java apps failing.
  • The is_user_memory_writable_no_skip() function does not honor the libos.check_invalid_pointers manifest option, because in certain situations Gramine really must determine whether a VMA is writable or read-only; see e.g. the ppoll() case, which emulates Linux behavior.
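
For reference, the workaround short-circuits the pointer check roughly as follows (a sketch; the exact placement of the g_check_invalid_ptrs test in Gramine's sources may differ):

bool is_user_memory_readable(const void* addr, size_t size) {
    if (!g_check_invalid_ptrs) {
        /* libos.check_invalid_pointers = false: skip the VMA-tree walk
         * (and thus the lock) entirely and accept the pointer */
        return true;
    }
    return test_user_memory(addr, size, /*writable=*/false);
}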

Solution

Use the RW lock that was previously introduced in the Gramine codebase: https://github.com/gramineproject/gramine/blob/master/libos/include/libos_rwlock.h

Example usage of this RW lock: f071450
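
For the read-only path shown above, the conversion could look roughly like this (a minimal sketch, assuming the rwlock_read_lock()/rwlock_read_unlock() API from libos_rwlock.h and an illustrative global g_vma_tree_lock):

static struct libos_rwlock g_vma_tree_lock; /* replaces the spinlock */

bool is_in_adjacent_user_vmas(const void* addr, size_t length, int prot) {
    uintptr_t begin = (uintptr_t)addr;
    uintptr_t end = begin + length;
    assert(begin <= end);

    struct adj_visitor_ctx ctx = {
        .prot = prot,
        .is_ok = true,
    };

    /* read lock: many threads can verify syscall buffers concurrently */
    rwlock_read_lock(&g_vma_tree_lock);
    bool is_continuous = _traverse_vmas_in_range(begin, end, adj_visitor, &ctx);
    rwlock_read_unlock(&g_vma_tree_lock);

    return is_continuous && ctx.is_ok;
}

Mutating paths (e.g. the mmap()/munmap()/mprotect() emulation that modifies the VMA tree) would take rwlock_write_lock()/rwlock_write_unlock() instead.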

Benchmark results

TODO

dimakuv commented Mar 5, 2024

There is one problematic point:

  • The current VMA lock is a lightweight spinlock.
  • The proposed RW VMA lock is more heavyweight (on its slow path), as it uses PalEventSet() and PalEventWait().

Under contention, PalEventSet() and PalEventWait() may perform a futex OCALL, which could outweigh the benefits of switching to the RW lock. On the other hand, sleeping on a futex is typically the right behavior under contention, and these PAL APIs are optimized to elide the OCALL when possible.
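
To illustrate why the OCALL is only hit under contention, a waiter typically spins in user space first and falls back to the PAL event only if the lock stays busy (an illustrative sketch, not the actual libos_rwlock code; PalEventWait()/PalEventSet() are the PAL APIs mentioned above, everything else is hypothetical):

/* Hypothetical spin-then-sleep wait on a flag set by the unlocking thread. */
static void wait_for_flag(int* flag, PAL_HANDLE event) {
    for (int i = 0; i < 1000; i++) {
        if (__atomic_load_n(flag, __ATOMIC_ACQUIRE))
            return; /* fast path: flag got set while spinning, no OCALL */
        CPU_RELAX();
    }
    while (!__atomic_load_n(flag, __ATOMIC_ACQUIRE)) {
        /* slow path: sleep until the unlocking thread calls PalEventSet();
         * this wait may perform a futex OCALL */
        (void)PalEventWait(event, /*timeout_us=*/NULL);
    }
}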

@dimakuv dimakuv linked a pull request Mar 5, 2024 that will close this issue