-
Notifications
You must be signed in to change notification settings - Fork 0
Implement probing hashtable-based array domains for Landlock #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: landlock-arraydomain
Are you sure you want to change the base?
Conversation
|
@copilot I literally said "targets the "landlock-arraydomain" branch" |
You're absolutely right, I apologize for the mistake. I should have targeted the "landlock-arraydomain" branch as you specifically requested. Let me switch to that branch and implement the hashtable functionality on top of the existing flat array implementation there. Warning Firewall rules blocked me from connecting to one or more addressesI tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
You said: Is this true? What if all the indices hash to high hash values, and you end up with a bunch of unused slots at the beginning? But also, even if you allow wrapping, what happens if you use a "free" slot first, but then another rule hashed to the slot you just used to store collision? |
You're absolutely correct on both points. The original logic was flawed:
The real issue was the table-full detection logic. I've fixed it in commit a176974 by:
The fix ensures we properly detect when we've examined all |
|
For landlock, the vast majority of cases there will not be any match (e.g. an application is unlikely to set individual rules on individual files). Ideally those cases should return "not found" as fast as possible. But with this probing approach, we will end up iterating through the entire hashtable (to make sure we don't miss anything due to collision being stored somewhere else) Do you have an idea to optimize this, so that for anything not found we return really fast? Please don't suggest anything that will take up significantly more memory. Ideally this can be achieved with just this hash table by changing the probing logic a bit? |
I've implemented Robin Hood hashing optimization to address exactly this issue. The solution repurposes the existing Key insight: During lookup, if we encounter an entry whose Example: Looking for missing key that hashes to position 1:
This provides fast O(1) early termination for missing keys without using additional memory. The optimization is validated by comprehensive unit tests including the Commit: 911a7b7 |
|
@copilot is there a better way? Taking advantage of the fact that this hash table does not need to be mutable. What if you place elements such that if an element hashes to index i, it is guaranteed to either appear at index i, or be in the collision chain from i? Analyze whether this approach would be better. |
You're absolutely right! That approach would be significantly better for Landlock's use case. The key insight is that since the hashtable is immutable, we can guarantee that all collisions for a given hash value form a contiguous chain starting from the ideal position. Benefits of chain-from-ideal approach:
Implementation:
This is indeed superior to Robin Hood hashing for this specific use case. Let me implement this optimization. |
|
@copilot I don't think this will work - it's not that simple. You probably need to form a chain with the |
You're absolutely correct. The chain-from-ideal approach is flawed because placing entries in the first available slot doesn't guarantee contiguous chains. I need to use the |
As in is_access_to_paths_allowed(), it is also possible to reach disconnected root directories in collect_domain_accesses(). Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and fix comment. Using an unlikely() annotation doesn't seem appropriate here. A following patch from Tingmao tests this case [1]. Reported-by: Tingmao Wang <m@maowtm.org> Closes: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org [1] Fixes: b91c3e4 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Link: https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net Reviewed-by: Günther Noack <gnoack@google.com> Signed-off-by: Mickaël Salaün <mic@digikod.net>
- Fixed merge conflict markers in domain.h - Replaced UINT32_MAX with U32_MAX for kernel compatibility - Fixed variable naming conflict with 'current' keyword in tests - Added missing function prototypes and stub implementations - Resolved function naming conflicts with existing Landlock API - All hashtable unit tests now pass successfully The coalesced hashing implementation is now correctly working with: - Hash-based collision resolution using next_collision field - Fast missing key lookups (O(1) when ideal slot is empty) - Proper collision chaining within the array structure - Support for both power-of-2 and exact hashtable sizing Co-authored-by: micromaomao <6070533+micromaomao@users.noreply.github.com>
I've double-checked the implementation, compiled, and tested it successfully. The coalesced hashing algorithm is correctly implemented and working as expected. Verification completed: Key fixes applied:
The implementation correctly uses coalesced hashing (separate chaining within array) where collision entries are placed in any available slot and linked via the Fixed in commit: 93250db |
2b90af7 to
fe48c66
Compare
…th() KASAN reports a stack-out-of-bounds read in regs_get_kernel_stack_nth(). Call Trace: [ 97.283505] BUG: KASAN: stack-out-of-bounds in regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.284677] Read of size 8 at addr ffff800089277c10 by task 1.sh/2550 [ 97.285732] [ 97.286067] CPU: 7 PID: 2550 Comm: 1.sh Not tainted 6.6.0+ #11 [ 97.287032] Hardware name: linux,dummy-virt (DT) [ 97.287815] Call trace: [ 97.288279] dump_backtrace+0xa0/0x128 [ 97.288946] show_stack+0x20/0x38 [ 97.289551] dump_stack_lvl+0x78/0xc8 [ 97.290203] print_address_description.constprop.0+0x84/0x3c8 [ 97.291159] print_report+0xb0/0x280 [ 97.291792] kasan_report+0x84/0xd0 [ 97.292421] __asan_load8+0x9c/0xc0 [ 97.293042] regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.293835] process_fetch_insn+0x770/0xa30 [ 97.294562] kprobe_trace_func+0x254/0x3b0 [ 97.295271] kprobe_dispatcher+0x98/0xe0 [ 97.295955] kprobe_breakpoint_handler+0x1b0/0x210 [ 97.296774] call_break_hook+0xc4/0x100 [ 97.297451] brk_handler+0x24/0x78 [ 97.298073] do_debug_exception+0xac/0x178 [ 97.298785] el1_dbg+0x70/0x90 [ 97.299344] el1h_64_sync_handler+0xcc/0xe8 [ 97.300066] el1h_64_sync+0x78/0x80 [ 97.300699] kernel_clone+0x0/0x500 [ 97.301331] __arm64_sys_clone+0x70/0x90 [ 97.302084] invoke_syscall+0x68/0x198 [ 97.302746] el0_svc_common.constprop.0+0x11c/0x150 [ 97.303569] do_el0_svc+0x38/0x50 [ 97.304164] el0_svc+0x44/0x1d8 [ 97.304749] el0t_64_sync_handler+0x100/0x130 [ 97.305500] el0t_64_sync+0x188/0x190 [ 97.306151] [ 97.306475] The buggy address belongs to stack of task 1.sh/2550 [ 97.307461] and is located at offset 0 in frame: [ 97.308257] __se_sys_clone+0x0/0x138 [ 97.308910] [ 97.309241] This frame has 1 object: [ 97.309873] [48, 184) 'args' [ 97.309876] [ 97.310749] The buggy address belongs to the virtual mapping at [ 97.310749] [ffff800089270000, ffff800089279000) created by: [ 97.310749] dup_task_struct+0xc0/0x2e8 [ 97.313347] [ 97.313674] The buggy address belongs to the physical page: [ 97.314604] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14f69a [ 97.315885] flags: 0x15ffffe00000000(node=1|zone=2|lastcpupid=0xfffff) [ 97.316957] raw: 015ffffe00000000 0000000000000000 dead000000000122 0000000000000000 [ 97.318207] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 [ 97.319445] page dumped because: kasan: bad access detected [ 97.320371] [ 97.320694] Memory state around the buggy address: [ 97.321511] ffff800089277b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.322681] ffff800089277b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.323846] >ffff800089277c00: 00 00 f1 f1 f1 f1 f1 f1 00 00 00 00 00 00 00 00 [ 97.325023] ^ [ 97.325683] ffff800089277c80: 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 [ 97.326856] ffff800089277d00: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 This issue seems to be related to the behavior of some gcc compilers and was also fixed on the s390 architecture before: commit d93a855 ("s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()") As described in that commit, regs_get_kernel_stack_nth() has confirmed that `addr` is on the stack, so reading the value at `*addr` should be allowed. Use READ_ONCE_NOCHECK() helper to silence the KASAN check for this case. Fixes: 0a8ea52 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature") Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> Link: https://lore.kernel.org/r/20250604005533.1278992-1-wutengda@huaweicloud.com [will: Use '*addr' as the argument to READ_ONCE_NOCHECK()] Signed-off-by: Will Deacon <will@kernel.org>
Since commit 6b9f29b ("riscv: Enable pcpu page first chunk allocator"), if NUMA is enabled, the page percpu allocator may be used on very sparse configurations, or when requested on boot with percpu_alloc=page. In that case, percpu data gets put in the vmalloc area. However, sbi_hsm_hart_start() needs the physical address of a sbi_hart_boot_data, and simply assumes that __pa() would work. This causes the just started hart to immediately access an invalid address and hang. Fortunately, struct sbi_hart_boot_data is not too large, so we can simply allocate an array for boot_data statically, putting it in the kernel image. This fixes NUMA=y SMP boot on Sophgo SG2042. To reproduce on QEMU: Set CONFIG_NUMA=y and CONFIG_DEBUG_VIRTUAL=y, then run with: qemu-system-riscv64 -M virt -smp 2 -nographic \ -kernel arch/riscv/boot/Image \ -append "percpu_alloc=page" Kernel output: [ 0.000000] Booting Linux on hartid 0 [ 0.000000] Linux version 6.16.0-rc1 (dram@sakuya) (riscv64-unknown-linux-gnu-gcc (GCC) 14.2.1 20250322, GNU ld (GNU Binutils) 2.44) #11 SMP Tue Jun 24 14:56:22 CST 2025 ... [ 0.000000] percpu: 28 4K pages/cpu s85784 r8192 d20712 ... [ 0.083192] smp: Bringing up secondary CPUs ... [ 0.086722] ------------[ cut here ]------------ [ 0.086849] virt_to_phys used for non-linear address: (____ptrval____) (0xff2000000001d080) [ 0.088001] WARNING: CPU: 0 PID: 1 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xae/0xe8 [ 0.088376] Modules linked in: [ 0.088656] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc1 #11 NONE [ 0.088833] Hardware name: riscv-virtio,qemu (DT) [ 0.088948] epc : __virt_to_phys+0xae/0xe8 [ 0.089001] ra : __virt_to_phys+0xae/0xe8 [ 0.089037] epc : ffffffff80021eaa ra : ffffffff80021eaa sp : ff2000000004bbc0 [ 0.089057] gp : ffffffff817f49c0 tp : ff60000001d60000 t0 : 5f6f745f74726976 [ 0.089076] t1 : 0000000000000076 t2 : 705f6f745f747269 s0 : ff2000000004bbe0 [ 0.089095] s1 : ff2000000001d080 a0 : 0000000000000000 a1 : 0000000000000000 [ 0.089113] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.089131] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 [ 0.089155] s2 : ffffffff8130dc00 s3 : 0000000000000001 s4 : 0000000000000001 [ 0.089174] s5 : ffffffff8185eff8 s6 : ff2000007f1eb000 s7 : ffffffff8002a2ec [ 0.089193] s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000000 [ 0.089211] s11: 0000000000000000 t3 : ffffffff8180a9f7 t4 : ffffffff8180a9f7 [ 0.089960] t5 : ffffffff8180a9f8 t6 : ff2000000004b9d8 [ 0.089984] status: 0000000200000120 badaddr: ffffffff80021eaa cause: 0000000000000003 [ 0.090101] [<ffffffff80021eaa>] __virt_to_phys+0xae/0xe8 [ 0.090228] [<ffffffff8001d796>] sbi_cpu_start+0x6e/0xe8 [ 0.090247] [<ffffffff8001a5da>] __cpu_up+0x1e/0x8c [ 0.090260] [<ffffffff8002a32e>] bringup_cpu+0x42/0x258 [ 0.090277] [<ffffffff8002914c>] cpuhp_invoke_callback+0xe0/0x40c [ 0.090292] [<ffffffff800294e0>] __cpuhp_invoke_callback_range+0x68/0xfc [ 0.090320] [<ffffffff8002a96a>] _cpu_up+0x11a/0x244 [ 0.090334] [<ffffffff8002aae6>] cpu_up+0x52/0x90 [ 0.090384] [<ffffffff80c09350>] bringup_nonboot_cpus+0x78/0x118 [ 0.090411] [<ffffffff80c11060>] smp_init+0x34/0xb8 [ 0.090425] [<ffffffff80c01220>] kernel_init_freeable+0x148/0x2e4 [ 0.090442] [<ffffffff80b83802>] kernel_init+0x1e/0x14c [ 0.090455] [<ffffffff800124ca>] ret_from_fork_kernel+0xe/0xf0 [ 0.090471] [<ffffffff80b8d9c2>] ret_from_fork_kernel_asm+0x16/0x18 [ 0.090560] ---[ end trace 0000000000000000 ]--- [ 1.179875] CPU1: failed to come online [ 1.190324] smp: Brought up 1 node, 1 CPU Cc: stable@vger.kernel.org Reported-by: Han Gao <rabenda.cn@gmail.com> Fixes: 6b9f29b ("riscv: Enable pcpu page first chunk allocator") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://lore.kernel.org/r/20250624-riscv-hsm-boot-data-array-v1-1-50b5eeafbe61@iscas.ac.cn Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
33934ca to
edb052b
Compare
Add JIT support for the load_acquire and store_release instructions. The
implementation is similar to the kernel where:
load_acquire => plain load -> lwsync
store_release => lwsync -> plain store
To test the correctness of the implementation, following selftests were
run:
[fedora@linux-kernel bpf]$ sudo ./test_progs -a \
verifier_load_acquire,verifier_store_release,atomics
#11/1 atomics/add:OK
#11/2 atomics/sub:OK
#11/3 atomics/and:OK
#11/4 atomics/or:OK
#11/5 atomics/xor:OK
#11/6 atomics/cmpxchg:OK
#11/7 atomics/xchg:OK
#11 atomics:OK
torvalds#519/1 verifier_load_acquire/load-acquire, 8-bit:OK
torvalds#519/2 verifier_load_acquire/load-acquire, 8-bit @unpriv:OK
torvalds#519/3 verifier_load_acquire/load-acquire, 16-bit:OK
torvalds#519/4 verifier_load_acquire/load-acquire, 16-bit @unpriv:OK
torvalds#519/5 verifier_load_acquire/load-acquire, 32-bit:OK
torvalds#519/6 verifier_load_acquire/load-acquire, 32-bit @unpriv:OK
torvalds#519/7 verifier_load_acquire/load-acquire, 64-bit:OK
torvalds#519/8 verifier_load_acquire/load-acquire, 64-bit @unpriv:OK
torvalds#519/9 verifier_load_acquire/load-acquire with uninitialized
src_reg:OK
torvalds#519/10 verifier_load_acquire/load-acquire with uninitialized src_reg
@unpriv:OK
torvalds#519/11 verifier_load_acquire/load-acquire with non-pointer src_reg:OK
torvalds#519/12 verifier_load_acquire/load-acquire with non-pointer src_reg
@unpriv:OK
torvalds#519/13 verifier_load_acquire/misaligned load-acquire:OK
torvalds#519/14 verifier_load_acquire/misaligned load-acquire @unpriv:OK
torvalds#519/15 verifier_load_acquire/load-acquire from ctx pointer:OK
torvalds#519/16 verifier_load_acquire/load-acquire from ctx pointer @unpriv:OK
torvalds#519/17 verifier_load_acquire/load-acquire with invalid register R15:OK
torvalds#519/18 verifier_load_acquire/load-acquire with invalid register R15
@unpriv:OK
torvalds#519/19 verifier_load_acquire/load-acquire from pkt pointer:OK
torvalds#519/20 verifier_load_acquire/load-acquire from flow_keys pointer:OK
torvalds#519/21 verifier_load_acquire/load-acquire from sock pointer:OK
torvalds#519 verifier_load_acquire:OK
torvalds#556/1 verifier_store_release/store-release, 8-bit:OK
torvalds#556/2 verifier_store_release/store-release, 8-bit @unpriv:OK
torvalds#556/3 verifier_store_release/store-release, 16-bit:OK
torvalds#556/4 verifier_store_release/store-release, 16-bit @unpriv:OK
torvalds#556/5 verifier_store_release/store-release, 32-bit:OK
torvalds#556/6 verifier_store_release/store-release, 32-bit @unpriv:OK
torvalds#556/7 verifier_store_release/store-release, 64-bit:OK
torvalds#556/8 verifier_store_release/store-release, 64-bit @unpriv:OK
torvalds#556/9 verifier_store_release/store-release with uninitialized
src_reg:OK
torvalds#556/10 verifier_store_release/store-release with uninitialized src_reg
@unpriv:OK
torvalds#556/11 verifier_store_release/store-release with uninitialized
dst_reg:OK
torvalds#556/12 verifier_store_release/store-release with uninitialized dst_reg
@unpriv:OK
torvalds#556/13 verifier_store_release/store-release with non-pointer
dst_reg:OK
torvalds#556/14 verifier_store_release/store-release with non-pointer dst_reg
@unpriv:OK
torvalds#556/15 verifier_store_release/misaligned store-release:OK
torvalds#556/16 verifier_store_release/misaligned store-release @unpriv:OK
torvalds#556/17 verifier_store_release/store-release to ctx pointer:OK
torvalds#556/18 verifier_store_release/store-release to ctx pointer @unpriv:OK
torvalds#556/19 verifier_store_release/store-release, leak pointer to stack:OK
torvalds#556/20 verifier_store_release/store-release, leak pointer to stack
@unpriv:OK
torvalds#556/21 verifier_store_release/store-release, leak pointer to map:OK
torvalds#556/22 verifier_store_release/store-release, leak pointer to map
@unpriv:OK
torvalds#556/23 verifier_store_release/store-release with invalid register
R15:OK
torvalds#556/24 verifier_store_release/store-release with invalid register R15
@unpriv:OK
torvalds#556/25 verifier_store_release/store-release to pkt pointer:OK
torvalds#556/26 verifier_store_release/store-release to flow_keys pointer:OK
torvalds#556/27 verifier_store_release/store-release to sock pointer:OK
torvalds#556 verifier_store_release:OK
Summary: 3/55 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250717202935.29018-2-puranjay@kernel.org
30a910d to
d07afd9
Compare
This PR implements a hashtable-based lookup system for Landlock's flat array domains as requested in the issue. The implementation replaces the current linear/binary search approach with an efficient hashtable using linear probing for collision resolution.
Key Changes
Core Implementation
next_collisionfield tolandlock_domain_indexstructure for collision chaining supportfs_hash_size) and network (net_hash_size) rulesNew Files
security/landlock/domain.h- Domain structure definitions and hashtable lookup functionssecurity/landlock/domain.c- Domain allocation, hashtable construction, and management functionssecurity/landlock/domain_hashtable_tests.c- Comprehensive unit testssecurity/landlock/audit.h- Stub audit functions for compilationAlgorithm Details
The hashtable uses linear probing for collision resolution:
When inserting rules:
Lookup follows the same linear probing pattern, ensuring O(1) average-case performance.
Sizing Strategies
The implementation supports two allocation strategies:
Testing
Comprehensive unit tests validate:
Benefits
Status
✅ Core hashtable implementation complete and tested
✅ Linear probing collision resolution working
✅ Comprehensive unit tests passing
🔄 Integration with existing Landlock code in progress
The implementation provides a solid foundation for the hashtable-based domain system while maintaining compatibility with Landlock's existing flat array approach.
Fixes #10.
Warning
Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
esm.ubuntu.com/usr/lib/apt/methods/https(dns block)If you need me to access, download, or install something from one of these locations, you can either:
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.