Update iommu.c #826
Closed
sean-jc added a commit to sean-jc/linux that referenced this pull request on Aug 26, 2025

…lled

Add a vhost_task_wake_safe() variant to handle the case where a vhost task
has exited due to a signal, i.e. before being explicitly stopped by the
owner of the task, and use the "safe" API in KVM when waking NX hugepage
recovery tasks. This fixes a bug where KVM will attempt to wake a task that
has exited, which ultimately results in all manner of badness, e.g.

  Oops: general protection fault, probably for non-canonical address 0xff0e899fa1566052: 0000 [#1] SMP
  CPU: 51 UID: 0 PID: 53807 Comm: tee Tainted: G S O 6.17.0-smp--38183c31756a-next torvalds#826 NONE
  Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
  Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
  RIP: 0010:queued_spin_lock_slowpath+0x123/0x250
  Code: ... <48> 89 8c 02 c0 da 47 a2 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8
  RSP: 0018:ffffbf55cffe7cf8 EFLAGS: 00010006
  RAX: ff0e899fff0e8562 RBX: 0000000000d00000 RCX: ffffa39b40aefac0
  RDX: 0000000000000030 RSI: fffffffffffffff8 RDI: ffffa39d0592e68c
  RBP: 0000000000d00000 R08: 00000000ffffff80 R09: 0000000400000000
  R10: ffffa36cce4fe401 R11: 0000000000000800 R12: 0000000000000003
  R13: 0000000000000000 R14: ffffa39d0592e68c R15: ffffa39b9e672000
  FS: 00007f233b2e9740(0000) GS:ffffa39b9e672000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f233b39fda0 CR3: 00000004d031f002 CR4: 00000000007726f0
  PKRU: 55555554
  Call Trace:
   <TASK>
   _raw_spin_lock_irqsave+0x50/0x60
   try_to_wake_up+0x4f/0x5d0
   set_nx_huge_pages+0xe4/0x1c0 [kvm]
   param_attr_store+0x89/0xf0
   module_attr_store+0x1e/0x30
   kernfs_fop_write_iter+0xe4/0x160
   vfs_write+0x2cb/0x420
   ksys_write+0x7f/0xf0
   do_syscall_64+0x6f/0x1f0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f233b4178b3
  R13: 0000000000000002 R14: 00000000226ff3d0 R15: 0000000000000002
   </TASK>

Provide an API in vhost task instead of forcing KVM to solve the problem,
as KVM would literally just add an equivalent to VHOST_TASK_FLAGS_KILLED,
along with a new lock to protect said flag.

In general, forcing simple usage of vhost task to care about signals _and_
take non-trivial action to do the right thing isn't developer friendly, and
is likely to lead to similar bugs in the future.

Debugged-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: d96c77b ("KVM: x86: switch hugepage recovery thread to vhost_task")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
sean-jc added a commit to sean-jc/linux that referenced this pull request on Aug 26, 2025

…lled

Add a vhost_task_wake_safe() variant to handle the case where a vhost task
has exited due to a signal, i.e. before being explicitly stopped by the
owner of the task, and use the "safe" API in KVM when waking NX hugepage
recovery tasks. This fixes a bug where KVM will attempt to wake a task that
has exited, which ultimately results in all manner of badness, e.g.

  Oops: general protection fault, probably for non-canonical address 0xff0e899fa1566052: 0000 [#1] SMP
  CPU: 51 UID: 0 PID: 53807 Comm: tee Tainted: G S O 6.17.0-smp--38183c31756a-next torvalds#826 NONE
  Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
  Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
  RIP: 0010:queued_spin_lock_slowpath+0x123/0x250
  Code: ... <48> 89 8c 02 c0 da 47 a2 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8
  RSP: 0018:ffffbf55cffe7cf8 EFLAGS: 00010006
  RAX: ff0e899fff0e8562 RBX: 0000000000d00000 RCX: ffffa39b40aefac0
  RDX: 0000000000000030 RSI: fffffffffffffff8 RDI: ffffa39d0592e68c
  RBP: 0000000000d00000 R08: 00000000ffffff80 R09: 0000000400000000
  R10: ffffa36cce4fe401 R11: 0000000000000800 R12: 0000000000000003
  R13: 0000000000000000 R14: ffffa39d0592e68c R15: ffffa39b9e672000
  FS: 00007f233b2e9740(0000) GS:ffffa39b9e672000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f233b39fda0 CR3: 00000004d031f002 CR4: 00000000007726f0
  PKRU: 55555554
  Call Trace:
   <TASK>
   _raw_spin_lock_irqsave+0x50/0x60
   try_to_wake_up+0x4f/0x5d0
   set_nx_huge_pages+0xe4/0x1c0 [kvm]
   param_attr_store+0x89/0xf0
   module_attr_store+0x1e/0x30
   kernfs_fop_write_iter+0xe4/0x160
   vfs_write+0x2cb/0x420
   ksys_write+0x7f/0xf0
   do_syscall_64+0x6f/0x1f0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f233b4178b3
  R13: 0000000000000002 R14: 00000000226ff3d0 R15: 0000000000000002
   </TASK>

Provide an API in vhost task instead of forcing KVM to solve the problem,
as KVM would literally just add an equivalent to VHOST_TASK_FLAGS_KILLED,
along with a new lock to protect said flag.

In general, forcing simple usage of vhost task to care about signals _and_
take non-trivial action to do the right thing isn't developer friendly, and
is likely to lead to similar bugs in the future.

Debugged-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/aKkLEtoDXKxAAWju@google.com
Link: https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: d96c77b ("KVM: x86: switch hugepage recovery thread to vhost_task")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
sean-jc added a commit to sean-jc/linux that referenced this pull request on Aug 26, 2025

…as killed

Add a vhost_task_wake_safe() variant to handle the case where a vhost task
has exited due to a signal, i.e. before being explicitly stopped by the
owner of the task, and use the "safe" API in KVM when waking NX hugepage
recovery tasks. This fixes a bug where KVM will attempt to wake a task that
has exited, which ultimately results in all manner of badness, e.g.

  Oops: general protection fault, probably for non-canonical address 0xff0e899fa1566052: 0000 [#1] SMP
  CPU: 51 UID: 0 PID: 53807 Comm: tee Tainted: G S O 6.17.0-smp--38183c31756a-next torvalds#826 NONE
  Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
  Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
  RIP: 0010:queued_spin_lock_slowpath+0x123/0x250
  Code: ... <48> 89 8c 02 c0 da 47 a2 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8
  RSP: 0018:ffffbf55cffe7cf8 EFLAGS: 00010006
  RAX: ff0e899fff0e8562 RBX: 0000000000d00000 RCX: ffffa39b40aefac0
  RDX: 0000000000000030 RSI: fffffffffffffff8 RDI: ffffa39d0592e68c
  RBP: 0000000000d00000 R08: 00000000ffffff80 R09: 0000000400000000
  R10: ffffa36cce4fe401 R11: 0000000000000800 R12: 0000000000000003
  R13: 0000000000000000 R14: ffffa39d0592e68c R15: ffffa39b9e672000
  FS: 00007f233b2e9740(0000) GS:ffffa39b9e672000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f233b39fda0 CR3: 00000004d031f002 CR4: 00000000007726f0
  PKRU: 55555554
  Call Trace:
   <TASK>
   _raw_spin_lock_irqsave+0x50/0x60
   try_to_wake_up+0x4f/0x5d0
   set_nx_huge_pages+0xe4/0x1c0 [kvm]
   param_attr_store+0x89/0xf0
   module_attr_store+0x1e/0x30
   kernfs_fop_write_iter+0xe4/0x160
   vfs_write+0x2cb/0x420
   ksys_write+0x7f/0xf0
   do_syscall_64+0x6f/0x1f0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f233b4178b3
  R13: 0000000000000002 R14: 00000000226ff3d0 R15: 0000000000000002
   </TASK>

Provide an API in vhost task instead of forcing KVM to solve the problem,
as KVM would literally just add an equivalent to VHOST_TASK_FLAGS_KILLED,
along with a new lock to protect said flag.

In general, forcing simple usage of vhost task to care about signals _and_
take non-trivial action to do the right thing isn't developer friendly, and
is likely to lead to similar bugs in the future.

Debugged-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/aKkLEtoDXKxAAWju@google.com
Link: https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: d96c77b ("KVM: x86: switch hugepage recovery thread to vhost_task")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request on Aug 26, 2025
sean-jc added a commit to sean-jc/linux that referenced this pull request on Aug 26, 2025

…lled

Make the "default" API for waking a vhost task safe against the underlying
task exiting due to a fatal signal. This fixes a bug in KVM x86 where KVM
attempts to wake an NX hugepage recovery task that exited before being
explicitly stopped, resulting in a use-after-free and thus crashes, hangs,
and other badness.

  Oops: general protection fault, probably for non-canonical address 0xff0e899fa1566052: 0000 [#1] SMP
  CPU: 51 UID: 0 PID: 53807 Comm: tee Tainted: G S O 6.17.0-smp--38183c31756a-next torvalds#826 NONE
  Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
  Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
  RIP: 0010:queued_spin_lock_slowpath+0x123/0x250
  Code: ... <48> 89 8c 02 c0 da 47 a2 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8
  RSP: 0018:ffffbf55cffe7cf8 EFLAGS: 00010006
  RAX: ff0e899fff0e8562 RBX: 0000000000d00000 RCX: ffffa39b40aefac0
  RDX: 0000000000000030 RSI: fffffffffffffff8 RDI: ffffa39d0592e68c
  RBP: 0000000000d00000 R08: 00000000ffffff80 R09: 0000000400000000
  R10: ffffa36cce4fe401 R11: 0000000000000800 R12: 0000000000000003
  R13: 0000000000000000 R14: ffffa39d0592e68c R15: ffffa39b9e672000
  FS: 00007f233b2e9740(0000) GS:ffffa39b9e672000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f233b39fda0 CR3: 00000004d031f002 CR4: 00000000007726f0
  PKRU: 55555554
  Call Trace:
   <TASK>
   _raw_spin_lock_irqsave+0x50/0x60
   try_to_wake_up+0x4f/0x5d0
   set_nx_huge_pages+0xe4/0x1c0 [kvm]
   param_attr_store+0x89/0xf0
   module_attr_store+0x1e/0x30
   kernfs_fop_write_iter+0xe4/0x160
   vfs_write+0x2cb/0x420
   ksys_write+0x7f/0xf0
   do_syscall_64+0x6f/0x1f0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f233b4178b3
  R13: 0000000000000002 R14: 00000000226ff3d0 R15: 0000000000000002
   </TASK>

Handle VHOST_TASK_FLAGS_KILLED in vhost_task_wake() instead of forcing KVM
to solve the problem, as KVM would literally just add an equivalent flag,
along with a new lock to protect said flag.

In general, forcing simple usage of vhost task to care about signals _and_
take non-trivial action to do the right thing isn't developer friendly, and
is likely to lead to similar bugs in the future.

Keep the existing behavior for vhost (by calling __vhost_task_wake()
instead of vhost_task_wake()), as vhost_worker_killed() takes extra care to
stop and flush all workers, i.e. doesn't need the extra protection, and
because vhost_vq_work_queue() calls

  vhost_worker_queue()
  |
  -> worker->ops->wakeup(worker)
     |
     -> vhost_task_wakeup()
        |
        -> vhost_task_wake()

while holding RCU and so can't sleep, i.e. can't take exit_mutex.

	rcu_read_lock();
	worker = rcu_dereference(vq->worker);
	if (worker) {
		queued = true;
		vhost_worker_queue(worker, work);
	}
	rcu_read_unlock();

Debugged-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/aKkLEtoDXKxAAWju@google.com
Link: https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: d96c77b ("KVM: x86: switch hugepage recovery thread to vhost_task")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request on Aug 27, 2025
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request on Sep 17, 2025

…lled

Make the "default" API for waking a vhost task safe against the underlying
task exiting due to a fatal signal. This fixes a bug in KVM x86 where KVM
attempts to wake an NX hugepage recovery task that exited before being
explicitly stopped, resulting in a use-after-free and thus crashes, hangs,
and other badness.

  Oops: general protection fault, probably for non-canonical address 0xff0e899fa1566052: 0000 [#1] SMP
  CPU: 51 UID: 0 PID: 53807 Comm: tee Tainted: G S O 6.17.0-smp--38183c31756a-next torvalds#826 NONE
  Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
  Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
  RIP: 0010:queued_spin_lock_slowpath+0x123/0x250
  Code: ... <48> 89 8c 02 c0 da 47 a2 83 79 08 00 75 08 f3 90 83 79 08 00 74 f8
  RSP: 0018:ffffbf55cffe7cf8 EFLAGS: 00010006
  RAX: ff0e899fff0e8562 RBX: 0000000000d00000 RCX: ffffa39b40aefac0
  RDX: 0000000000000030 RSI: fffffffffffffff8 RDI: ffffa39d0592e68c
  RBP: 0000000000d00000 R08: 00000000ffffff80 R09: 0000000400000000
  R10: ffffa36cce4fe401 R11: 0000000000000800 R12: 0000000000000003
  R13: 0000000000000000 R14: ffffa39d0592e68c R15: ffffa39b9e672000
  FS: 00007f233b2e9740(0000) GS:ffffa39b9e672000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f233b39fda0 CR3: 00000004d031f002 CR4: 00000000007726f0
  PKRU: 55555554
  Call Trace:
   <TASK>
   _raw_spin_lock_irqsave+0x50/0x60
   try_to_wake_up+0x4f/0x5d0
   set_nx_huge_pages+0xe4/0x1c0 [kvm]
   param_attr_store+0x89/0xf0
   module_attr_store+0x1e/0x30
   kernfs_fop_write_iter+0xe4/0x160
   vfs_write+0x2cb/0x420
   ksys_write+0x7f/0xf0
   do_syscall_64+0x6f/0x1f0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f233b4178b3
  R13: 0000000000000002 R14: 00000000226ff3d0 R15: 0000000000000002
   </TASK>

Handle VHOST_TASK_FLAGS_KILLED in vhost_task_wake() instead of forcing KVM
to solve the problem, as KVM would literally just add an equivalent flag,
along with a new lock to protect said flag.

In general, forcing simple usage of vhost task to care about signals _and_
take non-trivial action to do the right thing isn't developer friendly, and
is likely to lead to similar bugs in the future.

Keep the existing behavior for vhost (by calling __vhost_task_wake()
instead of vhost_task_wake()), as vhost_worker_killed() takes extra care to
stop and flush all workers, i.e. doesn't need the extra protection, and
because vhost_vq_work_queue() calls

  vhost_worker_queue()
  |
  -> worker->ops->wakeup(worker)
     |
     -> vhost_task_wakeup()
        |
        -> vhost_task_wake()

while holding RCU and so can't sleep, i.e. can't take exit_mutex.

	rcu_read_lock();
	worker = rcu_dereference(vq->worker);
	if (worker) {
		queued = true;
		vhost_worker_queue(worker, work);
	}
	rcu_read_unlock();

Debugged-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/aKkLEtoDXKxAAWju@google.com
Link: https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: d96c77b ("KVM: x86: switch hugepage recovery thread to vhost_task")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20250827194107.4142164-2-seanjc@google.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
sean-jc added a commit to sean-jc/linux that referenced this pull request on Nov 11, 2025