Commit 5ab4eba
drm/xe: Process deferred GGTT node removals on device unwind
[ Upstream commit af2b588 ]
While we are indirectly draining our dedicated workqueue ggtt->wq
that we use to complete asynchronous removal of some GGTT nodes,
this happends as part of the managed-drm unwinding (ggtt_fini_early),
which could be later then manage-device unwinding, where we could
already unmap our MMIO/GMS mapping (mmio_fini).
This was recently observed during unsuccessful VF initialization:
[ ] xe 0000:00:02.1: probe with driver xe failed with error -62
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes)
[ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes)
[ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes)
and this was leading to:
[ ] BUG: unable to handle page fault for address: ffffc900058162a0
[ ] #PF: supervisor write access in kernel mode
[ ] #PF: error_code(0x0002) - not-present page
[ ] Oops: Oops: 0002 [#1] SMP NOPTI
[ ] Tainted: [W]=WARN
[ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe]
[ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe]
[ ] Call Trace:
[ ] <TASK>
[ ] xe_ggtt_clear+0xb0/0x270 [xe]
[ ] ggtt_node_remove+0xbb/0x120 [xe]
[ ] ggtt_node_remove_work_func+0x30/0x50 [xe]
[ ] process_one_work+0x22b/0x6f0
[ ] worker_thread+0x1e8/0x3d
Add managed-device action that will explicitly drain the workqueue
with all pending node removals prior to releasing MMIO/GSM mapping.
Fixes: 919bb54 ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250612220937.857-2-michal.wajdeczko@intel.com
(cherry picked from commit 89d2835)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>1 parent f161e90 commit 5ab4eba
1 file changed
+11
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
201 | 201 | | |
202 | 202 | | |
203 | 203 | | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
204 | 211 | | |
205 | 212 | | |
206 | 213 | | |
| |||
257 | 264 | | |
258 | 265 | | |
259 | 266 | | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
260 | 271 | | |
261 | 272 | | |
262 | 273 | | |
| |||
0 commit comments