Skip to content

Commit d030925

Browse files
dmuszynsgregkh
authored andcommitted
crypto: qat - resolve race condition during AER recovery
[ Upstream commit 7d42e09 ] During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug. This results in a KFENCE bug notice. BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-torvalds#142): adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340 To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker. The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function. Fixes: d8cba25 ("crypto: qat - Intel(R) QAT driver framework") Cc: <stable@vger.kernel.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent 02fa834 commit d030925

File tree

1 file changed

+16
-6
lines changed

1 file changed

+16
-6
lines changed

drivers/crypto/qat/qat_common/adf_aer.c

Lines changed: 16 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -95,19 +95,28 @@ static void adf_device_reset_worker(struct work_struct *work)
9595
if (adf_dev_init(accel_dev) || adf_dev_start(accel_dev)) {
9696
/* The device hanged and we can't restart it so stop here */
9797
dev_err(&GET_DEV(accel_dev), "Restart device failed\n");
98-
if (reset_data->mode == ADF_DEV_RESET_ASYNC)
98+
if (reset_data->mode == ADF_DEV_RESET_ASYNC ||
99+
completion_done(&reset_data->compl))
99100
kfree(reset_data);
100101
WARN(1, "QAT: device restart failed. Device is unusable\n");
101102
return;
102103
}
103104
adf_dev_restarted_notify(accel_dev);
104105
clear_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
105106

106-
/* The dev is back alive. Notify the caller if in sync mode */
107-
if (reset_data->mode == ADF_DEV_RESET_SYNC)
108-
complete(&reset_data->compl);
109-
else
107+
/*
108+
* The dev is back alive. Notify the caller if in sync mode
109+
*
110+
* If device restart will take a more time than expected,
111+
* the schedule_reset() function can timeout and exit. This can be
112+
* detected by calling the completion_done() function. In this case
113+
* the reset_data structure needs to be freed here.
114+
*/
115+
if (reset_data->mode == ADF_DEV_RESET_ASYNC ||
116+
completion_done(&reset_data->compl))
110117
kfree(reset_data);
118+
else
119+
complete(&reset_data->compl);
111120
}
112121

113122
static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev,
@@ -140,8 +149,9 @@ static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev,
140149
dev_err(&GET_DEV(accel_dev),
141150
"Reset device timeout expired\n");
142151
ret = -EFAULT;
152+
} else {
153+
kfree(reset_data);
143154
}
144-
kfree(reset_data);
145155
return ret;
146156
}
147157
return 0;

0 commit comments

Comments
 (0)