Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for qpnp-qg fuel gauge on Fairphone3 #117

Open
spasswolf opened this issue Jul 16, 2023 · 4 comments
Open

Initial support for qpnp-qg fuel gauge on Fairphone3 #117

spasswolf opened this issue Jul 16, 2023 · 4 comments

Comments

@spasswolf
Copy link

spasswolf commented Jul 16, 2023

I've pushed an early version of a driver for the fuel gauge on the fairphone3 (earlier hardware version) to the branch msm8953_v6.4_gpu_wlan_modem_ipa_cpufreq_qg of https://github.com/spasswolf/msm8953-linux.git.

commit a46ef5194c8f424bc6a337d203fc10978021896c (HEAD -> msm8953_v6.4_gpu_wlan_modem_ipa_cpufreq_qg, msm8953-linux/msm8953_v6.4_gpu_wlan_modem_ipa_cpufreq_qg)
Author: Bert Karwatzki <spasswolf@web.de>
Date:   Sun Jul 16 14:38:55 2023 +0200

    power: suppyly: qcom_qg: Initial qpnp-qg fuel gauge support.
    
    This commit adds initial support for the qpnp-qg fuel gauge based
    partly on the downstream driver[1]. So far voltage and current measurement
    work and the capacity is estimated using lookup tables adapted from
    the downstream kernel[1]. Interrupt support has not yet been added.
    
    [1]: https://code.fairphone.com/projects/fairphone-3/gpl.html
    
    Signed-off-by: Bert Karwatzki <spasswolf@web.de>

commit 8396191d3bf2267549e5fb3965366e1ef5e31dae
Author: Bert Karwatzki <spasswolf@web.de>
Date:   Sun Jul 16 15:18:06 2023 +0200

    drivers: iio: adc: qcom-spmi-adc5: Add bat_therm channel.
    
    The qpnp-qg fuel gauge driver in the downstream kernel [1] for the
    fairphone-fp3 uses this channel to read the battery temperature.
    
    [1] https://code.fairphone.com/projects/fairphone-3/gpl.html
    
    Signed-off-by: Bert Karwatzki <spasswolf@web.de>

What works is current and voltage measuring and capacity estimation:

$ cat /sys/class/power_supply/qcom-battery/uevent 
POWER_SUPPLY_NAME=qcom-battery
POWER_SUPPLY_TYPE=Battery
POWER_SUPPLY_TECHNOLOGY=Li-ion
POWER_SUPPLY_CAPACITY=94
POWER_SUPPLY_CURRENT_NOW=-344848
POWER_SUPPLY_VOLTAGE_NOW=4300504
POWER_SUPPLY_VOLTAGE_MIN_DESIGN=3400000
POWER_SUPPLY_VOLTAGE_MAX_DESIGN=4390000
POWER_SUPPLY_CHARGE_FULL_DESIGN=3056000
POWER_SUPPLY_TEMP=33186

Voltage is in µV and current in µA, negative current seems to show that we're charging (which seems to work even without charger support in kernel).

Edit: Capacity estimation seems to be off by about 3 percentage points when charging, perhaps because the measured voltage is used directly without correcting for internal resistance and because the lookup table used are those for the non-charging case.

@z3ntu
Copy link

z3ntu commented Jul 17, 2023

Wow, that's amazing!

One sidenote I wanna note that I've seen, for the bat_therm channel is that normally downstream qpnp-vadc driver has the hardcoded NTC (Negative Temperature Coefficient) lookup table replaced by a list for the resistor that is actually found in the battery. No clue how that would be solvable in mainline since they also just have the LUT (lookup table) hardcoded in the driver.

In the best case I think you'd be checking the id resistor first to identify one of multiple possible batteries (which are defined in dt), then for this battery you take the battery data and whatnot including the thermistor data which would then get used.

Also one more question for now, did you see any code downstream specific to pm7250b (used for FG in FP4) since that also uses the same driver (in theory at least)?

@spasswolf
Copy link
Author

Just looked qpnp-qg.c in the fairhpone-fp4 kernel. Things I noticed:
(1) Instead of qpnp_vadc_read the iio_channel_read_processed functions are used.
(2) There is a lot of conditionally compiled code for PM7250

$ grep '[^!]defined.*CONFIG_TCT_PM7250_COMMON' -r fp4-kernel/kernel/msm-4.19/drivers/power/supply/qcom/ | wc -l
226

@spasswolf
Copy link
Author

spasswolf commented Jul 20, 2023

To get a better estimate of the internal resistance I tried to use the method from qpnp-qg.c. Unfortunately the method seems to be inaccurate and slow here's the code which I do not really want to push yet.

diff --git a/drivers/power/supply/power_supply_sysfs.c b/drivers/power/supply/power_supply_sysfs.c
index 06e5b6b0e255..ad1d8b53d9e7 100644
--- a/drivers/power/supply/power_supply_sysfs.c
+++ b/drivers/power/supply/power_supply_sysfs.c
@@ -216,6 +216,7 @@ static struct power_supply_attr power_supply_attrs[] = {
 	POWER_SUPPLY_ATTR(MANUFACTURE_YEAR),
 	POWER_SUPPLY_ATTR(MANUFACTURE_MONTH),
 	POWER_SUPPLY_ATTR(MANUFACTURE_DAY),
+	POWER_SUPPLY_ATTR(ESR),
 	/* Properties of type `const char *' */
 	POWER_SUPPLY_ATTR(MODEL_NAME),
 	POWER_SUPPLY_ATTR(MANUFACTURER),
diff --git a/drivers/power/supply/qcom_qg.c b/drivers/power/supply/qcom_qg.c
index 15b530331a4c..e1e783943d3b 100644
--- a/drivers/power/supply/qcom_qg.c
+++ b/drivers/power/supply/qcom_qg.c
@@ -176,6 +176,7 @@ struct qcom_qg_chip;
 struct qcom_qg_ops {
 	int (*get_capacity)(struct qcom_qg_chip *chip, int *);
 	int (*get_temperature)(struct qcom_qg_chip *chip, int *);
+	int (*get_esr)(struct qcom_qg_chip *chip, int *);
 	int (*get_current)(struct qcom_qg_chip *chip, int *);
 	int (*get_voltage)(struct qcom_qg_chip *chip, int *);
 };
@@ -202,10 +203,15 @@ struct qcom_qg_chip {
 	/* No charger support, yet. */
 	struct power_supply *chg_psy;
 	struct mutex bus_lock;
+	// equivalent series resistance of batteryin mOhm, taken from
+	// device tree on probe but measured as in qpnp-qg.c for capacity
+	// calculation
+	int esr;
 	bool battery_missing;
 	int status;
 };
 
+
 /* from qg-util.c */
 static inline bool is_sticky_register(u32 addr)
 {
@@ -240,6 +246,23 @@ static int qcom_qg_read(struct qcom_qg_chip *chip, u32 addr, u8 *val, int len)
 	return 0;
 }
 
+/* from qg-util.c */
+int qcom_qg_read_raw_data(struct qcom_qg_chip *chip, int addr, u32 *data)
+{
+	int ret;
+	u8 reg[2] = {0};
+
+	ret = qcom_qg_read(chip, chip->qg_base + addr, &reg[0], 2);
+	if (ret < 0) {
+		dev_err(chip->dev, "Failed to read QG addr %d rc=%d\n", addr, ret);
+		return ret;
+	}
+
+	*data = reg[0] | (reg[1] << 8);
+
+	return ret;
+}
+
 /* from qg-util.c */
 static int qcom_qg_write(struct qcom_qg_chip *chip, u32 addr, u8 *val, int len)
 {
@@ -276,6 +299,27 @@ static int qcom_qg_masked_write(struct qcom_qg_chip *chip, int addr, u32 mask, u
 	return rc;
 }
 
+/* from qg-util.c */
+static int qcom_qg_master_hold(struct qcom_qg_chip *chip, bool hold)
+{
+	int rc;
+
+	/* clear the master */
+	rc = qcom_qg_masked_write(chip, chip->qg_base + QG_DATA_CTL1_REG,
+					MASTER_HOLD_OR_CLR_BIT, 0);
+	if (rc < 0)
+		return rc;
+
+	if (hold) {
+		/* 0 -> 1, hold the master */
+		rc = qcom_qg_masked_write(chip, chip->qg_base + QG_DATA_CTL1_REG,
+					MASTER_HOLD_OR_CLR_BIT,
+					MASTER_HOLD_OR_CLR_BIT);
+	}
+
+	return rc;
+}
+
 static int qcom_qg_get_current(struct qcom_qg_chip *chip, int *ibat_ua)
 {
 	int rc = 0, last_ibat = 0;
@@ -335,6 +379,131 @@ static int qcom_qg_get_voltage(struct qcom_qg_chip *chip, int *vbat_uv)
 	return rc;
 }
 
+#define MAX_ESR_RETRY_COUNT		10
+static int qcom_qg_get_esr(struct qcom_qg_chip *chip, int *esr)
+{
+	int ret, i, count, esr_sample_count = 3;
+	u8 esr_done_count, reg0, reg1;
+	u32 pre_esr_v[QG_MAX_ESR_COUNT], pre_esr_i[QG_MAX_ESR_COUNT],
+	    post_esr_v[QG_MAX_ESR_COUNT], post_esr_i[QG_MAX_ESR_COUNT];
+	int diff_i, diff_v, esr_sample[QG_MAX_ESR_COUNT], esr_avg = 0, esr_avg_2 = 0, actual_sample_count = 0;
+
+	ret = qcom_qg_master_hold(chip, true);
+	if (ret < 0) {
+		return ret;
+	}
+
+	for(i = 0; i < esr_sample_count; i++) {
+		ret = qcom_qg_masked_write(chip,
+			chip->qg_base + QG_ESR_MEAS_TRIG_REG,
+			HW_ESR_MEAS_START_BIT, HW_ESR_MEAS_START_BIT);
+		if (ret < 0) {
+			continue;
+		}
+
+		esr_done_count = reg0 = reg1 = 0;
+		do {
+			/* delay for ESR processing to complete */
+			msleep(50);
+
+			esr_done_count++;
+
+			ret = qcom_qg_read(chip,
+				chip->qg_base + QG_STATUS1_REG, &reg0, 1);
+			if (ret < 0)
+				continue;
+
+			ret = qcom_qg_read(chip,
+				chip->qg_base + QG_STATUS4_REG, &reg1, 1);
+			if (ret < 0)
+				continue;
+
+			/* check ESR-done status */
+			if (!(reg1 & ESR_MEAS_IN_PROGRESS_BIT) &&
+					(reg0 & ESR_MEAS_DONE_BIT)) {
+				break;
+			}
+		} while (esr_done_count < MAX_ESR_RETRY_COUNT);
+
+		if (esr_done_count == MAX_ESR_RETRY_COUNT) {
+			dev_info(chip->dev, "Failed to get esr in %d iterations", esr_done_count);
+			msleep(100);
+			continue;
+		}
+
+		/* found a valid ESR, read pre-post data */
+		ret = qcom_qg_read_raw_data(chip, QG_PRE_ESR_V_DATA0_REG, &pre_esr_v[i]);
+		if (ret < 0) {
+			continue;
+		}
+		pre_esr_v[i] = V_RAW_TO_UV(pre_esr_v[i]);
+
+		ret = qcom_qg_read_raw_data(chip, QG_PRE_ESR_I_DATA0_REG, &pre_esr_i[i]);
+		if (ret < 0) {
+			continue;
+		}
+		pre_esr_i[i] = I_RAW_TO_UA(sign_extend32(pre_esr_i[i], 15));
+
+		ret = qcom_qg_read_raw_data(chip, QG_POST_ESR_V_DATA0_REG, &post_esr_v[i]);
+		if (ret < 0) {
+			continue;
+		}
+		post_esr_v[i] = V_RAW_TO_UV(post_esr_v[i]);
+
+		ret = qcom_qg_read_raw_data(chip, QG_POST_ESR_I_DATA0_REG, &post_esr_i[i]);
+		if (ret < 0) {
+			continue;
+		}
+		post_esr_i[i] = I_RAW_TO_UA(sign_extend32(post_esr_i[i], 15));
+
+		diff_v = abs(post_esr_v[i] - pre_esr_v[i]);
+		diff_i = abs(post_esr_i[i] - pre_esr_i[i]);
+		if (diff_i == 0) {
+			*esr = 0;
+			dev_info(chip->dev, "diff_i == 0, cannot calculate esr\n");
+			continue;
+		}
+
+		esr_sample[i] = DIV_ROUND_CLOSEST(diff_v * 1000, diff_i);
+		dev_info(chip->dev, "esr_sample[%d] = %d\n", i, esr_sample[i]);
+		esr_avg += esr_sample[i];
+		actual_sample_count++;
+		msleep(100);
+	}
+
+	if (actual_sample_count == 0) {
+		dev_info(chip->dev, "not enought samples to calculate esr\n");
+		return -EINVAL;
+	}
+
+	esr_avg = esr_avg / actual_sample_count;
+
+	// calculate a second average where we only take samples
+	// withing 10% of the original average
+	count = 0;
+	for(i = 0; i < actual_sample_count; i++) {
+		if (abs(esr_sample[i] - esr_avg) <= esr_avg / 10) {
+			esr_avg_2 += esr_sample[i];
+			count++;
+		} 
+	}
+
+	if (count == 0) {
+		dev_info(chip->dev, "not enough samples close enough to average to calculate esr\n");
+		return -EINVAL;
+	}
+
+	chip->esr = esr_avg_2 / count;
+
+	*esr = chip->esr;
+
+	ret = qcom_qg_master_hold(chip, false);
+	if (ret < 0) {
+		return ret;
+	}
+	return 0;
+}
+
 static int qcom_qg_get_temperature(struct qcom_qg_chip *chip, int *temp)
 {
 	int ret;
@@ -380,6 +549,7 @@ static int qcom_qg_get_capacity(struct qcom_qg_chip *chip, int *soc)
 static const struct qcom_qg_ops ops_qg = {
 	.get_capacity = qcom_qg_get_capacity,
 	.get_temperature = qcom_qg_get_temperature,
+	.get_esr = qcom_qg_get_esr,
 	.get_current = qcom_qg_get_current,
 	.get_voltage = qcom_qg_get_voltage,
 };
@@ -409,6 +579,7 @@ static enum power_supply_property qcom_qg_props[] = {
 	POWER_SUPPLY_PROP_VOLTAGE_MAX_DESIGN,
 	POWER_SUPPLY_PROP_CHARGE_FULL_DESIGN,
 	POWER_SUPPLY_PROP_TEMP,
+	POWER_SUPPLY_PROP_ESR,
 };
 
 static int qcom_qg_get_property(struct power_supply *psy,
@@ -445,6 +616,9 @@ static int qcom_qg_get_property(struct power_supply *psy,
 	case POWER_SUPPLY_PROP_TEMP:
 		ret = chip->ops->get_temperature(chip, &val->intval);
 		break;
+	case POWER_SUPPLY_PROP_ESR:
+		ret = chip->ops->get_esr(chip, &val->intval);
+		break;
 	default:
 		dev_err(chip->dev, "invalid property: %d\n", psp);
 		return -EINVAL;
@@ -532,6 +706,9 @@ static int qcom_qg_probe(struct platform_device *pdev)
 		return ret;
 	}
 
+	// set esr from devicetree, esr in mOhm
+	chip->esr = chip->batt_info->factory_internal_resistance_uohm / 1000;
+
 	chip->nvmem = devm_nvmem_device_get(chip->dev, "pmi632_sdam_2");
 	if (IS_ERR(chip->nvmem)) {
 		dev_err(chip->dev, "Failed to get nvmem\n");
diff --git a/include/linux/power_supply.h b/include/linux/power_supply.h
index a427f13c757f..fd9dc5015a55 100644
--- a/include/linux/power_supply.h
+++ b/include/linux/power_supply.h
@@ -170,6 +170,7 @@ enum power_supply_property {
 	POWER_SUPPLY_PROP_MANUFACTURE_YEAR,
 	POWER_SUPPLY_PROP_MANUFACTURE_MONTH,
 	POWER_SUPPLY_PROP_MANUFACTURE_DAY,
+	POWER_SUPPLY_PROP_ESR,
 	/* Properties of type `const char *' */
 	POWER_SUPPLY_PROP_MODEL_NAME,
 	POWER_SUPPLY_PROP_MANUFACTURER,

It usually it little smaller resistance than my initial guess of 125mOhm:

POWER_SUPPLY_NAME=qcom-battery
POWER_SUPPLY_TYPE=Battery
POWER_SUPPLY_TECHNOLOGY=Li-ion
POWER_SUPPLY_CAPACITY=93
POWER_SUPPLY_CURRENT_NOW=-293884
POWER_SUPPLY_VOLTAGE_NOW=4323277
POWER_SUPPLY_VOLTAGE_MIN_DESIGN=3400000
POWER_SUPPLY_VOLTAGE_MAX_DESIGN=4390000
POWER_SUPPLY_CHARGE_FULL_DESIGN=3056000
POWER_SUPPLY_TEMP=31127
POWER_SUPPLY_ESR=99

but the individual sample vary:

[ 9625.368326] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[0] = 0
[ 9625.684309] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[1] = 122
[ 9625.996298] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[2] = 101
[ 9626.308290] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[3] = 113
[ 9626.620282] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[4] = 113
[ 9626.932262] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[5] = 101
[ 9627.244261] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[6] = 110
[ 9627.556247] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[7] = 95
[ 9627.868233] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[8] = 107
[ 9628.180226] qcom-qg 200f000.spmi:pmic@2:qg@4800: esr_sample[9] = 110

If the internal resistance does not depend on the SOC one could do a measurement
in qcom_qg_probe and write it to the qcom_qg_chip structure.

@z3ntu
Copy link

z3ntu commented Sep 23, 2023

Any news on this driver?

M0Rf30 pushed a commit to M0Rf30/linux that referenced this issue Nov 26, 2023
Using netconsole netpoll_poll_dev could be called from interrupt
context, thus using disable_irq() would cause the following kernel
warning with CONFIG_DEBUG_ATOMIC_SLEEP enabled:

  BUG: sleeping function called from invalid context at kernel/irq/manage.c:137
  in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 10, name: ksoftirqd/0
  CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G        W         5.15.42-00075-g816b502b2298-dirty msm8953-mainline#117
  Hardware name: aml (r1) (DT)
  Call trace:
   dump_backtrace+0x0/0x270
   show_stack+0x14/0x20
   dump_stack_lvl+0x8c/0xac
   dump_stack+0x18/0x30
   ___might_sleep+0x150/0x194
   __might_sleep+0x64/0xbc
   synchronize_irq+0x8c/0x150
   disable_irq+0x2c/0x40
   stmmac_poll_controller+0x140/0x1a0
   netpoll_poll_dev+0x6c/0x220
   netpoll_send_skb+0x308/0x390
   netpoll_send_udp+0x418/0x760
   write_msg+0x118/0x140 [netconsole]
   console_unlock+0x404/0x500
   vprintk_emit+0x118/0x250
   dev_vprintk_emit+0x19c/0x1cc
   dev_printk_emit+0x90/0xa8
   __dev_printk+0x78/0x9c
   _dev_warn+0xa4/0xbc
   ath10k_warn+0xe8/0xf0 [ath10k_core]
   ath10k_htt_txrx_compl_task+0x790/0x7fc [ath10k_core]
   ath10k_pci_napi_poll+0x98/0x1f4 [ath10k_pci]
   __napi_poll+0x58/0x1f4
   net_rx_action+0x504/0x590
   _stext+0x1b8/0x418
   run_ksoftirqd+0x74/0xa4
   smpboot_thread_fn+0x210/0x3c0
   kthread+0x1fc/0x210
   ret_from_fork+0x10/0x20

Since [0] .ndo_poll_controller is only needed if driver doesn't or
partially use NAPI. Because stmmac does so, stmmac_poll_controller
can be removed fixing the above warning.

[0] commit ac3d9dd ("netpoll: make ndo_poll_controller() optional")

Cc: <stable@vger.kernel.org> # 5.15.x
Fixes: 47dd7a5 ("net: add support for STMicroelectronics Ethernet controllers")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/1c156a6d8c9170bd6a17825f2277115525b4d50f.1696429960.git.repk@triplefau.lt
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
M0Rf30 pushed a commit that referenced this issue Apr 1, 2024
…terfaces

[ Upstream commit cb5942b ]

wilc_netdev_cleanup currently triggers a KASAN warning, which can be
observed on interface registration error path, or simply by
removing the module/unbinding device from driver:

echo spi0.1 > /sys/bus/spi/drivers/wilc1000_spi/unbind

==================================================================
BUG: KASAN: slab-use-after-free in wilc_netdev_cleanup+0x508/0x5cc
Read of size 4 at addr c54d1ce by task sh/86

CPU: 0 PID: 86 Comm: sh Not tainted 6.8.0-rc1+ #117
Hardware name: Atmel SAMA5
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x34/0x58
 dump_stack_lvl from print_report+0x154/0x500
 print_report from kasan_report+0xac/0xd8
 kasan_report from wilc_netdev_cleanup+0x508/0x5cc
 wilc_netdev_cleanup from wilc_bus_remove+0xc8/0xec
 wilc_bus_remove from spi_remove+0x8c/0xac
 spi_remove from device_release_driver_internal+0x434/0x5f8
 device_release_driver_internal from unbind_store+0xbc/0x108
 unbind_store from kernfs_fop_write_iter+0x398/0x584
 kernfs_fop_write_iter from vfs_write+0x728/0xf88
 vfs_write from ksys_write+0x110/0x1e4
 ksys_write from ret_fast_syscall+0x0/0x1c

[...]

Allocated by task 1:
 kasan_save_track+0x30/0x5c
 __kasan_kmalloc+0x8c/0x94
 __kmalloc_node+0x1cc/0x3e4
 kvmalloc_node+0x48/0x180
 alloc_netdev_mqs+0x68/0x11dc
 alloc_etherdev_mqs+0x28/0x34
 wilc_netdev_ifc_init+0x34/0x8ec
 wilc_cfg80211_init+0x690/0x910
 wilc_bus_probe+0xe0/0x4a0
 spi_probe+0x158/0x1b0
 really_probe+0x270/0xdf4
 __driver_probe_device+0x1dc/0x580
 driver_probe_device+0x60/0x140
 __driver_attach+0x228/0x5d4
 bus_for_each_dev+0x13c/0x1a8
 bus_add_driver+0x2a0/0x608
 driver_register+0x24c/0x578
 do_one_initcall+0x180/0x310
 kernel_init_freeable+0x424/0x484
 kernel_init+0x20/0x148
 ret_from_fork+0x14/0x28

Freed by task 86:
 kasan_save_track+0x30/0x5c
 kasan_save_free_info+0x38/0x58
 __kasan_slab_free+0xe4/0x140
 kfree+0xb0/0x238
 device_release+0xc0/0x2a8
 kobject_put+0x1d4/0x46c
 netdev_run_todo+0x8fc/0x11d0
 wilc_netdev_cleanup+0x1e4/0x5cc
 wilc_bus_remove+0xc8/0xec
 spi_remove+0x8c/0xac
 device_release_driver_internal+0x434/0x5f8
 unbind_store+0xbc/0x108
 kernfs_fop_write_iter+0x398/0x584
 vfs_write+0x728/0xf88
 ksys_write+0x110/0x1e4
 ret_fast_syscall+0x0/0x1c
 [...]

David Mosberger-Tan initial investigation [1] showed that this
use-after-free is due to netdevice unregistration during vif list
traversal. When unregistering a net device, since the needs_free_netdev has
been set to true during registration, the netdevice object is also freed,
and as a consequence, the corresponding vif object too, since it is
attached to it as private netdevice data. The next occurrence of the loop
then tries to access freed vif pointer to the list to move forward in the
list.

Fix this use-after-free thanks to two mechanisms:
- navigate in the list with list_for_each_entry_safe, which allows to
  safely modify the list as we go through each element. For each element,
  remove it from the list with list_del_rcu
- make sure to wait for RCU grace period end after each vif removal to make
  sure it is safe to free the corresponding vif too (through
  unregister_netdev)

Since we are in a RCU "modifier" path (not a "reader" path), and because
such path is expected not to be concurrent to any other modifier (we are
using the vif_mutex lock), we do not need to use RCU list API, that's why
we can benefit from list_for_each_entry_safe.

[1] https://lore.kernel.org/linux-wireless/ab077dbe58b1ea5de0a3b2ca21f275a07af967d2.camel@egauge.net/

Fixes: 8399918 ("staging: wilc1000: use RCU list to maintain vif interfaces list")
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240212-wilc_rework_deinit-v1-1-9203ae56c27f@bootlin.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants