Re: [PATCH v2] nvmem: core: Fix race in nvmem_register()

From: Srinivas Kandagatla
Date: Tue Jan 03 2023 - 09:23:34 EST




On 03/01/2023 13:48, Hector Martin wrote:
On 03/01/2023 21.41, Srinivas Kandagatla wrote:


On 03/01/2023 11:44, Hector Martin wrote:
nvmem_register() currently registers the device before adding the nvmem
cells, which creates a race window where consumers may find the nvmem
device (and not get PROBE_DEFERred), but then fail to find the cells and
error out.

Move device registration to the end of nvmem_register(), to close the
race.

Observed when the stars line up on Apple Silicon machines with the (not
yet upstream, but trivial) spmi nvmem driver and the macsmc-rtc client:

[ 0.487375] macsmc-rtc macsmc-rtc: error -ENOENT: Failed to get rtc_offset NVMEM cell

Fixes: eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
Cc: stable@xxxxxxxxxxxxxxx
Reviewed-by: Eric Curtin <ecurtin@xxxxxxxxxx>
Signed-off-by: Hector Martin <marcan@xxxxxxxxx>
---

What has changed since v1?

What you told me to. I'm trying to get a silly bug fixed after you
ignored my original patch until Russell independently discovered and
submitted a fix for the same thing. I think we've wasted enough
developer time here already.

You should remember that maintainers have other regular job and holidays apart from maintaining. When you last sent the patch we are already in near/middle of merge window. If I had applied your original patch without proper review, It would have introduced more regressions. Be patient and we/I understand your concern.

Its always possible that multiple developers hit same bug and endup in multiple patches, there is no way to avoid this unless every developer checks for similar patches on the list.



drivers/nvmem/core.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 321d7d63e068..606f428d6292 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -822,11 +822,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
break;
}

- if (rval) {
- ida_free(&nvmem_ida, nvmem->id);
- kfree(nvmem);
- return ERR_PTR(rval);
- }
+ if (rval)
+ goto err_gpiod_put;

Why was gpiod changes added to this patch, that should be a separate
patch/discussion, as this is not relevant to the issue that you are
reporting.

Because freeing the device also does a gpiod_put in the destructor, so
This are clearly untested, And I dont want this to be in the middle to fix to the issue you are hitting.
We should always be careful about untested changes, in this case gpiod has some conditions to check before doing a put. So the patch is incorrect as it is.

doing this is correct in every other instance below and maintains
existing behavior, and it just so happens that this instance converges
into the same codepath so it is correct to merge it, and it just so
happens that the gpiod put was missing in this path to begin with so
this becomes a drive-by bugfix.

If you don't like it I can remove it (i.e. reintroduce the bug for no
good reason) and you can submit this fix yourself, because I have no
incentive to waste time submitting a separate patch to fix a GPIO leak
in an error path corner case in a subsystem I don't own and I have much
bigger things to spend my (increasingly lower and lower) willingness to
fight for upstream submissions than this.

Seriously, what is wrong with y'all kernel people. No other open source
project wastes contributors' time with stupid nitpicks like this. I

These are not stupid nit picks your v1/v2 patches introduced memory leaks and regressions so i will not be picking up any patches that fall in that area.

found a bug, I fixed it, I then fixed the issues you pointed out, and I
don't have the time nor energy to fight over this kind of nonsense next.

I think its worth reading ./Documentation/process/submitting-patches.rst


thanks,
Srini
Do you want bugs fixed or not?


nvmem->read_only = device_property_present(config->dev, "read-only") ||
config->read_only || !nvmem->reg_write;
@@ -837,20 +834,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)

dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);

- rval = device_register(&nvmem->dev);
- if (rval)
- goto err_put_device;
-
if (nvmem->nkeepout) {
rval = nvmem_validate_keepouts(nvmem);
if (rval)
- goto err_device_del;
+ goto err_gpiod_put;
}

if (config->compat) {
rval = nvmem_sysfs_setup_compat(nvmem, config);
if (rval)
- goto err_device_del;
+ goto err_gpiod_put;
}

if (config->cells) {
@@ -867,6 +860,15 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
if (rval)
goto err_remove_cells;

+ rval = device_register(&nvmem->dev);
+ if (rval) {
+ nvmem_device_remove_all_cells(nvmem);
+ if (config->compat)
+ nvmem_sysfs_remove_compat(nvmem, config);
+ put_device(&nvmem->dev);
+ return ERR_PTR(rval);
+ }
+
blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);

return nvmem;
@@ -876,10 +878,10 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
err_teardown_compat:
if (config->compat)
nvmem_sysfs_remove_compat(nvmem, config);
-err_device_del:
- device_del(&nvmem->dev);
-err_put_device:
- put_device(&nvmem->dev);
+err_gpiod_put:
+ gpiod_put(nvmem->wp_gpio);
+ ida_free(&nvmem_ida, nvmem->id);
+ kfree(nvmem);

return ERR_PTR(rval);
}
--
2.35.1




- Hector