Re: [PATCH v2] nvmem: core: Fix race in nvmem_register()

From: Hector Martin
Date: Tue Jan 03 2023 - 09:57:35 EST


On 03/01/2023 23.22, Srinivas Kandagatla wrote:
>>>> drivers/nvmem/core.c | 32 +++++++++++++++++---------------
>>>> 1 file changed, 17 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
>>>> index 321d7d63e068..606f428d6292 100644
>>>> --- a/drivers/nvmem/core.c
>>>> +++ b/drivers/nvmem/core.c
>>>> @@ -822,11 +822,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>>> break;
>>>> }
>>>>
>>>> - if (rval) {
>>>> - ida_free(&nvmem_ida, nvmem->id);
>>>> - kfree(nvmem);
>>>> - return ERR_PTR(rval);
>>>> - }
>>>> + if (rval)
>>>> + goto err_gpiod_put;
>>>
>>> Why was gpiod changes added to this patch, that should be a separate
>>> patch/discussion, as this is not relevant to the issue that you are
>>> reporting.
>>
>> Because freeing the device also does a gpiod_put in the destructor, so
> This are clearly untested, And I dont want this to be in the middle to
> fix to the issue you are hitting.

I somehow doubt you tested any of these error paths either. Nobody tests
initialization error paths. That's why there was a gpio leak here to
begin with.

> We should always be careful about untested changes, in this case gpiod
> has some conditions to check before doing a put. So the patch is
> incorrect as it is.

Then the existing code is also incorrect as it is, because the device
release callback is doing the same gpiod_put() already. I just moved it
out since we are now registering the device later.

> These are not stupid nit picks your v1/v2 patches introduced memory
> leaks and regressions so i will not be picking up any patches that fall
> in that area.

v1 certainly had issues which you rightly pointed out. Now with v2 you
are nitpicking that I merged two codepaths that belong together, and
fixed a corner case bug in the process. If there is an actual problem,
please point it out. "Lol why did you fix this other bug that naturally
falls into the influence of the changes being made" is not constructive.

>> found a bug, I fixed it, I then fixed the issues you pointed out, and I
>> don't have the time nor energy to fight over this kind of nonsense next.
>
> I think its worth reading ./Documentation/process/submitting-patches.rst

Oh I've read it alright. You're not the first maintainer to have me lose
more faith in the kernel development process, this isn't my first rodeo
(hint: check MAINTAINERS, I'm also a maintainer). And I know it doesn't
have to be like this because other maintainers that are actually
reasonable and nice to work with do exist.

I'm going to note that right now this core subsystem code is broken in
the *success path*, while all the arguments here are about the *error
path*. In other words, there is a negligible change that any
mistakes/regressions I'm making here would actually impact people, while
the status quo is that the code is actively breaking people's systems.

So, are you going to actually point out the supposed regressions with v2
so we can actually fix this bug, or should I just drop this submission
and keep it in my downstream tree and you can continue to get bug
reports about this race condition?

- Hector