Re: [PATCH] dax/hmem: Fix refcount leak in dax_hmem_probe()

From: Dan Williams
Date: Sat Jun 03 2023 - 00:12:06 EST


Paul Cassella wrote:
> On Fri, 2 Jun 2023, Ira Weiny wrote:
> > Paul Cassella wrote:
> > > On Sat, 3 Dec 2022, Ira Weiny wrote:
> > > > On Sat, Dec 03, 2022 at 09:58:58AM +0000, Yongqiang Liu wrote:
>
> > > > > We should always call dax_region_put() whenever devm_create_dev_dax()
> > > > > succeed or fail to avoid refcount leak of dax_region. Move the return
> > > > > value check after dax_region_put().
>
> > > > I think dax_region_put is called from dax_region_unregister() automatically on
> > > > tear down.
>
> > > Note the reference dax_region_unregister() will be putting is the one
> > > devm_create_dev_dax() takes by kref_get(&dax_region->kref). I think
> > > dax_hmem_probe() needs to put its reference in the error case, as in the
> > > successful case.
>
> > Looking at this again I'm inclined to agree that something is wrong. But
> > I'm not sure this patch fixes it. anything.
>
> > When you say:
> >
> > > ... I think
> > > dax_hmem_probe() needs to put its reference in the error case, as in the
> > > successful case.
> >
> > ... which kref_get() is dax_hmem_probe() letting go?
>
> Isn't it letting go of the initial kref_init() reference from
> alloc_dax_region()?
>
> Sorry, I had gone through the code a little carelessly yesterday. Now I
> think that kref_init() reference is the one that dax_hmem_probe() is
> dropping in the success case, and which still needs to be dropped in the
> error case.
>
> (If so, I think the alloc_dax_region() path that returns NULL on
> devm_add_action_or_reset() failure, releasing the kref_get reference, will
> leak the kref_init reference and the region.)

Yes, *this* one looks valid. All the other patches in this thread had me
asking, "um, did you try it?".

Now, that said, the games that this init path is playing are hard to
follow and it should not be the case that alloc_dax_region() needs to
hold a reference on behalf of some future code path that might need it.

Lets make this clearer by moving the reference count management closer
to the object that needs it dax_region->ida. Where the extra references
were being taken on behalf of "static" regions just in case they needed
to be reference in dev_dax_release().

I also found a dax_mapping lifetime issue while testing this, so I
appreciate the focus here.