Re: [PATCH] dax/hmem: Fix refcount leak in dax_hmem_probe()

From: Ira Weiny
Date: Fri Jun 02 2023 - 19:34:42 EST


Paul Cassella wrote:
> On Fri, 2 Jun 2023, Ira Weiny wrote:
> > Paul Cassella wrote:
> > > On Sat, 3 Dec 2022, Ira Weiny wrote:
> > > > On Sat, Dec 03, 2022 at 09:58:58AM +0000, Yongqiang Liu wrote:
>
> > > > > We should always call dax_region_put() whenever devm_create_dev_dax()
> > > > > succeed or fail to avoid refcount leak of dax_region. Move the return
> > > > > value check after dax_region_put().
>
> > > > I think dax_region_put is called from dax_region_unregister() automatically on
> > > > tear down.
>
> > > Note the reference dax_region_unregister() will be putting is the one
> > > devm_create_dev_dax() takes by kref_get(&dax_region->kref). I think
> > > dax_hmem_probe() needs to put its reference in the error case, as in the
> > > successful case.
>
> > Looking at this again I'm inclined to agree that something is wrong. But
> > I'm not sure this patch fixes it. anything.
>
> > When you say:
> >
> > > ... I think
> > > dax_hmem_probe() needs to put its reference in the error case, as in the
> > > successful case.
> >
> > ... which kref_get() is dax_hmem_probe() letting go?
>
> Isn't it letting go of the initial kref_init() reference from
> alloc_dax_region()?

Oh wow! I did not realize that about kref_init()... :-(

>
> Sorry, I had gone through the code a little carelessly yesterday. Now I
> think that kref_init() reference is the one that dax_hmem_probe() is
> dropping in the success case, and which still needs to be dropped in the
> error case.

yea ok...

>
> (If so, I think the alloc_dax_region() path that returns NULL on
> devm_add_action_or_reset() failure, releasing the kref_get reference, will
> leak the kref_init reference and the region.)

I see now. The ref is taken prior to the add action or reset to ensure
the dax_region does not go away should the platform device go away...

However, in all the call paths currently the device passed to
alloc_dax_region() can't go away prior to the dev_dax taking a reference.
So I don't think this extra reference is required. :-/

As you say the ref counting right now is not correct on error flows. But
I'm torn on the correct solution.

I think a small series broken out so it can be backported if needed would
be best.

Ira