Re: [PATCH] dax: make sure inodes are flushed before destroy cache

From: Ira Weiny
Date: Mon Feb 14 2022 - 18:13:04 EST


On Mon, Feb 14, 2022 at 12:09:54PM -0800, Dan Williams wrote:
> On Mon, Feb 14, 2022 at 9:59 AM Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> >
> > On Fri, Feb 11, 2022 at 11:11:11PM -0800, Tong Zhang wrote:
> > > A bug can be triggered by the following command:
> > >
> > > $ modprobe nd_pmem && modprobe -r nd_pmem
> > >
> > > [ 10.060014] BUG dax_cache (Not tainted): Objects remaining in dax_cache on __kmem_cache_shutdown()
> > > [ 10.060938] Slab 0x0000000085b729ac objects=9 used=1 fp=0x000000004f5ae469 flags=0x200000000010200(slab|head|node)
> > > [ 10.062433] Call Trace:
> > > [ 10.062673] dump_stack_lvl+0x34/0x44
> > > [ 10.062865] slab_err+0x90/0xd0
> > > [ 10.063619] __kmem_cache_shutdown+0x13b/0x2f0
> > > [ 10.063848] kmem_cache_destroy+0x4a/0x110
> > > [ 10.064058] __x64_sys_delete_module+0x265/0x300
> > >
> > > This is caused by dax_fs_exit() not flushing inodes before destroying the
> > > cache. To fix this issue, call rcu_barrier() before destroying the cache.
> >
> > I don't doubt that this fixes the bug. However, I can't help but think this is
> > hiding a bug, or perhaps a missing step, in the kmem_cache layer? As far as I
> > can see dax does not call call_rcu() and only uses srcu not rcu? I was tempted
> > to suggest srcu_barrier() but dax does not call call_srcu() either.
>
> This rcu_barrier() is associated with the call_rcu() in destroy_inode().

Ok yea.

>
> While kern_unmount() does a full synchronize_rcu() after clearing
> ->mnt_ns, any pending destroy_inode() callbacks need to be flushed
> before the kmem_cache is destroyed.
>
> > So I'm not clear about what is really going on and why this fixes it. I know
> > that dax is not using srcu in a standard way so perhaps this helps in a way I
> > don't quite grok? If so perhaps a comment here would be in order?
>
> Looks like a common pattern I missed that all filesystem exit paths implement.

I think a comment would be in order, especially since it looks like every
other FS has one:

fs/ext4/super.c:

...
/*
 * Make sure all delayed rcu free inodes are flushed before we
 * destroy cache.
 */
rcu_barrier();
...
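
To illustrate why the ordering matters, here is a userspace toy model of the
pattern. It is only a sketch, not kernel code: every toy_* name is made up for
illustration, toy_call_rcu() stands in for the call_rcu() in destroy_inode(),
toy_rcu_barrier() for rcu_barrier(), and toy_cache_destroy() for the
"Objects remaining" check done at kmem_cache_destroy() time.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a kmem_cache: it only counts live objects. */
struct toy_cache {
	int live;
};

/* Toy stand-in for an inode allocated from that cache. */
struct toy_inode {
	struct toy_cache *cache;
	struct toy_inode *next;	/* link in the pending deferred-free list */
};

/* Deferred frees that have been queued but have not run yet. */
static struct toy_inode *pending;

static struct toy_inode *toy_alloc(struct toy_cache *c)
{
	struct toy_inode *i = malloc(sizeof(*i));

	if (!i)
		return NULL;
	i->cache = c;
	i->next = NULL;
	c->live++;
	return i;
}

/* Models call_rcu(): the free is queued, not performed. */
static void toy_call_rcu(struct toy_inode *i)
{
	i->next = pending;
	pending = i;
}

/* Models rcu_barrier(): run every queued free before returning. */
static void toy_rcu_barrier(void)
{
	while (pending) {
		struct toy_inode *i = pending;

		pending = i->next;
		i->cache->live--;
		free(i);
	}
}

/* Models the check at cache destroy time: objects still allocated. */
static int toy_cache_destroy(struct toy_cache *c)
{
	return c->live;
}

/* Walks through the unload sequence, with the barrier in place. */
static void toy_demo(void)
{
	struct toy_cache cache = { 0 };
	struct toy_inode *inode = toy_alloc(&cache);

	assert(inode != NULL);
	toy_call_rcu(inode);			/* inode "destroyed", free deferred */
	assert(toy_cache_destroy(&cache) == 1);	/* destroying now would complain */
	toy_rcu_barrier();			/* flush the deferred free */
	assert(toy_cache_destroy(&cache) == 0);	/* cache is now empty */
}
```

Calling toy_demo() from a main() walks through both cases: the object is still
counted against the cache until the barrier has run, which is the situation the
slab warning in the report describes.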

Anyway ok.

Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx>

Thanks for looking Dan,
Ira