Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

From: Al Viro
Date: Tue Sep 26 2023 - 18:52:05 EST


On Tue, Sep 26, 2023 at 11:38:34AM +0200, Christoph Hellwig wrote:

> How?
>
> Old sequence before his patch:
>
> deactivate_locked_super()
> -> kill_anon_super()
> -> generic_shutdown_super()
> -> kill_super_notify()
> -> free_anon_bdev()
> -> kill_super_notify()
>
> New sequence with this patch:
>
> deactivate_locked_super()
> -> generic_shutdown_super()
> -> kill_super_notify()
> -> free_anon_bdev()
>

Before your patch: foo_kill_super() calls kill_anon_super(),
which calls kill_super_notify(), which removes the sucker from
the list, then frees ->s_fs_info. After your patch:
removal from the lists happens via the call of kill_super_notify()
*after* both of your methods had been called, while freeing
->s_fs_info happens from the method call. IOW, you've restored
the situation prior to "super: ensure valid info". The whole
point of that commit had been to make sure that we have nothing
in the lists with ->s_fs_info pointing to a freed object.

It's not about free_anon_bdev(); that part is fine - it's the
"we can drop the weird second call site of kill_super_notify()"
thing that is broken.

Al, still slogging through the rcu pathwalk races in the methods...
The latest catch: nfs_set_verifier() can get called on a dentry
that had just been seen to have positive parent, but is not
pinned down.
grab ->d_lock; OK, we know that dentry won't get freed under us
fetch ->d_parent->d_inode
pass that to nfs_verify_change_attribute()
... which assumes that inode it's been given is not NULL. Normally it
would've been - ->d_lock stabilizes ->d_parent, and negative dentries
obviously have no children. Except that we might've been just hit
by dentry_kill() due to eviction on memory pressure, got ->d_lock
right after that and proceeded to play with ->d_parent, just as
that parent is going through dentry_kill() from the same eviction on
memory pressure... If it gets to dentry_unlink_inode() before we get to
fetching ->d_parent->d_inode, nfs_verify_change_attribute(NULL, whatever)
is going to oops...