Re: [PATCH RFC 7/7] libfs: Re-arrange locking in offset_iterate_dir()
From: Jan Kara
Date: Thu Feb 15 2024 - 12:19:13 EST
On Thu 15-02-24 12:00:08, Liam R. Howlett wrote:
> * Jan Kara <jack@xxxxxxx> [240215 08:16]:
> > On Tue 13-02-24 16:38:08, Chuck Lever wrote:
> > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > >
> > > Liam says that, unlike with xarray, once the RCU read lock is
> > > released ma_state is not safe to re-use for the next mas_find() call.
> > > But the RCU read lock has to be released on each loop iteration so
> > > that dput() can be called safely.
> > >
> > > Thus we are forced to walk the offset tree with fresh state for each
> > > directory entry. mt_find() can do this for us, though it might be a
> > > little less efficient than maintaining ma_state locally.
> > >
> > > Since offset_iterate_dir() doesn't build ma_state locally any more,
> > > there's no longer a strong need for offset_find_next(). Clean up by
> > > rolling these two helpers together.
> > >
> > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >
> > Well, in general I think even xas_next_entry() is not safe to use how
> > offset_find_next() was using it. Once you drop rcu_read_lock(),
> > xas->xa_node could go stale. But since you're holding inode->i_rwsem when
> > using offset_find_next() you should be protected from concurrent
> > modifications of the mapping (whatever the underlying data structure is) -
> > that's what makes xas_next_entry() safe AFAIU. Isn't that enough for the
> > maple tree? Am I missing something?
>
> If you are stopping, you should be pausing the iteration. Although this
> works today, it's not how it should be used because if we make changes
> (ie: compaction requires movement of data), then you may end up with a
> UAF issue. We'd have no way of knowing you are depending on the tree
> structure to remain consistent.
I see. But we have versions of these structures that have locking external
to the structure itself, don't we? Then how do you imagine serializing the
background operations like compaction? As much as I agree your argument is
"theoretically clean", it seems a bit like a trap and there are definitely
xarray users that are going to be broken by this (e.g.
tag_pages_for_writeback())...
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR