Re: [PATCH 3.2 38/52] Btrfs: fix race when listing an inode's xattrs

From: Ben Hutchings
Date: Wed Nov 25 2015 - 19:39:50 EST


On Wed, 2015-11-25 at 23:11 +0000, Luis Henriques wrote:
> On Tue, Nov 24, 2015 at 10:33:59PM +0000, Ben Hutchings wrote:
> > 3.2.74-rc1 review patch.ÂÂIf anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >
> > commit f1cd1f0b7d1b5d4aaa5711e8f4e4898b0045cb6d upstream.
> >
> > When listing a inode's xattrs we have a time window where we race against
> > a concurrent operation for adding a new hard link for our inode that makes
> > us not return any xattr to user space. In order for this to happen, the
> > first xattr of our inode needs to be at slot 0 of a leaf and the previous
> > leaf must still have room for an inode ref (or extref) item, and this can
> > happen because an inode's listxattrs callback does not lock the inode's
> > i_mutex (nor does the VFS does it for us), but adding a hard link to an
> > inode makes the VFS lock the inode's i_mutex before calling the inode's
> > link callback.
> >
> > If we have the following leafs:
> >
> > ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂLeaf X (has N items)ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂLeaf Y
> >
> > Â[ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]ÂÂ[ (257 XATTR_ITEM 12345), ... ]
> > ÂÂÂÂÂÂÂÂÂÂÂslot N - 2ÂÂÂÂÂÂÂÂÂslot N - 1ÂÂÂÂÂÂÂÂÂÂÂÂÂÂslot 0
> >
> > The race illustrated by the following sequence diagram is possible:
> >
> > ÂÂÂÂÂÂÂCPU 1ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂCPU 2
> >
> > Â btrfs_listxattr()
> >
> > ÂÂÂÂsearches for key (257 XATTR_ITEM 0)
> >
> > ÂÂÂÂgets path with path->nodes[0] == leaf X
> > ÂÂÂÂand path->slots[0] == N
> >
> > ÂÂÂÂbecause path->slots[0] is >=
> > ÂÂÂÂbtrfs_header_nritems(leaf X), it calls
> > ÂÂÂÂbtrfs_next_leaf()
> >
> > ÂÂÂÂbtrfs_next_leaf()
> > ÂÂÂÂÂÂreleases the path
> >
> > ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂadds key (257 INODE_REF 666)
> > ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂto the end of leaf X (slot N),
> > ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂand leaf X now has N + 1 items
> >
> > ÂÂÂÂÂÂsearches for the key (257 INODE_REF 256),
> > ÂÂÂÂÂÂwith path->keep_locks == 1, because that
> > ÂÂÂÂÂÂis the last key it saw in leaf X before
> > ÂÂÂÂÂÂreleasing the path
> >
> > ÂÂÂÂÂÂends up at leaf X again and it verifies
> > ÂÂÂÂÂÂthat the key (257 INODE_REF 256) is no
> > ÂÂÂÂÂÂlonger the last key in leaf X, so it
> > ÂÂÂÂÂÂreturns with path->nodes[0] == leaf X
> > ÂÂÂÂÂÂand path->slots[0] == N, pointing to
> > ÂÂÂÂÂÂthe new item with key (257 INODE_REF 666)
> >
> > ÂÂÂÂbtrfs_listxattr's loop iteration sees that
> > ÂÂÂÂthe type of the key pointed by the path is
> > ÂÂÂÂdifferent from the type BTRFS_XATTR_ITEM_KEY
> > ÂÂÂÂand so it breaks the loop and stops looking
> > ÂÂÂÂfor more xattr items
> > ÂÂÂÂÂÂ--> the application doesn't get any xattr
> > ÂÂÂÂÂÂÂÂÂÂlisted for our inode
> >
> > So fix this by breaking the loop only if the key's type is greater than
> > BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.
> >
> > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
> > [bwh: Backported to 3.2: s/found_key\.type/btrfs_key_type(\&found_key)/]
>
> Actually, in my backport to 3.16 I decided to keep the usage of
> 'found_key.type' instead, as the usage of btrfs_key_type() has been
> dropped with commit 962a298f3511 ("btrfs: kill the key type accessor
> helpers").
[...]

OK, that makes sense. Âbtrfs in 3.2 is pretty inconsistent about using
btrfs_key_type() anyway.

Ben.


--
Ben Hutchings
Unix is many things to many people,
but it's never been everything to anybody.

Attachment: signature.asc
Description: This is a digitally signed message part