Re: linux-next: cgroup_mount() falls asleep forever

From: Al Viro
Date: Wed Sep 24 2014 - 14:52:22 EST


On Wed, Sep 24, 2014 at 06:29:27PM +0400, Andrey Wagin wrote:
> 2014-09-24 14:31 GMT+04:00 Andrey Wagin <avagin@xxxxxxxxx>:
> > Hi All,
>
> The problem is in the following commit:
>
> commit 0c7bf3e8cab7900e17ce7f97104c39927d835469
> Author: Zefan Li <lizefan@xxxxxxxxxx>
> Date: Sat Sep 20 14:49:10 2014 +0800
>
> cgroup: remove redundant variable in cgroup_mount()
>
> Both pinned_sb and new_sb indicate if a new superblock is needed,
> so we can just remove new_sb.
>
> Note now we must check if kernfs_tryget_sb() returns NULL, because
> when it returns NULL, kernfs_mount() may still re-use an existing
> superblock, which is just allocated by another concurrent mount.
>
> Suggested-by: Tejun Heo <tj@xxxxxxxxxx>
> Signed-off-by: Zefan Li <lizefan@xxxxxxxxxx>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>

Lovely... First of all, that thing is obviously racy - there's nothing
to prevent another mount from happening between the kernfs_tryget_sb()
check and the kernfs_mount() call. Moreover, the kernfs_mount() calling
conventions are really atrocious - a pointer to bool just to indicate
that the superblock is new?
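
To make the window concrete, here is a toy userspace sketch, not kernel
code. Every name in it (toy_sb, toy_tryget, toy_mount, toy_replay_race) is
made up for illustration, and a single slot stands in for the superblock
list. It replays one bad interleaving in a single thread: mounter A sees
no superblock, mounter B creates one, and A's mount then reuses B's. The
point is only that "tryget returned NULL" does not imply "the mount
created a new superblock".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one slot stands in for the list of superblocks. */
struct toy_sb {
	int refs;
};

static struct toy_sb slot;          /* storage for the single superblock */
static struct toy_sb *live = NULL;  /* non-NULL once a superblock exists */

/* Hypothetical stand-in for "try to pin an existing superblock". */
struct toy_sb *toy_tryget(void)
{
	if (live)
		live->refs++;
	return live;
}

/* Hypothetical stand-in for mount: reuse the live sb or create one. */
struct toy_sb *toy_mount(bool *new_sb)
{
	*new_sb = (live == NULL);
	if (*new_sb) {
		slot.refs = 0;
		live = &slot;
	}
	live->refs++;
	return live;
}

/*
 * Replay the interleaving:
 *   A: tryget -> NULL
 *   B: mount  -> creates the superblock
 *   A: mount  -> reuses B's superblock
 * Returns true when A's NULL pin wrongly suggested a new superblock.
 */
bool toy_replay_race(void)
{
	struct toy_sb *pinned;
	bool a_new, b_new;

	live = NULL;
	pinned = toy_tryget();   /* A, step 1: nothing to pin */
	toy_mount(&b_new);       /* B sneaks in and creates the sb */
	toy_mount(&a_new);       /* A, step 2: gets B's sb, not a new one */

	assert(b_new);
	return pinned == NULL && !a_new;
}
```

Calling toy_replay_race() reports whether the two answers disagree. In
this model the only reliable answer to "is it new?" is the one produced
inside the mount step itself.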

Could somebody explain WTF the whole construction is trying to do? Not
to mention anything else, what *does* pinning a superblock protect
against? Suppose we have a superblock for the same root with non-NULL ns
and _that_ gets killed. We get hit by the same
percpu_ref_kill(&root->cgrp.self.refcnt);
so what's the point of pinned_sb? Might as well have just bumped the
refcount, superblock or no superblock. And no, delaying that kernfs_kill_sb()
does you no good whatsoever - again, pinned_sb might have nothing to do with
the superblock we are after.
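
The same objection as a toy userspace model, again not the cgroup code:
toy_root, toy_nssb, toy_kill_sb and toy_pin_wrong_sb are hypothetical
names, and the model assumes (as described above) that killing any
superblock of a root kills that root's refcount. Two superblocks share
one root; a pin held on one delays only that one's kill and does nothing
for the root when its sibling goes away.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_root {
	bool killed;            /* models the root's refcount being killed */
};

struct toy_nssb {
	struct toy_root *root;  /* several superblocks may share one root */
	int ns;                 /* 0 models a NULL namespace tag */
	int pins;               /* extra references held by a mounter */
};

/* Models a kill that is delayed while a pin is held; true if it ran. */
bool toy_kill_sb(struct toy_nssb *sb)
{
	if (sb->pins > 0)
		return false;
	sb->root->killed = true;
	return true;
}

/* Pin the ns == 0 superblock, kill its sibling, report the root's state. */
bool toy_pin_wrong_sb(void)
{
	struct toy_root root = { .killed = false };
	struct toy_nssb pinned = { .root = &root, .ns = 0, .pins = 1 };
	struct toy_nssb other = { .root = &root, .ns = 1, .pins = 0 };

	assert(!toy_kill_sb(&pinned));  /* the pinned sb is held back */
	toy_kill_sb(&other);            /* the sibling is not */
	return root.killed;             /* the root is gone regardless */
}
```

In this model, a reference taken on the root itself would cover both
cases, which is why pinning a particular superblock buys nothing extra.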