Re: [2.6.38-3.x] [BUG] soft lockup - CPU#X stuck for 23s! (vfs, autofs, vserver)

From: Herbert Poetzl
Date: Mon Sep 24 2012 - 07:48:36 EST


On Mon, Sep 24, 2012 at 07:23:55AM +0200, Paweł Sikora wrote:
> On Sunday 23 of September 2012 18:10:30 Linus Torvalds wrote:
>> On Sat, Sep 22, 2012 at 11:09 PM, Paweł Sikora <pluto@xxxxxxxxxxxxx> wrote:

>>> br_read_lock(vfsmount_lock);

>> The vfsmount_lock is a "local-global" lock, where a read-lock
>> is rather cheap and takes just a per-cpu lock, but the
>> downside is that a write-lock is *very* expensive, and can
>> cause serious trouble.
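
to illustrate the cost asymmetry described above, here is a minimal
userspace sketch of a "local-global" lock. it is not the kernel's
brlock implementation: the names br_lock_model, br_model_* and
NR_SLOTS are hypothetical, and pthread mutexes stand in for the
per-cpu locks. a reader takes one slot lock, a writer has to take
all of them.

```c
#include <assert.h>
#include <pthread.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

/* Userspace model only: one mutex per "CPU" slot. */
struct br_lock_model {
	pthread_mutex_t slot[NR_SLOTS];
};

static void br_model_init(struct br_lock_model *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		pthread_mutex_init(&l->slot[i], NULL);
}

/* Read side: cheap, touches only the caller's own slot. */
static void br_model_read_lock(struct br_lock_model *l, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	pthread_mutex_lock(&l->slot[slot]);
}

static void br_model_read_unlock(struct br_lock_model *l, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	pthread_mutex_unlock(&l->slot[slot]);
}

/*
 * Write side: expensive, takes every slot lock in a fixed order and
 * therefore waits for every reader. Returns the number of locks taken.
 */
static int br_model_write_lock(struct br_lock_model *l)
{
	int taken = 0;

	for (int i = 0; i < NR_SLOTS; i++) {
		pthread_mutex_lock(&l->slot[i]);
		taken++;
	}
	return taken;
}

static void br_model_write_unlock(struct br_lock_model *l)
{
	for (int i = NR_SLOTS - 1; i >= 0; i--)
		pthread_mutex_unlock(&l->slot[i]);
}
```

in this model the write side scales with the number of slots and
blocks behind any reader that holds its slot for a long time.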

>> And the write lock is taken by the [un]mount() paths. Do *not*
>> do crazy things. If you do some insane "unmount and remount
>> autofs" on a 1s granularity, you're doing insane things.

>> Why do you have that 1s timeout? Insane.

> the 1s unmount timeout is *only* for fast bug reproduction (within
> a few seconds after opteron startup) and for testing potential
> patches. normally, with a 60s timeout, it happens within a few
> minutes to hours (depending on machine i/o+cpu load) and makes the
> server unusable (permanent soft-lockup).

> can we redesign vserver's mnt_is_reachable() for better locking
> to avoid total soft-lockup?

currently we do:

	br_read_lock(&vfsmount_lock);
	root = current->fs->root;
	root_mnt = real_mount(root.mnt);
	point = root.dentry;

	while ((mnt != mnt->mnt_parent) && (mnt != root_mnt)) {
		point = mnt->mnt_mountpoint;
		mnt = mnt->mnt_parent;
	}

	ret = (mnt == root_mnt) && is_subdir(point, root.dentry);
	br_read_unlock(&vfsmount_lock);

and we have been considering moving the br_read_unlock()
to right before the is_subdir() call

if there are any suggestions how to achieve the same
with less locking I'm all ears ...

best,
Herbert

> BR,
> Paweł.

> ps).
> i'm adding Herbert to CC.