Re: Hang in wait_on_inode with SMP 2.1.87

Linus Torvalds (torvalds@transmeta.com)
23 Feb 1998 08:14:53 GMT


In article <Pine.LNX.3.96.980222142802.13384A-100000@kanga.eecs.umich.edu>,
Steve Hsieh <steveh@eecs.umich.edu> wrote:
>
>I think I have a similar problem, I believe starting around 2.1.8x.
>If there's heavy disk activity, whatever process is involved gets
>stuck, and I can't kill it. Unlike Carsten, though, it is repeatable
>-- if I do a 'cp -a /usr /mnt' where a different drive partition is
>mounted in /mnt, cp will hang.

Could you try two things:
- upgrade to 2.1.88 (unless you already have)
- test this "strange" patch to __wait_on_inode():

static void __wait_on_inode(struct inode * inode)
{
	struct wait_queue wait = { current, NULL };

	add_wait_queue(&inode->i_wait, &wait);
repeat:
	current->state = TASK_UNINTERRUPTIBLE;
+	__asm__ __volatile__("cpuid": : :"ax", "bx", "cx", "dx", "memory");
	if (inode->i_state & I_LOCK) {
		schedule();
		goto repeat;
	}
	remove_wait_queue(&inode->i_wait, &wait);
	current->state = TASK_RUNNING;
}

(The above is just a pseudo-patch, but as it's only one line you should
get the idea).

I haven't tested it myself, and for all I know it may be completely
bogus, but this is something that came up with the same function wrt
the buffer cache, where another ordering change made a difference to
some people. The only reason I can think of is a serialization thing,
and while I don't actually believe in it, it is certainly worth testing
if this is reasonably easily repeatable for some people.

(Intel documents "cpuid" as being a serializing instruction, so it will
force the CPU to not re-order anything around that particular place. I
currently cannot see how this could make a difference, but I'm not
completely infallible and the above is easy enough to test ;)
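
For anyone who wants to poke at the ordering idea outside the kernel, here
is a rough user-space sketch of the same loop shape: store the task state,
execute a serializing instruction, and only then load the lock bit. None of
this is kernel code. The names fake_task, fake_inode, serialize_cpu(),
fake_unlocker() and wait_on_fake_inode() are made up for illustration, the
constant values are arbitrary, and the callback merely stands in for
schedule() by pretending that the lock holder ran and released the lock.
On non-x86 builds the sketch assumes a GCC-style compiler and falls back to
a full memory barrier, which is not the same thing as "cpuid".

```c
#include <assert.h>

/* Illustrative values only; not taken from the kernel headers. */
enum { TASK_RUNNING = 0, TASK_UNINTERRUPTIBLE = 2 };
#define I_LOCK 1u

struct fake_task  { volatile int state; };
struct fake_inode { volatile unsigned int i_state; };

/* Hypothetical helper: execute a serializing instruction between the
 * store to the task state and the load of the lock bit. */
static void serialize_cpu(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0, ebx, ecx = 0, edx;

	/* cpuid writes all four registers, so all are declared as outputs. */
	__asm__ __volatile__("cpuid"
			     : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx)
			     :
			     : "memory");
	(void)ebx;
	(void)edx;
#else
	/* Assumption: a full barrier is an acceptable stand-in elsewhere. */
	__sync_synchronize();
#endif
}

/* Stand-in for schedule(): pretend the lock holder ran and unlocked. */
static void fake_unlocker(struct fake_inode *inode)
{
	inode->i_state &= ~I_LOCK;
}

/* Same shape as the patched loop; returns how often it "slept". */
static int wait_on_fake_inode(struct fake_task *task,
			      struct fake_inode *inode,
			      void (*sleep_fn)(struct fake_inode *))
{
	int sleeps = 0;

	assert(task && inode && sleep_fn);
	for (;;) {
		task->state = TASK_UNINTERRUPTIBLE;
		serialize_cpu();	/* the one added line */
		if (!(inode->i_state & I_LOCK))
			break;
		sleep_fn(inode);
		sleeps++;
	}
	task->state = TASK_RUNNING;
	return sleeps;
}
```

This only shows where the instruction sits relative to the store and the
load. It is single-threaded, so it cannot reproduce or rule out the hang
itself.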

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu