Re: "movb" for spin-unlock (was Re: namei() query)

From: Oliver Xymoron (oxymoron@waste.org)
Date: Sat Apr 22 2000 - 10:58:52 EST


On Sat, 22 Apr 2000, Jamie Lokier wrote:

> Linus Torvalds wrote:
> > I have conflicting reports about the safety of "movb" from Intel.
> > According to some people in there, "movb" is always safe, and there should
> > not be any need for any config option at all.
> >
> > However, at the same time my original contact at intel was Andy Glew, who
> > probably knows more about the ia32 core than anybody else I know. And Andy
> > says that yes "movb" is legal, but that some very early P6 steppings may
> > be buggy. And Andy is God.
>
> That comment in <asm-i386/spinlock.h> is rather tantalising. It says
> don't use "movb" because it doesn't work but gives no clues why.
>
> I still have the thread where this was hashed out. And it seemed very
> few people ended up understanding the precise reason for not using
> "movb". Not me :-(

There are very few things that could cause the movb to be a problem. For
instance, it can't be in the cache coherency protocol, as the unlock can
be as lazy as it likes and still be safe. My only guess is that somehow the
movb can get scheduled ahead of reads or writes inside the critical
section. If that's the case, then the whole coherency scheme is broken,
no? We'd need to rethink quite a number of things we've presumed safe.
My guess is the whole thing is apocryphal.
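
To make the two unlock styles under discussion concrete, here is a minimal user-space sketch in C11 atomics. It is not the kernel's code from <asm-i386/spinlock.h>; the names demo_spinlock, demo_lock, demo_unlock_store and demo_unlock_rmw are hypothetical and exist only for illustration. The assumption is that a release store typically compiles to a plain "mov" on ia32, while the exchange compiles to a locked read-modify-write instruction.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type for illustration only; 1 = held, 0 = free.
 * This is NOT the kernel's spinlock_t. */
typedef struct {
    atomic_uchar locked;
} demo_spinlock;

static void demo_lock(demo_spinlock *l)
{
    /* Acquire with an atomic exchange; spin while the old value was 1. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;
}

/* Unlock with a plain byte store: the "movb" style.
 * This is only safe if the store cannot become visible before the
 * reads and writes made inside the critical section. */
static void demo_unlock_store(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Unlock with an atomic read-modify-write: the conservative style,
 * which pays for a locked bus operation on every unlock. */
static void demo_unlock_rmw(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    (void)atomic_exchange_explicit(&l->locked, 0, memory_order_release);
}
```

Both variants leave the lock byte at 0. The only difference is whether the hardware is trusted to keep the plain store ordered after the critical section, which is exactly the property in question for the early P6 steppings.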
 
> Which is unfortunate, because I am trying to develop a model for machine
> reasoning about ia32 instruction sequences. To generate better code,
> and to check it. Even in user space, these SMP subtleties are
> important. And I want to analyse kernel code :-)
>
> So...
>
> > I'd hate to have a kernel that works 99% of the time but then has
> > occasional problems on some very rare machines that are really hard to
> > track down. But I'd _almost_ like to just make the movb the default and
> > have a CONFIG_BROKEN_P6_ORDERING option for the very very special
> > case.
>
> Could you ask Andy Glew which steppings are broken in this regard? Is
> there a documented erratum? They usually list the broken steppings. Is
> there a reliable test for the problem, preferably a fast one that can
> run at boot? (I doubt it).

If we had some more detailed info about it, we could probably come up with
a testcase that worked in ~1 sec and only ran for P6 stepping < n. I'd
first like some actual confirmation that there is a bug. I'll go take a
look at the errata. Failing that, we might revive Manfred's lock test
program and get anyone who still has a PPro to give it a run.
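
As a rough idea of what such a lock test could look like, here is a user-space sketch in C with pthreads. It is not Manfred's program, which is not reproduced in this thread; demo_lock_test, demo_worker, DEMO_ITERS and DEMO_MAX_THREADS are hypothetical names and values chosen for illustration. Several threads increment a plain counter under a spinlock that is released with a simple store, and the test checks that no increment was lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define DEMO_MAX_THREADS 16      /* arbitrary cap for this sketch */
#define DEMO_ITERS 100000L       /* arbitrary iteration count per thread */

static atomic_uchar demo_lock_byte; /* 1 = held, 0 = free */
static long demo_counter;           /* plain variable, protected only by the lock */

static void *demo_worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < DEMO_ITERS; i++) {
        /* Take the lock with an atomic exchange. */
        while (atomic_exchange_explicit(&demo_lock_byte, 1, memory_order_acquire))
            ;
        demo_counter++;
        /* Release with a plain store, the "movb" style unlock. */
        atomic_store_explicit(&demo_lock_byte, 0, memory_order_release);
    }
    return NULL;
}

/* Returns 0 if no increment was lost, 1 if the count is wrong,
 * and -1 if the threads could not all be created. */
static int demo_lock_test(int nthreads)
{
    pthread_t tid[DEMO_MAX_THREADS];
    int created = 0;

    assert(nthreads >= 1 && nthreads <= DEMO_MAX_THREADS);
    demo_counter = 0;
    atomic_store(&demo_lock_byte, 0);

    while (created < nthreads &&
           pthread_create(&tid[created], NULL, demo_worker, NULL) == 0)
        created++;
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    if (created != nthreads)
        return -1;
    return demo_counter == (long)nthreads * DEMO_ITERS ? 0 : 1;
}
```

A call such as demo_lock_test(2) only exercises the lock on the machine it runs on. A clean result does not prove the absence of an erratum, since a rare ordering bug may need far more iterations or a specific access pattern to show up.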

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 



