Re: More on the pentium workaround - the gotchas

Linus Torvalds (torvalds@transmeta.com)
Sat, 15 Nov 1997 11:55:54 -0800 (PST)


On Sat, 15 Nov 1997, Alan Cox wrote:
> > From: "Charles M. Hannum" <mycroft@MIT.EDU>
> > Subject: BSDI patch for Pentium workaround has problems
> >
> > [I sent the body of this to people at BSDI and Intel after looking at
> > the official release version of the BSDI patch.]
> >
> > In addition to the concerns I posted on bugtraq regarding handling of
> > INTO and BOUND instructions, and the (albeit minor) differences in
> > handling INT $0, INT $1, INT $2, and INT $6 from user code, the new
> > revision of the BSDI patch fails in two additional ways:
> >
> > It directly accesses a linear address in user space using the kernel
> > segment descriptors, ignoring that the process may be in VM86 mode or
> > 16-bit protected mode. (You might be able to ignore protected mode if
> > you don't allow the user to create segment descriptors. We do to
> > support WINE and WABI.) Not only will it therefore get the PC (%eip)
> > fixup wrong in these modes, but it may also cause an unhandled page
> > fault in kernel space, which will cause the kernel to crash. This is
> > highly suboptimal.

This is the Intel patch - I sent it out to the kernel list with comments
about some of the shortcomings of the patch. Charles is right about the
problems, although they don't actually apply to the current Linux version
of the patch. They _do_ apply to the Intel one, exactly because the Intel
one tries to be more clever, and fails in subtle ways.
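
To make the segment-translation point concrete, here is a rough sketch in
C of how the linear address of the user's instruction depends on the mode
the process was in. It is an illustration only and is not taken from any
of the patches discussed; the structure and its field names are made up
for this example.

```c
#include <stdint.h>

/* Hypothetical snapshot of the saved user state. The field names are
 * invented for this sketch and match no real kernel structure. */
struct saved_user_state {
    uint32_t eip;       /* saved instruction pointer */
    uint16_t cs;        /* saved CS selector (or real-mode style segment) */
    int      vm86;      /* nonzero if EFLAGS.VM was set */
    uint32_t cs_base;   /* base from the CS descriptor (protected mode) */
    int      cs_32bit;  /* descriptor D bit: 1 = 32-bit code segment */
};

/* Linear address of the instruction the user was executing. */
static uint32_t user_insn_linear(const struct saved_user_state *s)
{
    if (s->vm86) {
        /* VM86: segment * 16 + 16-bit offset, no descriptor involved. */
        return ((uint32_t)s->cs << 4) + (s->eip & 0xffff);
    }
    if (!s->cs_32bit) {
        /* 16-bit protected mode: only the low 16 bits of eip count. */
        return s->cs_base + (s->eip & 0xffff);
    }
    /* 32-bit protected mode; with a flat segment cs_base is 0. */
    return s->cs_base + s->eip;
}
```

Only in the last case, with a flat segment, does the raw %eip happen to be
the linear address, which is the one case a handler that skips the
translation gets right.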

> > If you're going to look at the user instruction (which you *need* to
> > do to properly handle BOUND), then you must do the segment
> > translations correctly. Note that there's a race condition here in
> > SMP systems, but in practice this is minor; if the user changes the
> > instruction while we're doing the fixup, the fixup will do something
> > not quite right, but should not create a security hole.

The "bound" instruction is pretty much uninteresting - I don't think
anybody really uses it. I tend to think that it is MUCH worse to try to be
overly clever than to be simple and get the thing working for all normal
cases. That's why I actually like the "stupid" patch by Hans Lermen that
just increments eip by one for the "int3" case, because while that patch
is conceptually "wrong", it actually never gets things _really_ messed up
like the clever patches can.
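
For illustration, the simple approach might look roughly like the C sketch
below. This is not Hans Lermen's actual patch. It assumes the workaround
turns accesses to the low IDT entries into page faults, so the handler can
work out the vector from the faulting address; the names trap_frame,
idt_fault_vector and fixup_eip are hypothetical. A page fault saves the
address of the faulting instruction, whereas a real int3 trap saves the
address after it, hence the increment.

```c
#include <stdint.h>

#define IDT_ENTRY_SIZE 8u   /* a 32-bit protected-mode IDT gate is 8 bytes */
#define VECTOR_INT3    3    /* breakpoint exception */

/* Hypothetical saved-register frame; only eip matters here. */
struct trap_frame {
    uint32_t eip;
};

/* Return the vector whose IDT entry was touched, or -1 if fault_addr is
 * outside the protected part of the IDT. idt_base and protected_entries
 * are assumptions of this sketch. */
static int idt_fault_vector(uint32_t fault_addr, uint32_t idt_base,
                            uint32_t protected_entries)
{
    uint32_t off = fault_addr - idt_base;  /* wraps to a huge value if below base */

    if (off >= protected_entries * IDT_ENTRY_SIZE)
        return -1;
    return (int)(off / IDT_ENTRY_SIZE);
}

/* The "stupid" fix-up: skip the one-byte int3 (opcode 0xcc) without ever
 * reading user memory. Assumes the one-byte encoding was used. */
static void fixup_eip(struct trap_frame *regs, int vector)
{
    if (vector == VECTOR_INT3)
        regs->eip += 1;
}
```

Because nothing here looks at the user's instruction bytes, there is no
segment translation to get wrong and no user address to fault on.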

> > I include below three pieces of mail from me about this on bugtraq.
> > (Note that my suggested way of reexecuting the instruction actually
> > can't work correctly in a SMP system, but I include it here for
> > completeness. Basically, the user could change the instruction before
> > it's reexecuted to be something that doesn't trap, then do a bunch of
> > things to cause the cache to be completely flushed, and do the hanging
> > instruction again while we're still pointing to the fully mapped IDT,
> > causing the hang. We could try to work around this using the trace
> > flag to force an exception, but in protected mode the user can change
> > the trace flag.)

Note that you don't even need to use SMP to have problems: you can use
self-modifying code on a UP Pentium. Something like this is likely to
cause _major_ problems if the BSDI patches really do re-map the IDT (I
also thought that it might be a good idea to re-map the IDT to make the
CPU itself do the right thing, but I'm now of the opinion that trying to
do that is just stupid).

	movl $0xc8c70ff0,1f	# little-endian store: bytes f0 0f c7 c8 land at 1:
1:	int3			# already prefetched, so int3 is what executes

which will take the int3 trap (because it has been pre-fetched), but then
if the page fault handler notices that it was an int3 trap, switches back
to the original IDT and re-executes it, the CPU will now fetch the newly
written instruction and hang the machine. Oops.
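
The sequence can be sketched with a toy model in C. This is only an
illustration of the ordering problem, not a description of how the Pentium
prefetch queue really works; toy_cpu and its helpers are invented for the
example. The byte sequence f0 0f c7 c8 is one encoding of the locked
cmpxchg8b with a register operand that triggers the hang.

```c
#include <stdint.h>
#include <string.h>

enum insn { INSN_INT3, INSN_F00F, INSN_OTHER };

/* Classify the instruction starting at p (4 bytes readable). */
static enum insn decode(const uint8_t *p)
{
    static const uint8_t f00f[4] = { 0xf0, 0x0f, 0xc7, 0xc8 };

    if (memcmp(p, f00f, sizeof f00f) == 0)
        return INSN_F00F;       /* the sequence that hangs the CPU */
    if (p[0] == 0xcc)
        return INSN_INT3;       /* one-byte breakpoint */
    return INSN_OTHER;
}

/* Toy model: the code bytes in memory plus a stale prefetched copy. */
struct toy_cpu {
    uint8_t mem[4];         /* bytes at label 1: */
    uint8_t prefetched[4];  /* what was fetched before the store */
};

/* Snapshot memory into the prefetch copy. */
static void prefetch(struct toy_cpu *c)
{
    memcpy(c->prefetched, c->mem, sizeof c->mem);
}

/* First execution uses the stale prefetched bytes. */
static enum insn execute_prefetched(const struct toy_cpu *c)
{
    return decode(c->prefetched);
}

/* Re-execution after the handler returns fetches from memory again. */
static enum insn reexecute(const struct toy_cpu *c)
{
    return decode(c->mem);
}
```

If the bytes at the label are overwritten after prefetch() has run,
execute_prefetched() still reports int3 while reexecute() reports the
hanging sequence, which is exactly the window a re-executing handler
opens up.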

The more I think about this, the more I start to believe that Hans Lermen's
patch is the best one after all. It's too simple-minded to get all cases
correct, but it never does anything really bad (it essentially only breaks
on code that is bad in the first place).

Linus