Re: 2.6.0-test3-mm1: scheduling while atomic (ext3?)

From: Andi Kleen
Date: Wed Aug 13 2003 - 08:20:40 EST


On Wed, Aug 13, 2003 at 01:48:45PM +0100, Alan Cox wrote:
> On Mer, 2003-08-13 at 13:10, Andi Kleen wrote:
> > > Beats me, but then the prefetch code in 2.6 seems broken from
> > > 5 seconds of inspection anyway. We are testing the XMM feature
> > > and using prefetchnta for Athlon, that's wrong for lots of Athlon
> > > processors that don't have XMM but do have prefetch/prefetchw,
> > > (which btw also seem to work properly on all these processors
> > > while prefetchnta seems to do funky things)
> >
> > The early Athlon Specific test was not done to avoid too much bloat.
> > (three alternatives instead of two)
>
> Let's replace working code with broken macros whoooo.. progress. Lots of
> Athlons don't have XMM, most of the older ones where prefetch has the
> most impact in fact. (The XMM using ones have the hw prefetcher too).

hw prefetch has nothing to do with how the linux kernel uses prefetch.
It's only using it for data structures that cannot be handled by
the auto prefetcher.

[except the broken 3dnow! copy that was never enabled]


>
> > Most Athlons in existence should have XMM already and the rest works.
>
> Lots don't have XMM

All XPs have.

>
> > You can hardly call that broken.
>
> I just did. It's worse than 2.4 behaviour.

In 2.4, users of distribution kernels never got any prefetch at all.
That is what was really broken.

>
> > That's done for write prefetches correctly.
> > (as Intel does not have a write prefetch)
>
> Actually it's iffy too. 3Dnow doesn't imply prefetchw. You

My AMD manual lists it as part of 3dnow. If a CPU advertises 3dnow!
but doesn't have the instruction, it's broken.


> must test 3Dnow && vendor==AMD && Athlon. (K6 prefetchw
> is slower than not using it, other 3Dnow chips don't have it
> eg the Cyrix MII which may explain a couple of things. I don't

I would consider the MII broken then. setup should clear the 3dnow
bit.

> see anywhere we mask the 3Dnow property by these but I've not
> dug through the CPU code right now to see if we have a "3Dnowplus"
> type definition we can check.

there is 3dnowext, set on Athlons, but K6 has prefetchw too and
it

But if you only want Athlon you can check for X86_FEATURE_K7.
The problem is that it doesn't include K8, and K8
has prefetchw too (alternative currently only allows a single
bit, not a bitmask). It is better to either clear 3dnow on the MII
or define a new pseudo bit that means working and useful
prefetchw.


>
> I suspect the best way to do prefetch cleanly would be something
> like this
>
> #if defined(CONFIG_MK7)
> alternative_input("prefetch" or "prefetchnta")
> #else
> alternative_input(ASM_NOP4 or "prefetchnta");
> #endif

No, for weird combinations you define a new pseudo CPUID capability
bit, check for that in the CPU detection and use that in the alternative.
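
Roughly, as a userspace sketch of the idea and not kernel code: the
struct, the FEAT_* values, the GOOD_PREFETCHW name and the detection
policy (3dnow on an AMD K7 or K8) are all made up here for illustration,
following the cases discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: capabilities are bits in one word. */
enum cpu_vendor { VENDOR_AMD, VENDOR_CYRIX, VENDOR_INTEL };

#define FEAT_3DNOW          (1u << 0)   /* reported by CPUID */
#define FEAT_XMM            (1u << 1)   /* reported by CPUID */
#define FEAT_K7             (1u << 2)   /* family flag: Athlon */
#define FEAT_K8             (1u << 3)   /* family flag: K8 */
#define FEAT_GOOD_PREFETCHW (1u << 31)  /* pseudo bit, never set by CPUID */

struct cpu_info {
    enum cpu_vendor vendor;
    uint32_t caps;
};

/* Run once during CPU detection: fold the odd combination into one bit. */
void detect_prefetchw(struct cpu_info *c)
{
    assert(c);
    bool amd_k7_or_k8 = c->vendor == VENDOR_AMD &&
                        (c->caps & (FEAT_K7 | FEAT_K8));

    if ((c->caps & FEAT_3DNOW) && amd_k7_or_k8)
        c->caps |= FEAT_GOOD_PREFETCHW;
    else
        c->caps &= ~FEAT_GOOD_PREFETCHW;
}

/* What a single-bit alternative would test at patch time. */
bool cpu_has(const struct cpu_info *c, uint32_t bit)
{
    return (c->caps & bit) != 0;
}
```

The point is that the user of the bit only ever tests one bit; all the
vendor and family logic stays in the detection code.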

If you really want 3 way alternative you can just define a macro
for it. The basic data structure supports it - the macro
just needs to have two .altinstructions records and two replacement codes.
But I doubt it is worth it for this case.
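
To illustrate the two-record idea, here is a toy userspace model, not the
real .altinstructions layout: instruction bytes are replaced by strings,
and the struct, FEAT_* values and function name are invented for this
sketch. It assumes records for the same slot are applied in table order,
so a later matching record overrides an earlier one; which alternative
should win on a CPU with both bits is a policy choice made here only for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FEAT_3DNOW (1u << 0)
#define FEAT_XMM   (1u << 1)

#define SLOT_LEN 16   /* room reserved for the patched "instruction" */

/* One patch record: a single capability bit and its replacement. */
struct alt_record {
    uint32_t feature;         /* bit that enables this record */
    const char *replacement;  /* stand-in for replacement code bytes */
};

/*
 * Patch one slot at "boot": the slot starts with the default code and
 * every record whose bit is set overwrites it, in table order.
 */
void apply_alternatives(char slot[SLOT_LEN], const struct alt_record *recs,
                        size_t n, uint32_t caps)
{
    for (size_t i = 0; i < n; i++) {
        if (caps & recs[i].feature) {
            assert(strlen(recs[i].replacement) < SLOT_LEN);
            strcpy(slot, recs[i].replacement);
        }
    }
}
```

With a default of "nop" and the records {XMM, "prefetchnta"} followed by
{3DNOW, "prefetch"}, that gives three outcomes from two single-bit records.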

No stinkin' ifdefs please, that would break the whole concept.

>
> Ideally we want a 3 way patch table to fix up at boot time but the if
> case at least gets us back to desirable situations. Also if I remember
> the prefetch exception thing rightly you can misalign the prefetch
> instruction as a workaround.

Nope, no misalignment. All it does is handle the exception
using __ex_table and jump to the next instruction.

[the exceptions are very rare, they need very specific circumstances
in the CPU to trigger, so it's ok to make it slow]

The only trap is that you have to add exception table sorting too...
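
Why the sorting matters, as a minimal userspace sketch: the struct and
function names are made up here and this is not the kernel's
implementation. It assumes the fault handler finds the fixup with a binary
search on the faulting address, which only works if the table is ordered,
and that entries added from other sections can land out of order.

```c
#include <assert.h>
#include <stdlib.h>

/* One entry: address of the instruction that may fault, and where to resume. */
struct ex_entry {
    unsigned long insn;
    unsigned long fixup;
};

static int cmp_ex(const void *a, const void *b)
{
    const struct ex_entry *x = a, *y = b;
    return (x->insn > y->insn) - (x->insn < y->insn);
}

/* Done once at boot, before any lookup. */
void sort_ex_table(struct ex_entry *t, size_t n)
{
    assert(t || n == 0);
    qsort(t, n, sizeof(*t), cmp_ex);
}

/* Binary search by faulting address; returns the fixup, or 0 if none. */
unsigned long search_ex_table(const struct ex_entry *t, size_t n,
                              unsigned long addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (t[mid].insn == addr)
            return t[mid].fixup;
        if (t[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

For the prefetch case the fixup would simply be the address of the
instruction following the prefetch.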

-Andi
