Re: [PATCH] Fix early_ioremap on x86-64

From: Ingo Molnar
Date: Sun Jan 20 2008 - 13:00:43 EST



* Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> Fix early_ioremap on x86-64
>
> [Venki, sorry for blaming PAT for this earlier. It was innocent.]
> [Note the patch is on top of git-x86 + gbpages applied. I think there
> will be a trivial reject without gbpages-direct applied first]
>
> I have had ACPI failures on several machines for a few days. Symptom was
> NUMA nodes not getting detected or, worse, cores not getting detected.
> They all came from ACPI not being able to read several of its tables.
> I finally bisected it down to Jeremy's "put _PAGE_GLOBAL into
> PAGE_KERNEL" change. With that the fix was fairly obvious. The problem
> was that early_ioremap() didn't use a "_all" flush that would affect
> the global PTEs too. So with global bits getting used everywhere now
> an early_ioremap would not actually flush a mapping if something else
> was mapped previously on that slot (which can happen with
> early_iounmap in between).
>
> This patch changes all flushes in init_64.c to be __flush_tlb_all()
> and fixes the problem here.

ah, very nice! This is the bad commit:

------------------------->
Subject: x86: fold _PAGE_GLOBAL into __PAGE_KERNEL
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>

With the iounmap problem resolved, it should be OK to always set
_PAGE_GLOBAL in __PAGE_KERNEL*.

[ Did this patch cause problems before? ]
<------------------------

Jeremy did suspect something about this change, as indicated in the
changelog. But because the change was so fine-grained, the bisection
almost directly led to the fix. _That_ i think clearly demonstrates the
power of bisection and fine-grained changes.

but note what the fundamental problem is - we've turned a previously
safe flushing API into an unsafe one. With _PAGE_GLOBAL set on all
kernel mappings, __flush_tlb() leaves those entries in the TLB, so it
will only be safe in the rarest of circumstances. There are some other
matches:

./mm/init_64.c: __flush_tlb();
./kernel/head64.c: __flush_tlb();

The boot identity mappings zapped do not have PGE set at the moment, but
they could in the future (once we do native pagetable setup straight
from flat mode) - and this is not a performance critical path anyway.

./kernel/cpu/mtrr/generic.c: __flush_tlb();
./kernel/cpu/mtrr/generic.c: __flush_tlb();

these include an open-coded version of __flush_tlb_all() so they are
safe.
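
To illustrate why an open-coded sequence of that kind is safe, here is a
toy userspace model (not kernel code). It assumes the usual pattern of
clearing CR4.PGE, flushing, and restoring CR4; all toy_* names are
hypothetical and only model the behaviour: a CR3 reload keeps global
entries, while changing CR4.PGE drops every entry.

```c
/* Toy userspace model, not kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CR4_PGE  (1UL << 7)	/* CR4.PGE is bit 7 on x86 */
#define TOY_TLB_SIZE 4

struct toy_tlb_entry {
	bool valid;
	bool global;	/* models _PAGE_GLOBAL in the cached PTE */
};

struct toy_cpu {
	unsigned long cr4;
	struct toy_tlb_entry tlb[TOY_TLB_SIZE];
};

/* Models a CR3 reload: only non-global entries are dropped. */
static void toy_reload_cr3(struct toy_cpu *cpu)
{
	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		if (!cpu->tlb[i].global)
			cpu->tlb[i].valid = false;
}

/* Models a CR4 write: changing PGE drops every entry, global or not. */
static void toy_write_cr4(struct toy_cpu *cpu, unsigned long val)
{
	if ((cpu->cr4 ^ val) & TOY_CR4_PGE)
		for (size_t i = 0; i < TOY_TLB_SIZE; i++)
			cpu->tlb[i].valid = false;
	cpu->cr4 = val;
}

/* Open-coded full flush: PGE off, plain flush, then restore CR4. */
static void toy_open_coded_flush_all(struct toy_cpu *cpu)
{
	unsigned long cr4;

	assert(cpu != NULL);
	cr4 = cpu->cr4;
	toy_write_cr4(cpu, cr4 & ~TOY_CR4_PGE);
	toy_reload_cr3(cpu);
	toy_write_cr4(cpu, cr4);
}

/* Number of entries still cached in the toy TLB. */
static size_t toy_count_valid(const struct toy_cpu *cpu)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		n += cpu->tlb[i].valid;
	return n;
}

/* Enable PGE and fill the TLB with alternating global/non-global entries. */
static void toy_fill(struct toy_cpu *cpu)
{
	cpu->cr4 = TOY_CR4_PGE;
	for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
		cpu->tlb[i].valid = true;
		cpu->tlb[i].global = (i % 2 == 0);
	}
}
```

After toy_fill(), toy_reload_cr3() alone leaves the two global entries
cached, while toy_open_coded_flush_all() leaves none and restores PGE.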

and we might as well make the non-PGE flush the 'special API'. I.e.
rename __flush_tlb() to __flush_tlb_partial() and rename
__flush_tlb_all() to __flush_tlb(). This makes it very apparent which
should be used by default and which does what.
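
A rough sketch of how the renamed pair would behave in the early_ioremap
slot-reuse case, again as a toy userspace model rather than kernel code.
The toy_* functions, the two-slot table and the addresses 0x1000/0x2000
are all made up for illustration; only the naming split (partial vs.
default full flush) comes from the proposal above.

```c
/* Toy userspace model, not kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define TOY_SLOTS 2

struct toy_pte { bool present; bool global; unsigned long phys; };
struct toy_tlb_entry { bool valid; bool global; unsigned long phys; };

static struct toy_pte toy_pgtable[TOY_SLOTS];
static struct toy_tlb_entry toy_tlb[TOY_SLOTS];

/* Proposed "special" API: drops only non-global entries. */
static void toy_flush_tlb_partial(void)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (!toy_tlb[i].global)
			toy_tlb[i].valid = false;
}

/* Proposed default API: drops everything, global entries included. */
static void toy_flush_tlb(void)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		toy_tlb[i].valid = false;
}

/* Install a mapping in a fixed slot, loosely like an early_ioremap slot. */
static void toy_map(int slot, unsigned long phys, bool global)
{
	assert(slot >= 0 && slot < TOY_SLOTS);
	toy_pgtable[slot] = (struct toy_pte){
		.present = true, .global = global, .phys = phys
	};
}

/* Clear the page-table entry; deliberately does not touch the TLB. */
static void toy_unmap(int slot)
{
	assert(slot >= 0 && slot < TOY_SLOTS);
	toy_pgtable[slot].present = false;
}

/* Translate via the TLB, filling it on a miss. Returns 0 if not present. */
static unsigned long toy_translate(int slot)
{
	assert(slot >= 0 && slot < TOY_SLOTS);
	if (!toy_tlb[slot].valid) {
		if (!toy_pgtable[slot].present)
			return 0;
		toy_tlb[slot] = (struct toy_tlb_entry){
			.valid = true,
			.global = toy_pgtable[slot].global,
			.phys = toy_pgtable[slot].phys
		};
	}
	return toy_tlb[slot].phys;
}

/* Map A, touch it, unmap, map B in the same slot, flush, translate. */
static unsigned long toy_remap_then_read(bool full_flush)
{
	toy_flush_tlb();		/* start from a clean TLB */
	toy_map(0, 0x1000, true);
	(void)toy_translate(0);		/* caches the global entry for A */
	toy_unmap(0);
	toy_map(0, 0x2000, true);
	if (full_flush)
		toy_flush_tlb();
	else
		toy_flush_tlb_partial();
	return toy_translate(0);
}
```

With the partial flush the slot still translates to the old address;
with the default flush it picks up the new mapping.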

Ingo