Re: [PATCH] x86: Construct 32 bit boot time page tables in nativeformat.

From: Ian Campbell
Date: Sun Jan 20 2008 - 11:45:30 EST



On Sun, 2008-01-20 at 00:07 +0100, Andi Kleen wrote:
> Ian Campbell <ijc@xxxxxxxxxxxxxx> writes:
> > +#ifdef CONFIG_X86_PAE
> > +err_no_pae:
> > + /* It is probably too early but we might as well try... */
>
> Without a low identity mapping early_printk will not work and printk
> definitely not.
>
> > +#ifdef CONFIG_PRINTK
>
> You should do the test in the 16 bit boot code. In fact it should
> already do it by testing the CPUID REQUIRED_MASK.

Indeed it does. I don't have any non-PAE to test it but I turned the
failure case into a simple jmp to hlt_loop since we ought never to get
here in any case.

> > +/*
> > + * Since a paravirt guest will never come down this path we want
> > + * native style page table accessors here.
> > + */
> > +#undef CONFIG_PARAVIRT
>
> Seems quite fragile. I'm sure that would hurt later.

The problem here is that we explicitly want native accessors because
it's too early to use the pv ops since we are still running P==V. A PV
kernel boot will never come down this path -- it is diverted earlier in
head_32.S so using the native versions are appropriate.

I'll try again to use the native_{make,set}_xxx functions but originally
I found the necessary variants weren't defined in all combinations of
PAE/not and PARAVIRT/not.

FWIW we use the same undef trick under arch/x86/boot too and this early
start of day stuff if fairly similar.

> > +static inline __init pte_t *early_pte_offset(pmd_t *pmd, unsigned long vaddr)
> > +{
> > + return ((pte_t *)(u32)(pmd_val(*pmd) & PAGE_MASK))
>
> That will break if the kernel is > 4GB won't it? Also same for pmd.

As hpa says we can't be above 4G at this point. Probably I can use some
variant of make_pte now though.

> > +static inline __init pmd_t *
> > +early_pmd_alloc(pgd_t *pgd_base, unsigned long vaddr, unsigned long *end)
> > +{
> > + pud_t *pud = early_pud_offset(pgd_base, vaddr);
> > +
> > +#ifdef CONFIG_X86_PAE
> > + if (!(pud_val(*pud) & _PAGE_PRESENT)) {
>
> Why not set it in the pgd which is identical? Also the proper test is !pgd_none()

I was trying to fit in with the native_foo stuff that is available and
happened to be using pud on my last attempt before I switched to the
#undef CONFIG_PARAVIRT approach. I'll switch to pgd if I can get it to
work.

pgd_none (and pud_none) are hardcoded to 0 in the 32 bit case (in
asm-generic/pgtable-nopud.h and asm-generic/pgtable-nopmd.h or
asm-x86/pgtable-3level.h). Presumably this is because at regular runtime
these entries are guaranteed to exist which isn't true this early at
startup.

In fact since we are always going to need a PMD in the PAE case there is
probably not much wrong with simply unconditionally allocating the pmd
at the start of early_pgtable_init().

> > +{
> > + pmd_t *pmd;
> > +
> > + pmd = early_pmd_alloc(pgd_base, vaddr, end);
> > + if (!(pmd_val(*pmd) & _PAGE_PRESENT)) {
>
> !pmd_none()

done (without the !)

> > +void __init early_pgtable_init(void)
> > +{
> > + unsigned long addr, end;
> > + pgd_t *pgd_base;
> > +
> > + pgd_base = __pavar(swapper_pg_dir);
> > + end = __pa_symbol(pg0);
>
> Are you sure there will be enough memory here? You might need to use
> an e820 allocator similar to x86-64.

True. However the assembly being replaced makes the same assumptions so
I don't think that should block this patch, it's a fixup that can be
made later.

> > - if (!*pte & _PAGE_PRESENT) {
> > - phys = *pte & PAGE_MASK;
> > + if (!(pte_val(*pte) & _PAGE_PRESENT)) {
>
> pte_present(). Ok the old code was wrong too, but no need to do that again.

Done.

> > @@ -298,7 +304,8 @@ void __init early_ioremap_reset(void)
> > static void __init __early_set_fixmap(enum fixed_addresses idx,
> > unsigned long phys, pgprot_t flags)
> > {
> > - unsigned long *pte, addr = __fix_to_virt(idx);
> > + unsigned long addr = __fix_to_virt(idx);
> > + pte_t *pte;
>
> Unrelated?

Nope, the return type of early_ioremap_pte() changed unsigned long ->
pte_t and that is what is assigned to pte.

I'll spin another version.

Ian.
--
Ian Campbell

"I go on working for the same reason a hen goes on laying eggs."
-- H. L. Mencken

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/