Re: 4level page tables for Linux

From: Andi Kleen
Date: Tue Oct 12 2004 - 14:13:05 EST


On Tue, Oct 12, 2004 at 11:48:22AM -0700, Dave Hansen wrote:
> @@ -110,13 +115,18 @@ int install_file_pte(struct mm_struct *m
> unsigned long addr, unsigned long pgoff, pgprot_t prot)
> {
> ...
> + pml4 = pml4_offset(mm, addr);
> +
> + spin_lock(&mm->page_table_lock);
> + pgd = pgd_alloc(mm, pml4, addr);
> + if (!pgd)
> + goto err_unlock;
>
> Locking isn't needed for access to the pml4? This is a wee bit
> different from pgd's and I didn't see any documentation about it
> anywhere. Could be confusing.

No, the lock is still needed. Thanks for catching this, that was indeed
wrong.

>
> +++ linux-2.6.9rc4-4level/mm/memory.c
> ...
> +#undef inline
> +#define inline
> +unsigned long caddr;
>
> Is this just for debugging?

Yes, that's a leftover I forgot to remove. Will go in the next
version.

(btw t here is another leftover in there, I will remove it too)

>
> +static inline void free_one_pml4(struct mmu_gather *tlb, pml4_t *pml4,
> + unsigned long addr, unsigned long end)
> +{
> ...
> + do {
> + caddr = addr;
> + free_one_pgd(tlb, pgd);
> + free++;
> + addr = (addr + PGDIR_SIZE) & PGDIR_MASK;
> + pgd++;
> + } while (addr && addr < end);
>
> If someone attempts to clear an address which is in the top PGDIR_SIZE
> bytes of memory, this will overflow. Is that an issue?

That is what the addr && is for. Yes, overflows happen and afaik they
are all handled.

>
> There also seems to be quite a bit of churn in the copy_*_range()
> functions that isn't completely related to the pml4 changes. Should
> that get broken out?

It's related. The old function just wasn't scalable to 4 level
page tables at all (i really tried but it was too ugly) so I rewrote it
completely. Splitting it would be difficult because the 3level version
would be already very different from the 4level version.

Thanks for the review.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/