Re: [PATCH 1/5] freepgt: free_pgtables use vma list

From: David S. Miller
Date: Tue Mar 22 2005 - 17:49:12 EST


On Tue, 22 Mar 2005 21:51:39 +0000 (GMT)
Hugh Dickins <hugh@xxxxxxxxxxx> wrote:

> I still can't see what's wrong with the code that's already
> there. My brain is seizing up, I'm taking a break.

Ok, meanwhile I'll do a brain dump of what I think this
code should be doing.

Let's take an example free_pgd_range() call. Say the
address parameters are:

addr 0x10000
end 0xa4000
floor 0x00000
ceiling 0xb2000

(This example comes from my exit_mmap() VMA dump earlier
in this thread. If you disable the VMA skipping optimization
the first call to free_pgd_range() has these parameters.)

What ought this free_pgd_range() call do? This range of
addresses, from floor to ceiling, is smaller than a PMD_SIZE
(which on sparc64 is 1 << 23). Therefore it should clear
no PGD or PUD entries.

Yet, it does clear them, specifically:

free_pgd_range():
1) mask addr (0x10000) to PMD_MASK, addr is now 0
2) addr < floor (0x00000) test does not pass
3) mask ceiling (0xb2000) to PMD_MASK, ceiling is now 0 too
4) end - 1 > ceiling - 1 test does not pass
5) addr > end - 1 test does not pass either
6) We now loop one PGDIR_SIZE at a time from
addr (0x00000) to end (0xa4000), calling
down into...
free_pud_range():
1) addr=0, end=0xa4000, floor=0, ceiling=0
2) We loop one PUD_SIZE at a time from
addr (0x00000) to end (0xa4000), calling
down into...
free_pmd_range():
1) addr=0, end=0xa4000, floor=0, ceiling=0
2) We loop one PMD_SIZE at a time from
addr (0x00000) to end (0xa4000), calling
down into...
free_pte_range():

And later when we finish the loops in free_pmd_range()
and free_pud_range() we do pud_clear() and pgd_clear()
respectively, both wrong.

The source of the problems seems to be how ceiling began
at the top of the call chain as 0xb2000, but when we
masked it with PMD_MASK that set it to zero, which means
"top of address space" in these functions. That's not
what we want.

I added a quick hack to the simulator I posted, where
we mask ceiling in free_pgd_range(), I do it like this:

if (ceiling) {
ceiling &= PMD_MASK;
if (!ceiling)
return;
}

and things seem to behave. I'll try to analyze things
further and test this out on a real kernel, but all of
these adjustments at the top of free_pgd_range() really
start to look like pure spaghetti. :-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/