Re: sparc64 pcpu failures...

From: Tejun Heo
Date: Fri Sep 18 2009 - 04:38:29 EST


Hello, David.

David Miller wrote:
> From: David Miller <davem@xxxxxxxxxxxxx>
> Date: Thu, 17 Sep 2009 22:18:07 -0700 (PDT)
>
>> Tejun, I just started seeing the following on sparc64:
>>
>> [ 56.422005] WARNING: at mm/vmalloc.c:1991 pcpu_get_vm_areas+0x1b4/0x5fc()
> ...
>> Might this be a result of:
>>
>> commit bcb2107fdbecef3de55d597d23453747af81ba88
>> Author: Tejun Heo <tj@xxxxxxxxxx>
>> Date: Fri Aug 14 15:00:53 2009 +0900
>>
>> sparc64: use embedding percpu first chunk allocator
>>
>> sparc64 currently allocates a large page for each cpu and partially
>> remap them into vmalloc area much like what lpage first chunk
>> allocator did. As a 4M page is used for each cpu, this results in
>> very large unit size and also adds TLB pressure due to the double
>> mapping of pages in the first chunk.
>>
>> This patch converts sparc64 to use the embedding percpu first chunk
>> allocator which now knows how to handle NUMA configurations. This
>> simplifies the code a lot, doesn't incur any extra TLB pressure and
>> results in better utilization of address space.
>>
>> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
>> Acked-by: David S. Miller <davem@xxxxxxxxxxxxx>
>>
>> Do you think?
>
> I've verified that reverting this makes the problem go away.

Yes, this would be the result of the new congruent sparse allocator.
Heh... I didn't really expect that WARN_ON() to actually trigger.
It means that the percpu units lying farthest apart were too far from
each other to fit into the vmalloc area, so percpu chunks couldn't be
allocated there. Checking... the vmalloc area on sparc64 is only 4G,
so yep, that is entirely possible.
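
To make the failing condition concrete, here is a rough userspace
sketch of the check, not the actual code in mm/vmalloc.c. The helper
name units_fit_in_vmalloc(), its parameters and the addresses in the
usage note are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true if all percpu
 * units, kept at the same relative offsets they have in the first
 * chunk, fit inside a vmalloc area of vmalloc_size bytes.
 */
static bool units_fit_in_vmalloc(const uint64_t *unit_addrs, size_t nr_units,
                                 uint64_t unit_size, uint64_t vmalloc_size)
{
    uint64_t lo, hi, span;
    size_t i;

    if (nr_units == 0)
        return true;

    /* Find the lowest and highest unit base addresses. */
    lo = hi = unit_addrs[0];
    for (i = 1; i < nr_units; i++) {
        if (unit_addrs[i] < lo)
            lo = unit_addrs[i];
        if (unit_addrs[i] > hi)
            hi = unit_addrs[i];
    }

    /* Distance between the bases of the two farthest units. */
    span = hi - lo;

    /*
     * The area needed runs from the base of the lowest unit to the
     * end of the highest one, i.e. span + unit_size. The comparison
     * is split in two to avoid overflowing the addition.
     */
    return span <= vmalloc_size && unit_size <= vmalloc_size - span;
}
```

With two units assumed to sit 8G apart and a 4G vmalloc area, the
helper returns false, which is the situation the WARN_ON() reports.
With the units 1G apart, it returns true.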

Hmmm... the congruent sparse allocator assumes that the vmalloc area
is larger, with some margin, than the distance between the farthest
physical memory nodes. On x86, powerpc and ia64, this holds with
sufficient margin. Would it be possible to modify the sparc64 memory
layout so that the assumption is satisfied? If that's not possible,
going back to lpage is an option, but in many ways the congruent
sparse allocator is better.

Thanks.

--
tejun