Re: [PATCH] mm/nommu.c: Dynamic alloc/free percpu area for nommu

From: graff yang
Date: Sun Mar 21 2010 - 22:34:05 EST


On Sat, Mar 20, 2010 at 12:06 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello,
>
> On 03/19/2010 06:02 PM, graff.yang@xxxxxxxxx wrote:
>>
>> From: Graff Yang <graff.yang@xxxxxxxxx>
>>
>> This patch supports dynamic alloc/free of percpu areas for nommu arches
>> like blackfin.
>> It allocates contiguous pages in the function pcpu_get_vm_areas() instead
>> of getting non-contiguous pages and then vmap()ing them as on mmu arches.
>> As we cannot get the real page structure through vmalloc_to_page(), it
>> also modifies the nommu versions of vmalloc_to_page()/vmalloc_to_pfn().
>>
>> Signed-off-by: Graff Yang <graff.yang@xxxxxxxxx>
>
> Heh heh... I've never imagined there would be an SMP architecture w/o
> mmu. That's pretty interesting. I mean, there is real estate for
> multiple cores but not for mmu?

Yes, we ported SMP to the Blackfin dual-core processor BF561.

>
>> diff --git a/mm/nommu.c b/mm/nommu.c
>> index 605ace8..98bbdf4 100644
>> --- a/mm/nommu.c
>> +++ b/mm/nommu.c
>> @@ -255,13 +255,15 @@ EXPORT_SYMBOL(vmalloc_user);
>>
>>  struct page *vmalloc_to_page(const void *addr)
>>  {
>> -       return virt_to_page(addr);
>> +       return (struct page *)(virt_to_page(addr)->index) ? :
>> +                       virt_to_page(addr);
>
> Nothing major but isn't it more usual to write ?: without the
> intervening space?
>
>> +#ifdef CONFIG_SMP
>> +int map_kernel_range_noflush(unsigned long addr, unsigned long size,
>> +                            pgprot_t prot, struct page **pages)
>> +{
>
> More nitpicks.
>
>> +       int i, nr_page = size>> PAGE_SHIFT;
>
>               nr_pages = size >> PAGE_SHIFT;
>
>> +       for (i = 0; i< nr_page; i++, addr += PAGE_SIZE)
>
>                   i < nr_pages
>
>> +               virt_to_page(addr)->index = (pgoff_t)pages[i];
>> +       return size>> PAGE_SHIFT;
>
> Â Â Â Âreturn size >> PAGE_SHIFT;
>
> I think checkpatch would whine about these too.

OK.

>
>> +void unmap_kernel_range_noflush(unsigned long addr, unsigned long size)
>> +{
>> +       int i, nr_page = size>> PAGE_SHIFT;
>> +       for (i = 0; i< nr_page; i++, addr += PAGE_SIZE)
>> +               virt_to_page(addr)->index = 0;
>> +}
>> +
>> +struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>> +                                    const size_t *sizes, int nr_vms,
>> +                                    size_t align, gfp_t gfp_mask)
>
> Hmmm... in general, one of the reasons the percpu allocator is
> complex is to avoid contiguous allocations while also avoiding
> additional TLB / NUMA overhead on machines with rather complex memory
> configurations (which is pretty common these days). If the memory has
> to be allocated contiguously anyway, it probably would be much simpler
> to hook in at a higher level and simply allocate each chunk
> contiguously. I'll look into it.

I understand the complexity of the percpu allocation code. On a nommu
arch, we have to allocate a block of memory in one go to ensure it is
contiguous, and in my implementation many pages are wasted.
It would be better if the percpu allocation code provided some hooks for us.
Thanks for your feedback.

--
-Graff