Re: TTM page pool allocator

From: Dave Airlie
Date: Thu Jul 09 2009 - 02:06:34 EST


2009/6/30 Thomas Hellström <thomas@xxxxxxxxxxxx>:
> Jerome Glisse wrote:
>>
>> On Fri, 2009-06-26 at 10:00 +1000, Dave Airlie wrote:
>>
>>>
>>> On Thu, Jun 25, 2009 at 10:01 PM, Jerome Glisse<glisse@xxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> Thomas, I attach a reworked page pool allocator based on Dave's work;
>>>> this one should be OK with TTM cache status tracking. It definitely
>>>> helps on AGP systems; now the bottleneck is in Mesa's vertex DMA
>>>> allocation.
>>>>
>>>>
>>>
>>> My original version kept a list of WB pages as well. This proved to be
>>> quite a useful optimisation on my test systems when I implemented it;
>>> without it I was spending ~20% of my CPU time getting free pages,
>>> granted I always used WB pages on PCIE/IGP systems.
>>>
>>> Another optimisation I made at the time was around the populate call
>>> (not sure if this is what still happens):
>>>
>>> Allocate a 64K local BO for DMA object.
>>> Write into the first 5 pages from userspace - get WB pages.
>>> Bind to GART, swap those 5 pages to WC + flush.
>>> Then populate the rest with WC pages from the list.
>>>
>>> Granted I think allocating WC in the first place from the pool might
>>> work just as well since most of the DMA buffers are write only.
>>>
>>> Dave.
>>> --
>>>
>>
>> Attached is a new version of the patch, which integrates the changes
>> discussed.
>>
>> Cheers,
>> Jerome
>>
>
> Hi, Jerome!
> Still some outstanding things:
>
> 1) The AGP protection fixes compilation errors when AGP is not enabled, but
> what about architectures that need the map_page_into_agp() semantics for TTM
> even when AGP is not enabled? At the very least TTM should be disabled on
> those architectures. The best option would be to make those calls non-agp
> specific.
>
> 2) Why is the page refcount upped with get_page() after an alloc_page()?
>
> 3) It seems like pages are cache-transitioned one-by-one when freed. Again,
> this is a global TLB flush per page. Can't we free a large chunk of pages at
> once?
>
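
To make point 3 concrete, here is a rough user-space model of the cost
difference, not code from the patch. Every name in it (model_page,
model_set_pages_wb, free_one_by_one, free_batched) is hypothetical, and it
assumes that each cache-attribute change costs one global TLB flush no
matter how many pages it covers:

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
enum cache_state { CACHE_WB, CACHE_WC };

struct model_page { enum cache_state state; };

/* Counts simulated global TLB flushes. */
static unsigned long global_flushes;

/* Model of a cache-attribute change back to WB. Assumption: one global
 * flush per call, regardless of how many pages the call covers. */
static void model_set_pages_wb(struct model_page **pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pages[i]->state = CACHE_WB;
    if (n)
        global_flushes++;
}

/* Per-page free path: one transition, and so one flush, per page. */
static void free_one_by_one(struct model_page **pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_set_pages_wb(&pages[i], 1);
}

/* Batched free path: transition up to `batch` pages per call. */
static void free_batched(struct model_page **pages, size_t n, size_t batch)
{
    if (batch == 0)
        batch = 1; /* avoid an endless loop on a zero batch size */
    for (size_t i = 0; i < n; i += batch) {
        size_t chunk = (n - i < batch) ? n - i : batch;
        model_set_pages_wb(&pages[i], chunk);
    }
}
```

In this model, freeing 10 pages one by one costs 10 flushes, while freeing
them in batches of 4 costs 3.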

Jerome,

have we addressed these?

I'd really like to push this soon, as I'd like to fix up the 32 vs 36
bit dma masks if possible, which relies on us being able to tell the
allocator to use GFP_DMA32 on some hw (32-bit PAE mainly with a PCI card).
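
Roughly the decision I have in mind, as a user-space sketch only. The
MODEL_* flag values and both helper names are made up for illustration
(the real GFP_* flags are defined differently), and it assumes the device
mask is at least 32 bits wide:

```c
#include <stdint.h>

/* Hypothetical flag values for this model only; the kernel's real
 * GFP_* constants are defined differently. */
#define MODEL_GFP_HIGHUSER 0x1u
#define MODEL_GFP_DMA32    0x2u

/* Build an n-bit DMA mask, e.g. 32 or 36 bits. */
static uint64_t model_dma_bit_mask(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/* Pick allocation flags for the pool. If RAM extends beyond what the
 * device can address, restrict allocations to the low 4GB; otherwise
 * any page will do. Assumes dev_dma_mask covers at least 32 bits. */
static unsigned model_pool_gfp(uint64_t dev_dma_mask, uint64_t highest_ram_addr)
{
    if (highest_ram_addr > dev_dma_mask)
        return MODEL_GFP_DMA32;
    return MODEL_GFP_HIGHUSER;
}
```

So a 32-bit PCI card in a PAE box with 8GB of RAM would get the DMA32
restriction, while a device with a 36-bit mask in the same box would not.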

Dave.