Re: IOMMU and scatterlist limits

From: Pierre Ossman
Date: Tue Dec 20 2005 - 06:36:24 EST


Tejun Heo wrote:
> Pierre Ossman wrote:
>> Revisiting a dear old thread. :)
>>
>> After some initial tests, some more questions popped up. See below.
>>
>> Jens Axboe wrote:
>>
>>> On Thu, Nov 17 2005, Pierre Ossman wrote:
>>>
>>>
>>>> Since there is no guarantee this will be mapped down to one segment
>>>> (that the hardware can accept), is it expected that the driver
>>>> iterates
>>>> over the entire list or can I mark only the first segment as completed
>>>> and wait for the request to be reissued? (this is a MMC driver, which
>>>> behaves like the block layer)
>>>>
>>>
>>> Ah MMC, that explains a few things :-)
>>>
>>> It's quite legal (and possible) to partially handle a given request,
>>> you
>>> are not obliged to handle a request as a single unit. See how other
>>> block drivers have an end request handling function ala:
>>>
>>>
>>
>>
>> After testing this it seems the block layer never gives me more than
>> max_hw_segs segments. Is it being clever because I'm compiling for a
>> system without an IOMMU?
>>
>> The hardware should (haven't properly tested this) be able to get new
>> DMA addresses during a transfer. In essence scatter gather with some CPU
>> support. Since I avoid MMC overhead this should give a nice performance
>> boost. But this relies on the block layer giving me more than one
>> segment. Do I need to lie in max_hw_segs to achieve this?
>>
>
> Hi, Pierre.
>
> max_phys_segments: the maximum number of segments in a request
> *before* DMA mapping
>
> max_hw_segments: the maximum number of segments in a request
> *after* DMA mapping (ie. after IOMMU merging)
>
> Those maximum numbers are for block layer. Block layer must not
> exceed above limits when it passes a request downward. As long as all
> entries in sg are processed, block layer doesn't care whether sg
> iteration is performed by the driver or hardware.
>
> So, if you're gonna perform sg by iterating in the driver, what
> numbers to report for max_phys_segments and max_hw_segments is
> entirely upto how many entries the driver can handle.
>
> Just report some nice number (64 or 128?) for both. Don't forget that
> the number of sg entries can be decreased after DMA-mapping on
> machines with IOMMU.
>
> IOW, the part which performs sg iteration gets to determine above
> limits. In your case, the driver is reponsible for both iterations
> (pre and post DMA mapping), so all the limits are upto the driver.
>
>

I'm still a bit confused why the block layer needs to know the maximum
number of hw segments. Different hardware might be connected to
different IOMMU:s, so only the driver will now how much the number can
be reduced. So the block layer should only care about not going above
max_phys_segments, since that's what the driver has room for.

What is the scenario that requires both?

Rgds
Pierre

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/