Re: [RFC] block layer support for DMA IOMMU bypass mode

From: Grant Grundler (
Date: Tue Jul 01 2003 - 14:19:41 EST

On Tue, Jul 01, 2003 at 11:46:12AM -0500, James Bottomley wrote:
> However, a fair number come with
> the caveat that actually using the IOMMU is expensive.

IOMMU mapping is slightly more expensive than direct physical on HP boxes.
(yes davem, you've told me how wonderful sparc IOMMU is ;^)
But obviously alot less expensive than bounce buffers.

> The Problem:
> At the moment, the block layer assumes segments may be virtually
> mergeable (i.e. two phsically discondiguous pages may be treated as a
> single SG entity for DMA because the IOMMU will patch up the
> discontinuity) if an IOMMU is present in the system.

The symptom is drivers which have limits on DMA entries will return
errors (or crash) when the IOMMU code doesn't actually merge as much
as the BIO code expected.

Specifically, sym53c8xx_2 only takes 96 DMA entries per IO and davidm
hit that pretty easily on ia64.
MPT/Fusion (LSI u32) doesn't seem to have a limit.
IDE limit is PAGE_SIZE/8 (or 16k/8=2k for ia64).
I haven't checked other drivers.

> +/* Is the queue dma_mask eligible to be bypassed */
> +#define __BIO_CAN_BYPASS(q) \

Like Andi, I had suggested a callback into IOMMU code here.
But I'm pretty sure james proposal will work for ia64 and parisc.

Ideally, I don't like to see two seperate chunks of code performing
the "let's see what I can merge now" loops. Instead, BIO could merge
"intra-page" segments and call the IOMMU code to "merge" remaining
"inter-page" segments. IOMMU code needs to know how many physical entries
are allowed (when to stop processing) and could return the number of
sg list entries it was able to merge.

thanks james!
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Mon Jul 07 2003 - 22:00:14 EST