Re: [patch] 32-bit dma memory zone

From: Steffen Persvold (sp@scali.no)
Date: Fri Jun 08 2001 - 03:58:47 EST


"David S. Miller" wrote:
>
> Richard Henderson writes:
> > On most alphas we use only one zone -- ZONE_DMA. The iommu makes it
> > possible to do 32-bit pci to the entire memory space.
> >
> > For those alphas without an iommu, we also set up ZONE_NORMAL.
>
> And on sparc64 since all machines have an iommu, we use just ZONE_DMA
> for everything.
>

And on IA64 they use both ZONE_NORMAL and ZONE_DMA. ZONE_DMA is up to 4GB.

This setup actually makes a PCI device driver I'm writing kind of broken. It allocates
buffers (with get_free_page) for streaming DMA and pass them on to pci_map_sg(). These
buffers can be really large because this is a shared memory adapter where you basically
make large portions (>100MByte) of your memory available to other machines over the PCI
bus. Unfortunately this adapter is not able to do DAC (64bit addressing) so I have to be
sure that the physical memory is within a 32bit range. Bounce buffers is really out of the
question because it will kill my performance. On Alpha (atleast for the Tsunami and
Nautilus models I've looked at) this guaranteed by using either the direct mapped windows
(which limits you to 2GB of physical memory) or the IOMMU scatter-gather windows. On i386
I can't use GFP_DMA because this will only give me memory below 16MByte and that is not
enough for these buffers, but just using the ZONE_NORMAL (no special GFP flag to
get_free_page) memory is fine (BTW, I have not yet understood how a 32bit machine can
access more than 4GB physical memory..). The problem child here is IA64. These machines
may or may not have an IOMMU. If the machine doesn't have an IOMMU (like the 460GX
chipset) and you have a lot of memory (like 2GB) you might get a physical address above
the 4GB boundary which is no good for my 32bit device. The IA64 code fixes this by using
something called Sofwtare I/O TLB, which copies your data to a memory below the 4GB
boundary when you do pci_map_xxx (if direction is DMA_TO_DEVICE), and copies it back when
you do pci_unmap_ (if direction is DMA_FROM_DEVICE). I guess this is what you call bounce
buffers, but since this I/O TLB area is so small by default (4MByte I think) it is no good
either because I will soon run out of Software IO TLB Entries resulting in a kernel panic.
My solution here was to use the GFP_DMA flag to get_free_page to ensure that the page was
below the 4GB boundary.

Now, because of this I need a #ifdef __ia64__ (or maybe I could use #ifndef __i386__ ?) in
my driver but I would really not like to have that there. My suggestion is therefore to
have a ZONE_32BIT (and a corresponding GFP_32BIT flag) to have a common way of ensuring
that the memory you get is guaranteed to be below the 4GB boundary. Actually this is
already mentioned in the IA64 swiotlb_alloc_consistent() code :

if (!hwdev || hwdev->dma_mask <= 0xffffffff)
        gfp |= GFP_DMA; /* XXX fix me: should change this to GFP_32BIT or ZONE_32BIT */
ret = (void *)__get_free_pages(gfp, get_order(size));

Regards,

-- 
  Steffen Persvold               Systems Engineer
  Email : mailto:sp@scali.no     Scali AS (http://www.scali.com)
  Tlf   : (+47) 22 62 89 50      Olaf Helsets vei 6
  Fax   : (+47) 22 62 89 51      N-0621 Oslo, Norway
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jun 15 2001 - 21:00:09 EST