Re: [PATCH 2.6.34] ehci-hcd: add option to enable 64-bit DMAsupport

From: Greg KH
Date: Fri Feb 19 2010 - 23:31:37 EST


On Fri, Feb 19, 2010 at 07:30:30PM -0600, Robert Hancock wrote:
> On Thu, Feb 18, 2010 at 9:54 PM, Greg KH <greg@xxxxxxxxx> wrote:
> > On Thu, Feb 18, 2010 at 09:46:29PM -0600, Robert Hancock wrote:
> >> On Thu, Feb 18, 2010 at 6:47 PM, Greg KH <greg@xxxxxxxxx> wrote:
> >> >> > So you did not measure it?
> >> >> >
> >> >> > Hm, I guess this change must not be necessary :)
> >> >>
> >> >> I'll try and run some tests and see what I can quantify. However, I
> >> >> only have 4GB of RAM on my machine (with a 1GB memory hole) and so a
> >> >> random memory allocation only has a 25% chance of ending up in the
> >> >> area where it would make a difference, so it may take a bit of doing.
> >> >
> >> > Without any good justification, including real tests being run, I can't
> >> > take this patch, the risk is just too high.
> >>
> >> Again, this particular patch has essentially zero risk for anyone that
> >> doesn't choose to experiment with the option. One can hardly say it
> >> presents much of a long-term maintenance burden either..
> >
> > Then don't give them the option, as it doesn't seem needed :)
> >
> > Again, it is tough to remove options once you add them, so not adding
> > them at all is the best thing to do.
>
> I don't know why you would remove the option. Even if you someday
> changed the default to 1, it would likely be a still idea to keep it
> around for debugging purposes at least.
>
> If you're complaining about options, ehci-hcd already has some which
> are quite a bit more nebulous in usefulness than this one..

Yeah, and adding another one, for no known benifit, and a known
detriment for some machines, is not a good idea.

> >> > And really, for USB 2.0 speeds, I doubt you are going to even notice
> >> > this kind of overhead, it's in the noise. ?Especially given that almost
> >> > always the limiting factor is the device itself, not the host.
> >>
> >> Well, I do have some results. This is from running this "dd
> >> if=/dev/sdg of=/dev/null bs=3800M iflag=direct" against an OCZ Rally2
> >> USB flash drive, which gets about 30 MB/sec on read, with CPU-burning
> >> tasks on all cores in the background. (The huge block size and
> >> iflag=direct is to try to force more of the IO to happen to memory
> >> above the 4GB mark.) With that workload, swiotlb_bounce shows up as
> >> between 1.5 to 4% of the CPU time spent in the kernel according to
> >> oprofile. Obviously with the 64-bit DMA enabled, that disappears. Of
> >> course, the overall kernel time is only around 2% of the total time,
> >> so that's a pretty small overall percentage.
> >
> > 2% is noise, right? ?So overall you have not really shown any
> > improvement.
>
> What threshold of performance improvement would you rather see? It's
> pretty clear that there will be a performance upside, even if small,
> and no downside.

I thought the downside was that this would break on some machines. And
debugging this is quite difficult.

> I honestly didn't expect as much resistance to a simple hardware
> feature enablement patch, that has zero impact on anyone that doesn't
> opt-in..

Again, people will opt-in, if we want them to or not, and then if
problems happen, will blame us for it. Determining that they did enable
this option, is quite difficult, right?

> >> I'll try some tests later with a faster SATA-to-IDE device that should
> >> stress things a bit more, but a huge difference doesn't seem likely.
> >> One thing that's uncertain is just how much of the IO is needing to be
> >> bounced - an even distribution of the buffer across all of physical
> >> RAM would suggest 25% in this case, but I don't know an easy way to
> >> verify that.
> >>
> >> Aside from speed considerations though, I should point out another
> >> factor: IOMMU/SWIOTLB space is in many cases a limited resource for
> >> all IO in flight at a particular time (SWIOTLB is typically 64MB). The
> >> number of hits when Googling for "Out of IOMMU space" indicates it is
> >> a problem that people do hit from time to time. From that perspective,
> >> anything that prevents unnecessary use of bounce buffers is a good
> >> thing.
> >
> > Sure, but again, for USB 2.0 stuff, we don't have many I/O in flight, as
> > they are pretty slow devices.
>
> Think that's a bit simplistic, if you have multiple devices active at
> once, or multiple controllers (not at all uncommon these days, newer
> Intel chipset machines have two EHCI controllers, with USB 1.x devices
> handled through a logical hub with TT connected to each of them) that
> can chew up more space.

I see machines with 4+ EHCI controllers quite common, and lots have a
number more than that. But I don't see how that matters here, they
aren't doing RAID over USB :)

> > USB 3.0 is different, and that's a different driver, and hopefully that
> > is all addressed already :)
>
> Doesn't look like it, from the version in current -git anyway - I
> don't see any calls to set DMA masks in the XHCI code so it will just
> default to 32-bit. I imagine that'll hurt performance at 4.8 Gbps if
> you've got lots of RAM..

Ok, we can worry about that when we get there, so far there is almost no
USB 3.0 devices out there to test with.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/