Re: [PATCH] Avoiding fragmentation through different allocator

From: Marcelo Tosatti
Date: Mon Jan 24 2005 - 14:36:08 EST


On Mon, Jan 24, 2005 at 10:44:12AM -0600, James Bottomley wrote:
> On Mon, 2005-01-24 at 10:29 -0200, Marcelo Tosatti wrote:
> > Since the pages which compose IO operations are most likely sparse (not physically contiguous),
> > the driver+device has to perform scatter-gather IO on the pages.
> >
> > The idea is that if we can have larger memory blocks scatter-gather IO can use less SG list
> > elements (decreased CPU overhead, decreased device overhead, faster).
> >
> > Best scenario is where only one sg element is required (ie one huge physically contiguous block).
> >
> > Old devices/unprepared drivers which are not able to perform SG/IO
> > suffer with sequential small sized operations.
> >
> > I'm far away from being a SCSI/ATA knowledgeable person, the storage people can
> > help with expertise here.
> >
> > Grant Grundler and James Bottomley have been working on this area, they might want to
> > add some comments to this discussion.
> >
> > It seems HP (Grant et all) has pursued using big pages on IA64 (64K) for this purpose.
>
> Well, the basic advice would be not to worry too much about
> fragmentation from the point of view of I/O devices. They mostly all do
> scatter gather (SG) onboard as an intelligent processing operation and
> they're very good at it.

So is it valid to affirm that on average an operation with one SG element pointing to a 1MB
region is similar in speed to an operation with 16 SG elements each pointing to a 64K
region due to the efficient onboard SG processing?

> No one has ever really measured an effect we can say "This is due to the
> card's SG engine". So, the rule we tend to follow is that if SG element
> reduction comes for free, we take it. The issue that actually causes
> problems isn't the reduction in processing overhead, it's that the
> device's SG list is usually finite in size and so it's worth conserving
> if we can; however it's mostly not worth conserving at the expense of
> processor cycles.
>
> The bottom line is that the I/O (block) subsystem is very efficient at
> coalescing (both in block space and in physical memory space) and we've
> got it to the point where it's about as efficient as it can be. If
> you're going to give us better physical contiguity properties, we'll
> take them, but if you spend extra cycles doing it, the chances are
> you'll slow down the I/O throughput path.

OK! thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/