Re: [PATCH] block: fix q->max_segment_size checking in blk_recalc_rq_segments about VMERGE

From: Mikulas Patocka
Date: Thu Jul 24 2008 - 17:49:45 EST


On Thu, 24 Jul 2008, James Bottomley wrote:

> On Thu, 2008-07-24 at 12:34 -0400, Mikulas Patocka wrote:
> > On Thu, 24 Jul 2008, James Bottomley wrote:
> >
> > > On Thu, 2008-07-24 at 11:07 -0400, Mikulas Patocka wrote:
> > > > So try to #define BIO_VMERGE_BOUNDARY 0 for Pa-Risc and tell us what
> > > > performance degradation do you see (and what driver do you use and what is
> > > > the I/O pattern).
> > > >
> > > > If you show something specific, we can consider that --- but you haven't
> > > > yet told us anything, except generic talk.
> > >
> > > You keep ignoring inconvenient facts. For about the third time:
> > >
> > > I run a test bed for sg_tables (large chaining of requests). This runs
> > > on parisc using virtual merging (has to because the final physical table
> > > size can't go over the sg list of the SCSI card). If I turn off virtual
> > > merging I can no longer test sg_tables in vanilla kernels.
> > >
> > > James
> >
> > What sg_tables test do you mean? What does the test do? Why couldn't you
> > run the test if BIO_VMERGE_BOUNDARY is 0? Normal I/O obviously can work
> > with BIO_VMERGE_BOUNDARY 0, the kernel will just send more smaller
>
> Look, if you don't really understand what I'm doing, it's not really my
> job to educate you. The sg_table discussions are on marc.info, mainly
> on the SCSI lists; just look for 'sg chaining' in the header (need to
> use google site ... marc's search is bad).
>
> You can complain if the code is impacting you ... but I believe I've
> optimised it so it isn't. Your basic problem amounts to you not liking
> me doing something that has no impact on you ... I'm afraid that's what
> freedom leads to (shocking, I know).
>
> James

Chaining of sg_tables is used for drivers with big sg tables --- and
vmerge counting is used for drivers with small sg tables. So what do they
have in common?

To summarize what I mean:

* in blk-merge.c, 85 lines, that is 16% of the file, are devoted to
counting hw_segments

* it is only used on two architectures, one already outdated (alpha), the
other being discontinued (pa-risc). On all the other architectures,
hw_segments == phys_segments

* it is prone to bugs and hard to maintain, because the same value must be
calculated in blk-merge.c and in the architectural iommu functions --- if
the values differ, you create a request that is too long, corrupt kernel
memory and crash (this happened on sparc64). Anyone changing blk-merge in
the future will risk breaking something on the architectures that use
BIO_VMERGE_BOUNDARY --- and because these architectures are so rare, the
bug will go unnoticed for a long time --- as in the case of sparc64.

* you keep saying that this code is important for performance without
showing any proof that it really is (temporarily disable hw_segments
accounting by defining BIO_VMERGE_BOUNDARY 0 and get the numbers).
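
To make the two counts concrete, here is a toy userspace sketch. It is not
the blk-merge.c code: struct seg and all four function names are
hypothetical, q->max_segment_size is ignored, and the alignment test is only
modelled on the usual "end of one segment and start of the next are both
aligned to BIO_VMERGE_BOUNDARY" rule. A boundary of 0 is assumed to mean
"virtual merging disabled".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one bio_vec: physical address and length. */
struct seg {
    uint64_t addr;
    uint32_t len;
};

/* Two segments merge physically if the second starts where the first ends. */
static int phys_mergeable(const struct seg *a, const struct seg *b)
{
    return a->addr + a->len == b->addr;
}

/*
 * Assumed model of virtual merging: an iommu can join two segments if the
 * end of the first and the start of the second are both aligned to the
 * boundary. A boundary of 0 disables virtual merging in this sketch.
 */
static int virt_mergeable(const struct seg *a, const struct seg *b,
                          uint64_t boundary)
{
    if (boundary == 0)
        return 0;
    return (((a->addr + a->len) | b->addr) & (boundary - 1)) == 0;
}

/* Number of segments after physical merging only. */
static size_t count_phys_segments(const struct seg *s, size_t n)
{
    size_t i, count = n ? 1 : 0;

    for (i = 1; i < n; i++)
        if (!phys_mergeable(&s[i - 1], &s[i]))
            count++;
    return count;
}

/* Number of segments after physical and virtual merging. */
static size_t count_hw_segments(const struct seg *s, size_t n,
                                uint64_t boundary)
{
    size_t i, count = n ? 1 : 0;

    for (i = 1; i < n; i++)
        if (!phys_mergeable(&s[i - 1], &s[i]) &&
            !virt_mergeable(&s[i - 1], &s[i], boundary))
            count++;
    return count;
}
```

With a boundary of 0 the two functions always return the same number, which
is the hw_segments == phys_segments case. With a non-zero boundary, the
result of count_hw_segments is only safe to rely on if the iommu code
applies exactly the same rule when it builds the real sg list.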

Mikulas