Re: [BUG] Regression introduced with "block: split bios to max possible length"

From: Jens Axboe
Date: Fri Jan 22 2016 - 12:15:35 EST


On 01/22/2016 07:56 AM, Keith Busch wrote:
> On Thu, Jan 21, 2016 at 08:15:37PM -0800, Linus Torvalds wrote:
>> For the case of nvme, for example, I think the max sector number is so
>> high that you'll never hit that anyway, and you'll only ever hit the
>> chunk limit. No?

> The device's max transfer and chunk size are not very large, both fixed
> at 128KB. We can lose ~70% of potential throughput when IO isn't aligned,
> and end users reported this when the block layer stopped splitting on
> alignment for the NVMe drive.
>
> So it's a big deal for this h/w, but now I feel awkward defending a
> device-specific feature for the generic block layer.
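
[Illustration of the chunk-boundary splitting discussed above. This is a
minimal userspace sketch, not the kernel's actual code: the function names
max_sectors_at() and count_pieces() are hypothetical, and the 128KB chunk
size is taken from Keith's description, expressed here as 256 sectors of
512 bytes. It assumes the chunk size is a power of two.]

```c
#include <assert.h>

/* Assumed for illustration: a 128KB chunk in 512-byte sectors. */
#define CHUNK_SECTORS 256u

/*
 * Number of sectors that fit between 'offset' and the next chunk
 * boundary. 'chunk_sectors' must be a non-zero power of two.
 */
static unsigned int max_sectors_at(unsigned long long offset,
                                   unsigned int chunk_sectors)
{
    assert(chunk_sectors && (chunk_sectors & (chunk_sectors - 1)) == 0);
    return chunk_sectors - (unsigned int)(offset & (chunk_sectors - 1));
}

/*
 * Count the pieces an I/O of 'nr_sectors' starting at sector 'start'
 * is cut into when no piece may cross a chunk boundary.
 */
static unsigned int count_pieces(unsigned long long start,
                                 unsigned int nr_sectors,
                                 unsigned int chunk_sectors)
{
    unsigned int pieces = 0;

    while (nr_sectors) {
        unsigned int n = max_sectors_at(start, chunk_sectors);

        if (n > nr_sectors)
            n = nr_sectors;
        start += n;
        nr_sectors -= n;
        pieces++;
    }
    return pieces;
}
```

[Under these assumptions, a 128KB I/O starting at sector 0 stays in one
piece, while the same I/O starting at sector 8 is cut into a 248-sector
piece and an 8-sector piece so that neither crosses the boundary at
sector 256.]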

Honestly, the splitting code is what is a piece of crap; we never should have gone down that route. Hopefully we can get rid of it soon. In the meantime, this does need to work. It's an odd hw construct (basically two devices bolted together), but it's not really an esoteric thing to support.

> Anyway, the patch was developed with incorrect assumptions. I'd still
> like to try again after reconciling the queue limit constraints, but
> I defer to Jens for the near term.

Instead of scrambling for -rc1, I'd suggest we just revert again and ensure what we merge for -rc2 is clean and passes the test cases.

--
Jens Axboe