Re: [PATCH BUGFIX V2 0/3] null_blk: fix throughput losses and hangs

From: Jens Axboe
Date: Tue Dec 01 2015 - 12:53:07 EST


On 12/01/2015 03:48 AM, Paolo Valente wrote:
Hi,
here is an updated version of the patchset, differing from the
previous version only in that it reinstates the missing extra check
pointed out in [2]. For your convenience, the content of the cover
letter for the previous version follows.

While doing some tests with the null_blk device driver, we bumped into
two problems: first, unjustified and in some cases high throughput
losses; second, actual hangs. These problems seem to be the
consequence of the combination of three causes, and this patchset
introduces a fix for each of these causes. In particular, changes
address:
. an apparent flaw in the logic with which delayed completions are
implemented: this flaw causes, with unlucky but non-pathological
workloads, actual request-completion delays to become arbitrarily
larger than the configured delay;
. the missing restart of the device queue on the completion of a request in
single-queue non-delayed mode;
. the overflow of the request-delay parameter, when extremely high values
are used (e.g., to spot bugs).

To avoid possible confusion, we stress that these fixes *do not* have
anything to do with the problems highlighted in [1] (tests of the
multiqueue xen-blkfront and xen-blkback modules with null_blk).

You can find more details in the patch descriptions.

Thanks Paolo, added.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/