Re: [PATCH] fs: System memory leak when running HTX with T10 DIF enabled

From: Jens Axboe
Date: Wed Jun 28 2017 - 15:00:20 EST


On 06/28/2017 12:52 PM, Christoph Hellwig wrote:
> On Wed, Jun 28, 2017 at 12:44:00PM -0600, Jens Axboe wrote:
>> On 06/28/2017 12:38 PM, Christoph Hellwig wrote:
>>> On Wed, Jun 28, 2017 at 12:34:15PM -0600, Jens Axboe wrote:
>>>> That's what I sent out.
>>>
>>> Where? Didn't see that anywhere..
>>
>> Looks like you weren't CC'ed on the original thread. About an hour ago.
>>
>>>> Here it is again. We should get this into 4.12,
>>>> so would be great with a review or two.
>>>
>>> Can we rename __bio_free to bio_uninit and add a comment to bio_init
>>> that it must be paried with bio_uninit?
>>
>> Let's keep it small for 4.12. We can do a cleanup on top of this for
>> 4.13.
>
> The rename is two additional lines for the patch, it's not going to
> make a difference..

Sure, why not...

>>> Except for that this looks fine, although there are a lot more callers
>>> that should get this treatment..
>>
>> Should only be an issue for on-stack bio's, since we don't go through
>> the put/free path. Did a quick grep, looks like this is one of 3. One
>> is floppy, which probably neither has DIF or uses blk-throttle. Then
>> there's one in dm-bufio, didn't look too closely at that. Last one is
>> this one.
>
> Well, it's really all callers but bio_alloc_bioset itself that
> will need this handling, as only bios that come from bio_alloc_bioset
> will be freed through bio_free. Most of them probably don't
> support DIF, but they'll also miss the bio_disassociate_task call
> this way, and will leak I/O context and css references if block
> cgroup support is enabled.

I guess local allocs too will be affected.

I'm baffled we've had this issue for so long, and nobody's seen it.
I knew DIF was never used (basically), but blk-throttle must have some
users at least.



diff --git a/block/bio.c b/block/bio.c
index 888e7801c638..f877d5e6d17f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -240,7 +240,7 @@ struct bio_vec *bvec_alloc(gfp_t gfp_mask, int nr, unsigned long *idx,
return bvl;
}

-static void __bio_free(struct bio *bio)
+void bio_uninit(struct bio *bio)
{
bio_disassociate_task(bio);

@@ -253,7 +253,7 @@ static void bio_free(struct bio *bio)
struct bio_set *bs = bio->bi_pool;
void *p;

- __bio_free(bio);
+ bio_uninit(bio);

if (bs) {
bvec_free(bs->bvec_pool, bio->bi_io_vec, BVEC_POOL_IDX(bio));
@@ -271,6 +271,11 @@ static void bio_free(struct bio *bio)
}
}

+/*
+ * Users of this function have their own bio allocation. Subsequently,
+ * they must remember to pair any call to bio_init() with bio_uninit()
+ * when IO has completed, or when the bio is released.
+ */
void bio_init(struct bio *bio, struct bio_vec *table,
unsigned short max_vecs)
{
@@ -297,7 +302,7 @@ void bio_reset(struct bio *bio)
{
unsigned long flags = bio->bi_flags & (~0UL << BIO_RESET_BITS);

- __bio_free(bio);
+ bio_uninit(bio);

memset(bio, 0, BIO_RESET_BYTES);
bio->bi_flags = flags;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 519599dddd36..0a7404ef9335 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -263,7 +263,10 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
kfree(vecs);

if (unlikely(bio.bi_error))
- return bio.bi_error;
+ ret = bio.bi_error;
+
+ bio_uninit(&bio);
+
return ret;
}

diff --git a/include/linux/bio.h b/include/linux/bio.h
index d1b04b0e99cf..a7e29fa0981f 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -426,6 +426,7 @@ extern void bio_advance(struct bio *, unsigned);

extern void bio_init(struct bio *bio, struct bio_vec *table,
unsigned short max_vecs);
+extern void bio_uninit(struct bio *);
extern void bio_reset(struct bio *);
void bio_chain(struct bio *, struct bio *);


--
Jens Axboe