Re: [PATCH v7 0/3] mm/slub: extend redzone check for kmalloc objects

From: Feng Tang
Date: Mon Nov 28 2022 - 00:46:55 EST


On Wed, Nov 23, 2022 at 10:48:50AM +0100, Vlastimil Babka wrote:
> On 11/21/22 07:38, Feng Tang wrote:
> > On Fri, Nov 11, 2022 at 04:29:43PM +0800, Tang, Feng wrote:
> >> On Fri, Nov 11, 2022 at 04:16:32PM +0800, Vlastimil Babka wrote:
> >> > > for (shift = 3; shift <= 12; shift++) {
> >> > > size = 1 << shift;
> >> > > buf = kmalloc(size + 4, GFP_KERNEL);
> >> > > /* We have 96, 196 kmalloc size, which is not power of 2 */
> >> > > if (size == 64 || size == 128)
> >> > > oob_size = 16;
> >> > > else
> >> > > oob_size = size - 4;
> >> > > memset(buf + size + 4, 0xee, oob_size);
> >> > > kfree(buf);
> >> > > }
> >> >
> >> > Sounds like a new slub_kunit test would be useful? :) doesn't need to be
> >> > that exhaustive wrt all sizes, we could just pick one and check that a write
> >> > beyond requested kmalloc size is detected?
> >>
> >> Just git-grepped out slub_kunit.c :), will try to add a case to it.
> >> I'll also check if the case will also be caught by other sanitizer
> >> tools like kasan/kfence etc.
> >
> > Just checked, kasan has already has API to disable kasan check
> > temporarily, and I did see sometime kfence can chime in (4 out of 178
> > runs) so we need skip kfenced address.
> >
> > Here is the draft patch, thanks!
> >
> > From 45bf8d0072e532f43063dbda44c6bb3adcc388b6 Mon Sep 17 00:00:00 2001
> > From: Feng Tang <feng.tang@xxxxxxxxx>
> > Date: Mon, 21 Nov 2022 13:17:11 +0800
> > Subject: [PATCH] mm/slub, kunit: Add a case for kmalloc redzone functionality
> >
> > kmalloc redzone check for slub has been merged, and it's better to add
> > a kunit case for it, which is inspired by a real-world case as described
> > in commit 120ee599b5bf ("staging: octeon-usb: prevent memory corruption"):
> >
> > "
> > octeon-hcd will crash the kernel when SLOB is used. This usually happens
> > after the 18-byte control transfer when a device descriptor is read.
> > The DMA engine is always transfering full 32-bit words and if the
> > transfer is shorter, some random garbage appears after the buffer.
> > The problem is not visible with SLUB since it rounds up the allocations
> > to word boundary, and the extra bytes will go undetected.
> > "
> > Suggested-by: Vlastimil Babka <vbabka@xxxxxxx>
> > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
> > ---
> > lib/slub_kunit.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> > mm/slab.h | 15 +++++++++++++++
> > mm/slub.c | 4 ++--
> > 3 files changed, 59 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
> > index 7a0564d7cb7a..0653eed19bff 100644
> > --- a/lib/slub_kunit.c
> > +++ b/lib/slub_kunit.c
> > @@ -120,6 +120,47 @@ static void test_clobber_redzone_free(struct kunit *test)
> > kmem_cache_destroy(s);
> > }
> >
> > +
> > +/*
> > + * This case is simulating a real world case, that a device driver
> > + * requests 18 bytes buffer, but the device HW has obligation to
> > + * operate on 32 bits granularity, so it may actually read or write
> > + * 20 bytes to the buffer, and possibly pollute 2 extra bytes after
> > + * the requested space.
> > + */
> > +static void test_kmalloc_redzone_access(struct kunit *test)
> > +{
> > + u8 *p;
> > +
> > + if (!is_slub_debug_flags_enabled(SLAB_STORE_USER | SLAB_RED_ZONE))
> > + kunit_skip(test, "Test required SLAB_STORE_USER & SLAB_RED_ZONE flags on");
>
> Hrmm, this is not great. I didn't realize that we're testing kmalloc()
> specific code, so we can't simply create test-specific caches as in the
> other kunit tests.
> What if we did create a fake kmalloc cache with the necessary flags and used
> it with kmalloc_trace() instead of kmalloc()? We would be bypassing the
> kmalloc() inline layer so theoretically orig_size handling bugs could be
> introduced there that the test wouldn't catch, but I think that's rather
> unlikely. Importantly we would still be stressing the orig_size saving and
> the adjusted redzone check using this info.

Nice trick! Will go this way.

> > + p = kmalloc(18, GFP_KERNEL);
> > +
> > +#ifdef CONFIG_KFENCE
> > + {
> > + int max_retry = 10;
> > +
> > + while (is_kfence_address(p) && max_retry--) {
> > + kfree(p);
> > + p = kmalloc(18, GFP_KERNEL);
> > + }
> > +
> > + if (!max_retry)
> > + kunit_skip(test, "Fail to get non-kfenced memory");
> > + }
> > +#endif
>
> With the test-specific cache we could also pass SLAB_SKIP_KFENCE there to
> handle this.

Yep, the handling will be much simpler, thanks

>
> BTW, don't all slub kunit test need to do that in fact?

Yes, I think they also need.

With default kfence setting test, kence address wasn't hit in
250 times of boot test. And by changing CONFIG_KFENCE_NUM_OBJECTS
from 255 to 16383, and CONFIG_KFENCE_SAMPLE_INTERVAL from 100
to 5, the kfence allocation did hit once in about 300 tims of
boot test.

Will add the flag bit for all kmem_cache creation.

Thanks,
Feng

> Thanks,
> Vlastimil