Re: [PATCH] kasan: fix the missing underflow in memmove and memcpy with CONFIG_KASAN_GENERIC=y

From: Walter Wu
Date: Mon Oct 07 2019 - 08:03:34 EST


On Mon, 2019-10-07 at 12:51 +0200, Dmitry Vyukov wrote:
> On Mon, Oct 7, 2019 at 11:50 AM Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> >
> > On Mon, 2019-10-07 at 17:28 +0800, Walter Wu wrote:
> > > On Mon, 2019-10-07 at 11:10 +0200, Dmitry Vyukov wrote:
> > > > On Mon, Oct 7, 2019 at 11:03 AM Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, 2019-10-07 at 10:54 +0200, Dmitry Vyukov wrote:
> > > > > > On Mon, Oct 7, 2019 at 10:52 AM Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, 2019-10-07 at 10:24 +0200, Dmitry Vyukov wrote:
> > > > > > > > On Mon, Oct 7, 2019 at 10:18 AM Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> > > > > > > > > This patchset helps produce a KASAN report when the size passed to a
> > > > > > > > > memory operation function is negative. It helps programmers resolve the
> > > > > > > > > undefined behavior issue. Patch 1 is based on Dmitry's review and
> > > > > > > > > suggestion; patch 2 is a test to verify patch 1.
> > > > > > > > >
> > > > > > > > > [1]https://bugzilla.kernel.org/show_bug.cgi?id=199341
> > > > > > > > > [2]https://lore.kernel.org/linux-arm-kernel/20190927034338.15813-1-walter-zh.wu@xxxxxxxxxxxx/
> > > > > > > > >
> > > > > > > > > Walter Wu (2):
> > > > > > > > > kasan: detect invalid size in memory operation function
> > > > > > > > > kasan: add test for invalid size in memmove
> > > > > > > > >
> > > > > > > > > lib/test_kasan.c | 18 ++++++++++++++++++
> > > > > > > > > mm/kasan/common.c | 13 ++++++++-----
> > > > > > > > > mm/kasan/generic.c | 5 +++++
> > > > > > > > > mm/kasan/generic_report.c | 12 ++++++++++++
> > > > > > > > > mm/kasan/tags.c | 5 +++++
> > > > > > > > > mm/kasan/tags_report.c | 12 ++++++++++++
> > > > > > > > > 6 files changed, 60 insertions(+), 5 deletions(-)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > commit 5b3b68660b3d420fd2bd792f2d9fd3ccb8877ef7
> > > > > > > > > Author: Walter-zh Wu <walter-zh.wu@xxxxxxxxxxxx>
> > > > > > > > > Date: Fri Oct 4 18:38:31 2019 +0800
> > > > > > > > >
> > > > > > > > > kasan: detect invalid size in memory operation function
> > > > > > > > >
> > > > > > > > > It is undefined behavior to pass a negative number to
> > > > > > > > > memset()/memcpy()/memmove(), so it needs to be detected by KASAN.
> > > > > > > > >
> > > > > > > > > If size is a negative number, there are two reasons to define it
> > > > > > > > > as the out-of-bounds bug type.
> > > > > > > > > 1) Casting a negative number to size_t yields a large size_t whose
> > > > > > > > > value is larger than ULONG_MAX/2, so this can qualify as
> > > > > > > > > out-of-bounds.
> > > > > > > > > 2) Don't generate a new bug type, in order to prevent duplicate
> > > > > > > > > reports by some systems, e.g. syzbot.
> > > > > > > > >
> > > > > > > > > KASAN report:
> > > > > > > > >
> > > > > > > > > BUG: KASAN: out-of-bounds in kmalloc_memmove_invalid_size+0x70/0xa0
> > > > > > > > > Read of size 18446744073709551608 at addr ffffff8069660904 by task cat/72
> > > > > > > > >
> > > > > > > > > CPU: 2 PID: 72 Comm: cat Not tainted 5.4.0-rc1-next-20191004ajb-00001-gdb8af2f372b2-dirty #1
> > > > > > > > > Hardware name: linux,dummy-virt (DT)
> > > > > > > > > Call trace:
> > > > > > > > > dump_backtrace+0x0/0x288
> > > > > > > > > show_stack+0x14/0x20
> > > > > > > > > dump_stack+0x10c/0x164
> > > > > > > > > print_address_description.isra.9+0x68/0x378
> > > > > > > > > __kasan_report+0x164/0x1a0
> > > > > > > > > kasan_report+0xc/0x18
> > > > > > > > > check_memory_region+0x174/0x1d0
> > > > > > > > > memmove+0x34/0x88
> > > > > > > > > kmalloc_memmove_invalid_size+0x70/0xa0
> > > > > > > > >
> > > > > > > > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=199341
> > > > > > > > >
> > > > > > > > > Signed-off-by: Walter Wu <walter-zh.wu@xxxxxxxxxxxx>
> > > > > > > > > Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > > > > > > > > Suggested-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > > > > > > > >
> > > > > > > > > diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> > > > > > > > > index 6814d6d6a023..6ef0abd27f06 100644
> > > > > > > > > --- a/mm/kasan/common.c
> > > > > > > > > +++ b/mm/kasan/common.c
> > > > > > > > > @@ -102,7 +102,8 @@ EXPORT_SYMBOL(__kasan_check_write);
> > > > > > > > > #undef memset
> > > > > > > > > void *memset(void *addr, int c, size_t len)
> > > > > > > > > {
> > > > > > > > > - check_memory_region((unsigned long)addr, len, true, _RET_IP_);
> > > > > > > > > + if (!check_memory_region((unsigned long)addr, len, true, _RET_IP_))
> > > > > > > > > + return NULL;
> > > > > > > > >
> > > > > > > > > return __memset(addr, c, len);
> > > > > > > > > }
> > > > > > > > > @@ -110,8 +111,9 @@ void *memset(void *addr, int c, size_t len)
> > > > > > > > > #undef memmove
> > > > > > > > > void *memmove(void *dest, const void *src, size_t len)
> > > > > > > > > {
> > > > > > > > > - check_memory_region((unsigned long)src, len, false, _RET_IP_);
> > > > > > > > > - check_memory_region((unsigned long)dest, len, true, _RET_IP_);
> > > > > > > > > + if (!check_memory_region((unsigned long)src, len, false, _RET_IP_) ||
> > > > > > > > > + !check_memory_region((unsigned long)dest, len, true, _RET_IP_))
> > > > > > > > > + return NULL;
> > > > > > > > >
> > > > > > > > > return __memmove(dest, src, len);
> > > > > > > > > }
> > > > > > > > > @@ -119,8 +121,9 @@ void *memmove(void *dest, const void *src, size_t len)
> > > > > > > > > #undef memcpy
> > > > > > > > > void *memcpy(void *dest, const void *src, size_t len)
> > > > > > > > > {
> > > > > > > > > - check_memory_region((unsigned long)src, len, false, _RET_IP_);
> > > > > > > > > - check_memory_region((unsigned long)dest, len, true, _RET_IP_);
> > > > > > > > > + if (!check_memory_region((unsigned long)src, len, false, _RET_IP_) ||
> > > > > > > > > + !check_memory_region((unsigned long)dest, len, true, _RET_IP_))
> > > > > > > > > + return NULL;
> > > > > > > > >
> > > > > > > > > return __memcpy(dest, src, len);
> > > > > > > > > }
> > > > > > > > > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> > > > > > > > > index 616f9dd82d12..02148a317d27 100644
> > > > > > > > > --- a/mm/kasan/generic.c
> > > > > > > > > +++ b/mm/kasan/generic.c
> > > > > > > > > @@ -173,6 +173,11 @@ static __always_inline bool check_memory_region_inline(unsigned long addr,
> > > > > > > > > if (unlikely(size == 0))
> > > > > > > > > return true;
> > > > > > > > >
> > > > > > > > > + if (unlikely((long)size < 0)) {
> > > > > > > > > + kasan_report(addr, size, write, ret_ip);
> > > > > > > > > + return false;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > if (unlikely((void *)addr <
> > > > > > > > > kasan_shadow_to_mem((void *)KASAN_SHADOW_START))) {
> > > > > > > > > kasan_report(addr, size, write, ret_ip);
> > > > > > > > > diff --git a/mm/kasan/generic_report.c b/mm/kasan/generic_report.c
> > > > > > > > > index 36c645939bc9..ed0eb94cb811 100644
> > > > > > > > > --- a/mm/kasan/generic_report.c
> > > > > > > > > +++ b/mm/kasan/generic_report.c
> > > > > > > > > @@ -107,6 +107,18 @@ static const char *get_wild_bug_type(struct kasan_access_info *info)
> > > > > > > > >
> > > > > > > > > const char *get_bug_type(struct kasan_access_info *info)
> > > > > > > > > {
> > > > > > > > > + /*
> > > > > > > > > + * If access_size is negative numbers, then it has two reasons
> > > > > > > > > + * to be defined as out-of-bounds bug type.
> > > > > > > > > + * 1) Casting negative numbers to size_t would indeed turn up as
> > > > > > > > > + * a 'large' size_t and its value will be larger than ULONG_MAX/2,
> > > > > > > > > + * so that this can qualify as out-of-bounds.
> > > > > > > > > + * 2) Don't generate new bug type in order to prevent duplicate reports
> > > > > > > > > + * by some systems, e.g. syzbot.
> > > > > > > > > + */
> > > > > > > > > + if ((long)info->access_size < 0)
> > > > > > > > > + return "out-of-bounds";
> > > > > > > >
> > > > > > > > "out-of-bounds" is the _least_ frequent KASAN bug type. It won't
> > > > > > > > prevent duplicates. "heap-out-of-bounds" is the frequent one.
> > > > > > >
> > > > > > >
> > > > > > > /*
> > > > > > > * If access_size is negative numbers, then it has two reasons
> > > > > > > * to be defined as out-of-bounds bug type.
> > > > > > > * 1) Casting negative numbers to size_t would indeed turn up as
> > > > > > > * a "large" size_t and its value will be larger than ULONG_MAX/2,
> > > > > > > * so that this can qualify as out-of-bounds.
> > > > > > > * 2) Don't generate new bug type in order to prevent duplicate reports
> > > > > > > * by some systems, e.g. syzbot. "out-of-bounds" is the _least_
> > > > > > > * frequent KASAN bug type. It won't prevent duplicates.
> > > > > > > * "heap-out-of-bounds" is the frequent one.
> > > > > > > */
> > > > > > >
> > > > > > > We directly add it into the comment.
> > > > > >
> > > > > >
> > > > > > OK, let's start from the beginning: why do you return "out-of-bounds" here?
> > > > > >
> > > > > Uh, comment 1 and 2 should explain it. :)
> > > >
> > > > The comment says it will cause duplicate reports. It does not explain
> > > > why you want syzbot to produce duplicate reports and spam kernel
> > > > developers... So why do you want that?
> > > >
> > > We don't generate a new bug type, in order to prevent duplicates in some
> > > systems, e.g. syzbot. Is that right? If yes, then there should not be a
> > > duplicate report.
> > >
> > Sorry, because we don't generate a new bug type, it should be a duplicate
> > report (only one report, which may be oob or invalid size);
> > the goal of the duplicate report is that an invalid size is an oob issue, too.
> >
> > I would not introduce a new bug type.
> > These are parsed and used by some systems, e.g. syzbot. If size is
> > user-controllable, then a new bug type for this will mean 2 bug
> > reports.
>
> To prevent duplicates, the new crash title must not just match _any_
> crash title that kernel can potentially produce. It must match exactly
> the crash that kernel produces for this bug on other input data.
>
> Consider, userspace passes size=123, KASAN produces "heap-out-of-bounds in foo".
> Now userspace passes size=-1 and KASAN produces "invalid-size in foo".
> This will be a duplicate bug report.
> Now if KASAN will produce "out-of-bounds in foo", it will also lead to
> a duplicate report.
> Only iff KASAN will produce "heap-out-of-bounds in foo" for size=-1,
> it will not lead to a duplicate report.
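
A minimal userspace sketch of the title matching described above. It
assumes that two crashes are treated as the same bug only when their
titles are identical; make_title(), same_bug() and title_demo() are
hypothetical helpers for illustration and are not syzbot or KASAN code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a title as "KASAN: <bug type> in <function>". */
static void make_title(char *buf, size_t len, const char *bug_type,
		       const char *func)
{
	snprintf(buf, len, "KASAN: %s in %s", bug_type, func);
}

/* Assumption: two crashes count as one bug only if the titles match exactly. */
static bool same_bug(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}

/* Compare the title seen for size=123 with candidate titles for size=-1. */
static void title_demo(void)
{
	const char *candidates[] = {
		"invalid-size", "out-of-bounds", "heap-out-of-bounds"
	};
	char known[64], title[64];
	size_t i;

	make_title(known, sizeof(known), "heap-out-of-bounds", "foo");
	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		make_title(title, sizeof(title), candidates[i], "foo");
		printf("%s: %s\n", title,
		       same_bug(known, title) ? "same bug" : "duplicate report");
	}
}
```

Calling title_demo() from main() shows that only the candidate with the
"heap-out-of-bounds" bug type compares equal to the title already
produced for size=123; the other two would be filed as separate bugs.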

I think it is not easy to avoid the duplicate report (mentioned above).
As far as I know, KASAN is a memory corruption detector in kernel space;
it should only detect memory corruption and does not distinguish whether
the size was passed in by userspace. If we want to do that, then we may
need to parse the backtrace to check whether it contains copy_from_user()
or a similar function?
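
For reference, the size check under discussion can be sketched in
userspace as follows. This is only an illustration of the
"(long)size < 0" test from the patch, assuming an LP64 system where long
and size_t have the same width; size_is_negative() and size_demo() are
hypothetical names and not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the check added to check_memory_region_inline().
 * A negative length converted to size_t has its top bit set, so it reads
 * as negative again when viewed as a long (assumes LP64).
 */
static bool size_is_negative(size_t size)
{
	return (long)size < 0;
}

/* Show what memmove() would receive when the caller computes a length of -8. */
static void size_demo(void)
{
	long len = -8;			/* e.g. the result of a bad subtraction */
	size_t size = (size_t)len;	/* implicit conversion at the call site */

	printf("size = %zu\n", size);
	assert(size_is_negative(size));	/* would be reported by the new check */
	assert(!size_is_negative(123));	/* ordinary sizes are not affected */
}
```

Calling size_demo() from main() on a 64-bit build prints
size = 18446744073709551608, the same value as in the "Read of size"
line of the KASAN report quoted above.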