Re: [syzbot] KASAN: invalid-access Read in copy_page

From: Catalin Marinas
Date: Tue Sep 06 2022 - 11:19:41 EST


On Tue, Sep 06, 2022 at 03:40:59PM +0200, Dmitry Vyukov wrote:
> On Tue, 6 Sept 2022 at 15:24, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > On Mon, Sep 05, 2022 at 11:39:24PM +0200, Andrey Konovalov wrote:
> > > Syzbot reported an issue with MTE tagging of user pages, see the report below.
> > >
> > > Possibly, it's related to your "mm: kasan: Skip unpoisoning of user
> > > pages" series. However, I'm not sure what the issue is.
> > [...]
> > > On Sat, Aug 6, 2022 at 3:31 AM syzbot
> > > <syzbot+c2c79c6d6eddc5262b77@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > BUG: KASAN: invalid-access in copy_page+0x10/0xd0 arch/arm64/lib/copy_page.S:26
> > > > Read at addr f5ff000017f2e000 by task syz-executor.1/2218
> > > > Pointer tag: [f5], memory tag: [f2]
> > [...]
> > > > The buggy address belongs to the physical page:
> > > > page:000000003e6672be refcount:3 mapcount:2 mapping:0000000000000000 index:0xffffffffe pfn:0x57f2e
> > > > memcg:fbff00001ded8000
> > > > anon flags: 0x1ffc2800208001c(uptodate|dirty|lru|swapbacked|arch_2|node=0|zone=0|lastcpupid=0x7ff|kasantag=0xa)
> >
> > It looks like a copy-on-write where the source page is tagged
> > (PG_mte_tagged set) but page_kasan_tag() != 0xff (kasantag == 0xa). The
> > page is also swap-backed. Our current assumption is that
> > page_kasan_tag_reset() should be called on page allocation and we should
> > never end up with a user page without the kasan tag reset.
[...]
> > Does it take long to reproduce this kasan warning?
>
> syzbot finds several such cases every day (200 crashes for the past 35 days):
> https://syzkaller.appspot.com/bug?extid=c2c79c6d6eddc5262b77
> So once it reaches the tested tree, we should have an answer within a day.

That's good to know. BTW, does syzkaller write tags in mmap'ed pages or
does it only issue random syscalls? I'm trying to figure out whether tag
0xf2 was written by the kernel without updating the corresponding
page_kasan_tag(), or whether it was syzkaller recolouring the page.

--
Catalin