Re: [syzbot] [fs?] [mm?] KCSAN: data-race in __filemap_remove_folio / folio_mapping (2)

From: Dmitry Vyukov
Date: Mon Apr 24 2023 - 10:22:03 EST


On Mon, 24 Apr 2023 at 16:10, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 24, 2023 at 03:49:04PM +0200, Dmitry Vyukov wrote:
> > On Mon, 24 Apr 2023 at 15:21, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Apr 24, 2023 at 09:38:43AM +0200, Dmitry Vyukov wrote:
> > > > On Mon, 24 Apr 2023 at 09:19, syzbot
> > > > <syzbot+606f94dfeaaa45124c90@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > If I am reading this correctly, it can lead to NULL derefs in
> > > > folio_mapping() if folio->mapping is read twice. I think
> > > > folio->mapping reads/writes need to use READ/WRITE_ONCE if racy.
> > >
> > > You aren't reading it correctly.
> > >
> > >         mapping = folio->mapping;
> > >         if ((unsigned long)mapping & PAGE_MAPPING_FLAGS)
> > >                 return NULL;
> > >
> > >         return mapping;
> > >
> > > The racing write is storing NULL. So it might return NULL or it might
> > > return the old mapping. Either way, the caller has to be prepared for
> > > NULL to be returned.
> > >
> > > It's a false positive, but probably worth silencing with a READ_ONCE().
> >
> > Yes, but the end of the function does not limit effects of races. I
>
> I thought it did. I was under the impression that the compiler was not
> allowed to extract loads from within the function and move them outside.
> Maybe that changed since C99.
>
> > to this:
> >
> >         if (!((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) && folio->mapping)
> >                 if (test_bit(AS_UNEVICTABLE, &folio->mapping->flags))
> >
> > which does crash.
>
> Yes, if the compiler is allowed to do that, then that's a possibility.

C11/C++11 simply say that any data race renders the behavior of the
whole program undefined. There is no discussion about values,
functions, or anything else.

Before that there was no notion of data races, so it wasn't possible
to talk about their possible effects and restrict them. But I don't
think there ever was an intention to impose any practical restrictions
around function boundaries. That would mean that inlining could only
run as the last optimization pass, which would inhibit tons of
optimizations. Users would throw such a compiler away.
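
To make the READ_ONCE() suggestion concrete, here is a minimal user-space sketch of the pattern, not the actual kernel code. The struct layouts, the PAGE_MAPPING_FLAGS and AS_UNEVICTABLE values, the simplified READ_ONCE() macro, and the helper names folio_mapping_once() and folio_unevictable() are all stand-ins assumed for illustration. The point is only that the pointer is loaded exactly once into a local, so the flag test, the NULL check and the dereference all see the same value, even after inlining.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified user-space stand-ins for kernel definitions.
 * These are assumptions for illustration, not the real headers.
 */
#define PAGE_MAPPING_FLAGS 0x3UL
#define AS_UNEVICTABLE     3
/* Force a single load through a volatile access (GNU __typeof__). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct address_space { unsigned long flags; };
struct folio { struct address_space *mapping; };

/* Hypothetical variant of folio_mapping() that loads the field once. */
static struct address_space *folio_mapping_once(struct folio *folio)
{
	struct address_space *mapping = READ_ONCE(folio->mapping);

	/* Low bits set means this is not a plain address_space pointer. */
	if ((unsigned long)mapping & PAGE_MAPPING_FLAGS)
		return NULL;

	return mapping;
}

/* Hypothetical caller: only ever uses the single loaded value. */
static int folio_unevictable(struct folio *folio)
{
	struct address_space *mapping;

	assert(folio != NULL);
	mapping = folio_mapping_once(folio);

	/* A concurrent store of NULL is seen either here or not at all. */
	return mapping && ((mapping->flags >> AS_UNEVICTABLE) & 1UL);
}
```

With a plain load, the compiler may reload folio->mapping between the check and the dereference. With the volatile access it has to perform exactly one load per READ_ONCE(), so the caller sees either the old mapping or NULL, and never a mix of the two.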