Re: [PATCH v3 05/12] arm64: csum: Disable KASAN for do_csum()

From: Will Deacon
Date: Wed Apr 15 2020 - 16:10:21 EST


On Wed, Apr 15, 2020 at 08:43:05PM +0100, Will Deacon wrote:
> On Wed, Apr 15, 2020 at 08:42:16PM +0200, Arnd Bergmann wrote:
> > On Wed, Apr 15, 2020 at 7:28 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > > On Wed, Apr 15, 2020 at 05:52:11PM +0100, Will Deacon wrote:
> > > > do_csum() over-reads the source buffer and therefore abuses
> > > > READ_ONCE_NOCHECK() to avoid tripping up KASAN. In preparation for
> > > > READ_ONCE_NOCHECK() becoming a macro, and therefore losing its
> > > > '__no_sanitize_address' annotation, just annotate do_csum() explicitly
> > > > and fall back to normal loads.
> > >
> > > I'm confused by this. The whole point of READ_ONCE_NOCHECK() is that it
> > > isn't checked by KASAN, so if that semantic is removed it has no reason
> > > to exist.
> > >
> > > Changing that will break the unwind/stacktrace code across multiple
> > > architectures. IIRC they use READ_ONCE_NOCHECK() for two reasons:
> > >
> > > 1. Races with concurrent modification, as might happen when a thread's
> > > stack is corrupted. Allowing the unwinder to bail out after a sanity
> > > check means the resulting report is more useful than a KASAN splat in
> > > the unwinder. I made the arm64 unwinder robust to this case.
> > >
> > > 2. I believe that the frame record itself /might/ be poisoned by KASAN,
> > > since it's not meant to be an accessible object at the C language
> > > level. I could be wrong about this, and would have to check.
> >
> > I thought the main reason was deadlocks when a READ_ONCE()
> > is called inside of code that is part of the KASAN handling. If
> > READ_ONCE() ends up recursively calling itself, the kernel
> > tends to crash once it overflows its stack.
>
> That was also my understanding.
>
> > > I would like to keep the unwinding robust in the first case, even if the
> > > second case doesn't apply, and I'd prefer to not mark the entirety of
> > > the unwinding code as unchecked as that's sufficiently large and subtle
> > > that it could have nasty bugs.
> > >
> > > Is there any way we can keep something like READ_ONCE_NOCHECK() around even
> > > if we have to give it reduced functionality relative to READ_ONCE()?
> > >
> > > I'm not entirely sure why READ_ONCE_NOCHECK() had to go, so if there's a
> > > particular pain point I'm happy to take a look.
> >
> > As I understood, only this particular instance was removed, not all of
> > them.
>
> Right, but the problem is that whether the NOCHECK version gets checked
> or not now depends on the caller, since it's all just a macro. If we want
> to fix this, then we could force the nocheck variant to return unsigned
> long, which simplifies things a lot (completely untested):
>
>
> #define READ_ONCE(x) \
> ({ \
> 	compiletime_assert_rwonce_type(x); \
> 	__READ_ONCE_SCALAR(x); \
> })
>
> /* Out of line, so the annotation no longer depends on the caller. */
> unsigned long __no_sanitize_address
> kasan_nocheck_read_once_ul(const volatile void *p)
> {
> 	/* A void pointer can't be dereferenced, so cast it first. */
> 	return READ_ONCE(*(unsigned long *)p);
> }
>
> /* Please don't use this */
> #define READ_ONCE_NOCHECK(x)	kasan_nocheck_read_once_ul(&(x))
>

Urgh, scratch that. Trying to instantiate READ_ONCE() in compiler.h
causes a circular header-file dependency between linux/compiler.h
and asm-generic/barrier.h thanks to smp_read_barrier_depends().

Time to dust off that patch I had splitting up compiler.h.
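
For reference, the shape of the helper idea can still be tried outside the
kernel, where the header cycle doesn't apply. This is only a rough user-space
sketch, not the kernel implementation: NO_SANITIZE_ADDRESS,
nocheck_read_once_ul(), READ_ONCE_NOCHECK_UL() and demo_read_frame_slot() are
made-up names, a plain volatile load stands in for READ_ONCE(), and the
attribute spelling assumes GCC or Clang.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute spelling; other compilers get a no-op. */
#if defined(__GNUC__) || defined(__clang__)
#define NO_SANITIZE_ADDRESS \
	__attribute__((no_sanitize_address)) __attribute__((noinline))
#else
#define NO_SANITIZE_ADDRESS
#endif

/*
 * Hypothetical stand-in for the helper in the quoted sketch. Because it
 * is a real, out-of-line function, the "don't instrument this" property
 * lives here rather than in whichever function expands the macro.
 */
static NO_SANITIZE_ADDRESS unsigned long
nocheck_read_once_ul(const volatile void *p)
{
	/* Cast before the load: a void pointer cannot be dereferenced. */
	return *(const volatile unsigned long *)p;
}

/* Hypothetical macro: always yields unsigned long, whatever x's type is. */
#define READ_ONCE_NOCHECK_UL(x)	nocheck_read_once_ul(&(x))

/* Usage: read one word-sized slot from a fake two-word frame record. */
static unsigned long demo_read_frame_slot(void)
{
	unsigned long frame[2] = { 0x1234UL, 0x5678UL };
	unsigned long lr = READ_ONCE_NOCHECK_UL(frame[1]);

	assert(lr == 0x5678UL);	/* the value is read back unchanged */
	return lr;
}
```

The trade-off is the one mentioned above: forcing the return type to unsigned
long keeps the helper trivial, but callers can only use it for word-sized
objects.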

Will