Re: [PATCH v3 26/39] arm64: mte: Add in-kernel tag fault handler

From: Andrey Konovalov
Date: Fri Sep 25 2020 - 07:53:10 EST


On Fri, Sep 25, 2020 at 1:47 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Fri, Sep 25, 2020 at 01:26:02PM +0200, Andrey Konovalov wrote:
> > On Fri, Sep 25, 2020 at 12:49 PM Catalin Marinas
> > <catalin.marinas@xxxxxxx> wrote:
> > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > > index a3bd189602df..d110f382dacf 100644
> > > > --- a/arch/arm64/mm/fault.c
> > > > +++ b/arch/arm64/mm/fault.c
> > > > @@ -33,6 +33,7 @@
> > > > #include <asm/debug-monitors.h>
> > > > #include <asm/esr.h>
> > > > #include <asm/kprobes.h>
> > > > +#include <asm/mte.h>
> > > > #include <asm/processor.h>
> > > > #include <asm/sysreg.h>
> > > > #include <asm/system_misc.h>
> > > > @@ -294,6 +295,11 @@ static void die_kernel_fault(const char *msg, unsigned long addr,
> > > > do_exit(SIGKILL);
> > > > }
> > > >
> > > > +static void report_tag_fault(unsigned long addr, unsigned int esr,
> > > > + struct pt_regs *regs)
> > > > +{
> > > > +}
> > >
> > > Do we need to introduce report_tag_fault() in this patch? It's fine but
> > > add a note in the commit log that it will be populated in a subsequent
> > > patch.
> >
> > I did, see the last line of the commit description.
>
> Sorry, I missed that.

No problem!

> > > > +
> > > > static void __do_kernel_fault(unsigned long addr, unsigned int esr,
> > > > struct pt_regs *regs)
> > > > {
> > > > @@ -641,10 +647,40 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
> > > > return 0;
> > > > }
> > > >
> > > > +static void do_tag_recovery(unsigned long addr, unsigned int esr,
> > > > + struct pt_regs *regs)
> > > > +{
> > > > + static bool reported = false;
> > > > +
> > > > + if (!READ_ONCE(reported)) {
> > > > + report_tag_fault(addr, esr, regs);
> > > > + WRITE_ONCE(reported, true);
> > > > + }
> > >
> > > I don't mind the READ_ONCE/WRITE_ONCE here but not sure what they help
> > > with.
> >
> > The fault can happen on multiple cores at the same time, right? In
> > that case without READ/WRITE_ONCE() we'll have a data-race here.
>
> READ/WRITE_ONCE won't magically solve such races. If two CPUs enter
> simultaneously in do_tag_recovery(), they'd both read 'reported' as
> false and both print the fault info.

They won't solve the race condition, but they will solve the data
race. I guess here we don't really care about the race condition, as
printing a tag fault twice is OK. But having a data race here will
lead to KCSAN reports, although won't probably break anything in
practice.

> If you really care about this race, you need to atomically both read and
> update the variable with an xchg() or cmpxchg().