Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang

From: Russell King - ARM Linux admin
Date: Mon Nov 30 2020 - 05:22:22 EST


On Mon, Nov 30, 2020 at 11:12:33AM +0100, Ard Biesheuvel wrote:
> (+ Nico)
>
> On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Mon, 23 Nov 2020 at 08:39, Antony Yu <swpenim@xxxxxxxxx> wrote:
> > >
> > > __do_div64 clobbers the input register r0 in little endian system.
> > > According to the inline assembly document, if an input operand is
> > > > modified, it should be tied to an output operand. This patch can
> > > prevent compilers from reusing r0 register after asm statements.
> > >
> > > Signed-off-by: Antony Yu <swpenim@xxxxxxxxx>
> > > ---
> > > arch/arm/include/asm/div64.h | 5 +++--
> > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> > > index 898e9c78a7e7..809efc51e90f 100644
> > > --- a/arch/arm/include/asm/div64.h
> > > +++ b/arch/arm/include/asm/div64.h
> > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
> > > asm( __asmeq("%0", __xh)
> > > __asmeq("%1", "r2")
> > > __asmeq("%2", "r0")
> > > - __asmeq("%3", "r4")
> > > + __asmeq("%3", "r0")
> > > + __asmeq("%4", "r4")
> > > "bl __do_div64"
> > > - : "=r" (__rem), "=r" (__res)
> > > + : "=r" (__rem), "=r" (__res), "=r" (__n)
> > > : "r" (__n), "r" (__base)
> > > : "ip", "lr", "cc");
> > > *n = __res;
> > > --
> > > 2.23.0
> > >
> >
> > Agree that using r0 as an input operand only is incorrect, and not
> > only on Clang. The compiler might assume that r0 will retain its value
> > across the asm() block, which is obviously not the case.

However, you can _not_ have an asm block that names two outputs using
the same physical register - that's why both the original patch and
the posted v2 will fail.

You also can't mark r0 as clobbered because it's used as an operand
and that is not allowed by gcc.
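
As a point of reference for the "tied to an output operand" rule quoted
above, here is a minimal, architecture-neutral sketch of how an asm
statement normally tells GCC or Clang that an input register is
overwritten when no explicit register variables are involved. The helper
names are hypothetical, and the empty asm templates are only there so the
snippet builds anywhere; this is not the code in div64.h.

```c
#include <stdint.h>

/* Hypothetical helper: the matching constraint "0" places the input in
 * the same register as output operand %0. The template is empty, so the
 * register still holds the input value when it is read back as output. */
static inline uint32_t tied_operand(uint32_t in)
{
	uint32_t out;

	__asm__ __volatile__("" : "=r" (out) : "0" (in));
	return out;
}

/* Hypothetical helper: "+r" declares a single operand that is both read
 * and written, so the compiler will not assume its old value survives. */
static inline uint32_t read_write_operand(uint32_t val)
{
	__asm__ __volatile__("" : "+r" (val));
	return val;
}
```

Both forms express the overwrite through the operand list, so the
register never has to appear in the clobber list.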

The fact is, we have two register variables, __n and __rem, occupying
the same register. Whatever the endianness, r0 will be used for both
__n (input) and __rem (output).
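
A minimal sketch of that situation, assuming a 32-bit ARM build in ARM
(non-Thumb) state: two local register variables are bound to r0, one used
as the asm input and one as the output. The function name and the single
add instruction are hypothetical stand-ins for the call to __do_div64;
other targets fall back to plain C so the snippet still builds.

```c
#include <stdint.h>

/* Hypothetical stand-in for the div64.h pattern: input and output are
 * separate register variables that occupy the same physical register. */
static inline uint32_t add_in_r0(uint32_t a, uint32_t b)
{
#if defined(__arm__) && !defined(__thumb__)
	register uint32_t in  __asm__("r0") = a;	/* input operand in r0 */
	register uint32_t inc __asm__("r2") = b;	/* second input in r2 */
	register uint32_t out __asm__("r0");		/* output operand, also r0 */

	/* %0 and %1 are both r0; the instruction overwrites its own input. */
	__asm__("add %0, %1, %2"
		: "=r" (out)
		: "r" (in), "r" (inc));
	return out;
#else
	/* Assumption: non-ARM fallback so the sketch compiles elsewhere. */
	return a + b;
#endif
}
```

Because the output is a plain "=r" operand rather than an earlyclobber
one, it may share a register with an input, and the value read after the
asm statement comes from the output variable, not the input.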

If the compiler can't work out that a physical register used as an
output operand will be written by the asm block, then the compiler is
quite simply buggy.

The code is correct as it stands; Clang is buggy.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!