Re: [PATCH v2] x86/asm: fix assembly constraints in bitops

From: Ingo Molnar
Date: Sat Apr 06 2019 - 04:21:12 EST



* Alexander Potapenko <glider@xxxxxxxxxx> wrote:

> On Fri, Apr 5, 2019 at 11:39 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> >
> > * Alexander Potapenko <glider@xxxxxxxxxx> wrote:
> >
> > > 1. Use memory clobber in bitops that touch arbitrary memory
> > >
> > > Certain bit operations that read/write bits take a base pointer and an
> > > arbitrarily large offset to address the bit relative to that base.
> > > Inline assembly constraints aren't expressive enough to tell the
> > > compiler that the assembly directive is going to touch a specific memory
> > > location of unknown size, therefore we have to use the "memory" clobber
> > > to indicate that the assembly is going to access memory locations other
> > > than those listed in the inputs/outputs.
> > > To indicate that BTR/BTS instructions don't necessarily touch the first
> > > sizeof(long) bytes of the argument, we also move the address to assembly
> > > inputs.
> > >
> > > This particular change leads to size increase of 124 kernel functions in
> > > a defconfig build. For some of them the diff is in NOP operations, other
> > > end up re-reading values from memory and may potentially slow down the
> > > execution. But without these clobbers the compiler is free to cache
> > > the contents of the bitmaps and use them as if they weren't changed by
> > > the inline assembly.
> > >
> > > 2. Use byte-sized arguments for operations touching single bytes.
> > >
> > > Passing a long value to ANDB/ORB/XORB instructions makes the compiler
> > > treat sizeof(long) bytes as being clobbered, which isn't the case. This
> > > may theoretically lead to worse code in the case of heavy optimization.
> > >
> > > Signed-off-by: Alexander Potapenko <glider@xxxxxxxxxx>
> > > Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > > Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxx>
> > > Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Cc: James Y Knight <jyknight@xxxxxxxxxx>
> > > ---
> > > v2:
> > > -- renamed the patch
> > > -- addressed comment by Peter Zijlstra: don't use "+m" for functions
> > > returning void
> > > -- fixed input types for operations touching single bytes
> > > ---
> > > arch/x86/include/asm/bitops.h | 41 +++++++++++++++--------------------
> > > 1 file changed, 18 insertions(+), 23 deletions(-)
> >
> > I'm wondering what the primary motivation for the patch is:
> >
> > - Does it fix an actual miscompilation, or only a theoretical miscompilation?
> Depends on what we name an actual miscompilation.
> I've built a defconfig kernel and looked through some of the functions
> generated by GCC 7.3.0 with and without this clobber, and didn't spot
> any miscompilations.
> However there is a (trivial) theoretical case where this code leads to
> miscompilation: https://lkml.org/lkml/2019/3/28/393 using just GCC
> 8.3.0 with -O2.
> It isn't hard to imagine someone writes such a function in the kernel someday.
>
> So the primary motivation is to fix an existing misuse of the asm
> directive, which happens to work in certain configurations now, but
> isn't guaranteed to work under different circumstances.

Thanks, that's all the context this patch needed: the miscompilation is
real, demonstrated, and the pattern of your testcase doesn't look overly
weird.

Also the 'cost' side of your patch appears to be pretty low, the
defconfig-64 bloat from the stricter asm() constraints appears to be very
small and not measurable in .text section size with bog standard GCC 7.3:

text data bss dec hex filename sha1
19565909 4974784 1826888 26367581 192565d vmlinux.before 07f91fa36d2b477b245c7fee283dd3b7
19565909 4974784 1826888 26367581 192565d vmlinux.after 51a3f9a5fec4c29198953c06672a61a5

The allmodconfig-64 .text bloat appears to be zero as well:

text data bss dec hex filename sha1
27058207 34467402 30863436 92389045 581beb5 vmlinux.before b38c470330f38779ab0be08fd7d90053
27058207 34467402 30863436 92389045 581beb5 vmlinux.after 36c99be7cbebc13899ae22ced32fa2ec

Note that defconfig/allmodconfig is only a fraction of the kernel's
complexity, especially if we consider the myriads of build time options
in the Kconfig space plus the compiler variants out there.

Anyway, I agree that this needs fixing, so I have queued up your
constraint fixes for x86/urgent, with a -stable tag.

We'll probably not push it to Linus until next week (-rc5 time), to make
sure there are no surprises, and also to allow for any late review
feedback.

Thanks,

Ingo