Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y

From: Will Deacon
Date: Wed Jul 01 2020 - 06:19:47 EST


On Tue, Jun 30, 2020 at 09:25:03PM +0200, Arnd Bergmann wrote:
> On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will@xxxxxxxxxx> wrote:
> > +#define __READ_ONCE(x) \
> > +({ \
> > + int atomic = 1; \
> > + union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u; \
> > + typeof(&(x)) __x = &(x); \
> > + switch (sizeof(x)) { \
> ...
> > + atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x); \
> > +})
>
> This expands (x) nine times (five in __unqual_scalar_typeof()), which can
> lead to significant code bloat after preprocessing if something passes a
> compound expression into READ_ONCE().
> The compiler works it out eventually, but we've seen an actual slowdown
> in compile speed from this recently, especially on clang.
>
> I think if you move the
>
> typeof(&(x)) __x = &(x);
>
> line first, all other instances can use typeof(*__x) instead of typeof(x)
> and avoid this problem.

Cheers, I was only thinking about side-effects when I wrote this, but
bloating build time is very unpopular, so I'll go with your suggestion.
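
For illustration, a minimal user-space sketch of the reordering suggested
above, not the actual patch: the pointer is taken first, so (x) is expanded
only on that one line and everything after it refers to typeof(*__x).
READ_ONCE_SKETCH, slots, calls, next_index() and read_slot_once() are
hypothetical names made up for this example, and it assumes GNU C
(statement expressions and __typeof__, which the kernel spells typeof).

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a READ_ONCE()-style macro; NOT the kernel's
 * __READ_ONCE(). The argument is expanded twice, both on the first line;
 * the cast below only refers to __x.
 */
#define READ_ONCE_SKETCH(x)                                     \
({                                                              \
        __typeof__(&(x)) __x = &(x);                            \
        *(volatile __typeof__(*__x) *)__x;                      \
})

static int calls;
static int slots[4] = { 10, 20, 30, 40 };

/* Side-effecting helper: counts how often the argument is evaluated. */
static int next_index(void)
{
        return calls++;
}

/* Passes a compound expression with a side effect into the macro. */
static int read_slot_once(void)
{
        return READ_ONCE_SKETCH(slots[next_index()]);
}
```

The operand of __typeof__ is not evaluated here, so next_index() runs once
per read even though the text of the argument appears twice.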

> Once we make gcc-4.9 the minimum version,
> this could be further improved to
>
> __auto_type __x = &(x);

Is anybody working on moving to 4.9? I've seen the mails from Linus
championing it, but I thought there was a RHEL release still in support
that people might care about?
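
For reference, a rough sketch of what the __auto_type form quoted above
could look like once the compiler floor allows it. This is an assumption
about the shape, not code from the series: READ_ONCE_AUTO_SKETCH,
auto_slots, auto_calls, auto_next_index() and auto_read_slot_once() are
made-up names, and it needs GNU C with __auto_type (gcc >= 4.9 or clang).

```c
#include <assert.h>

/*
 * Hypothetical variant using __auto_type; NOT the kernel's macro.
 * The argument now appears exactly once in the macro body.
 */
#define READ_ONCE_AUTO_SKETCH(x)                                \
({                                                              \
        __auto_type __p = &(x);                                 \
        *(volatile __typeof__(*__p) *)__p;                      \
})

static int auto_calls;
static int auto_slots[4] = { 10, 20, 30, 40 };

/* Side-effecting helper: counts how often the argument is evaluated. */
static int auto_next_index(void)
{
        return auto_calls++;
}

/* Passes a compound expression with a side effect into the macro. */
static int auto_read_slot_once(void)
{
        return READ_ONCE_AUTO_SKETCH(auto_slots[auto_next_index()]);
}
```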

Will