Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined

From: Masahiro Yamada
Date: Tue Oct 01 2019 - 04:27:20 EST


Hi Russell,

On Tue, Oct 1, 2019 at 2:50 AM Russell King - ARM Linux admin
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Sep 30, 2019 at 02:59:25PM +0900, Masahiro Yamada wrote:
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly"):
> >
> > https://lkml.org/lkml/2019/9/26/825
> >
> > I also received a regression report from Nicolas Saenz Julienne:
> >
> > https://lkml.org/lkml/2019/9/27/263
> >
> > This problem has cropped up on arch/arm/configs/bcm2835_defconfig
> > because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. With -Os, the
> > compiler tends not to inline functions. I was able to reproduce
> > it with other boards and defconfig files by manually enabling
> > CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() macro specifically uses the r0, r1, and r2
> > registers. So, uaccess_save_and_enable() and uaccess_restore() must
> > be inlined in order to avoid those registers being overwritten in
> > the callees.
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
> >
> > I want to preserve as much of the compiler's freedom as possible in
> > the inlining decision. So, I changed the function call order instead
> > of adding __always_inline.
> >
> > Call uaccess_save_and_enable() before __p ("r0") is assigned, and
> > uaccess_restore() after the result __e ("r0") has been saved off.
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx>
> > Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@xxxxxxx>
> > Signed-off-by: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
> > ---
> >
> > arch/arm/include/asm/uaccess.h | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> > index 303248e5b990..559f252d7e3c 100644
> > --- a/arch/arm/include/asm/uaccess.h
> > +++ b/arch/arm/include/asm/uaccess.h
> > @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
> > #define __get_user_check(x, p) \
> > ({ \
> > unsigned long __limit = current_thread_info()->addr_limit - 1; \
> > + unsigned int __ua_flags = uaccess_save_and_enable(); \
>
> If the compiler is moving uaccess_save_and_enable(), that's something
> we really don't want

Hmm, with my limited knowledge of compilers, I am not sure whether
such a re-arrangement can actually happen...
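
For reference, the relevant part of __get_user_check() looks roughly
like this (simplified from arch/arm/include/asm/uaccess.h; the size
switch and the call into the size-specific helper are omitted):

    unsigned long __limit = current_thread_info()->addr_limit - 1;
    register typeof(*(p)) __user *__p asm("r0") = (p); /* pointer in r0 */
    register typeof(x) __r2 asm("r2");                 /* value comes back in r2 */
    register unsigned long __l asm("r1") = __limit;    /* limit in r1 */
    register int __e asm("r0");                        /* error code back in r0 */
    unsigned int __ua_flags = uaccess_save_and_enable();
    /* ... branch to the size-specific __get_user_N() helper ... */
    uaccess_restore(__ua_flags);

If uaccess_save_and_enable() ends up as an out-of-line call at this
point, the call itself is free to clobber r0-r3 under the AAPCS, which
destroys __p and __l before the helper ever runs.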

> - the idea is to _minimise_ the number of kernel
> memory accesses between enabling userspace access and performing the
> actual access.
>
> Fixing it in this way widens the window for the kernel to be doing
> something it shouldn't be doing in userspace.
>
> So, the right solution is to ensure that the compiler always inlines
> the uaccess_*() helpers - which should be nothing more than four
> instructions for uaccess_save_and_enable() and two for the
> restore.
>

OK, I will use __always_inline to avoid
any potential behavior change.
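
Concretely, the v2 patch would just change the two helpers in
arch/arm/include/asm/uaccess.h from 'static inline' to
'static __always_inline' (a sketch based on the current definitions;
the function bodies stay as they are):

    static __always_inline unsigned int uaccess_save_and_enable(void)
    {
    #ifdef CONFIG_CPU_SW_DOMAIN_PAN
            unsigned int old_domain = get_domain();

            /* Set the current domain access to permit user accesses */
            set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
                       domain_val(DOMAIN_USER, DOMAIN_CLIENT));

            return old_domain;
    #else
            return 0;
    #endif
    }

    static __always_inline void uaccess_restore(unsigned int flags)
    {
    #ifdef CONFIG_CPU_SW_DOMAIN_PAN
            /* Restore the user access mask */
            set_domain(flags);
    #endif
    }

That keeps the domain-switching instructions inline at every call site,
so the registers set up by __get_user_check() cannot be clobbered by a
call, regardless of -Os or CONFIG_OPTIMIZE_INLINING.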

Thanks.


--
Best Regards
Masahiro Yamada