Re: [PATCH v6 4/4] riscv: Set unaligned access speed at compile time

From: Charlie Jenkins
Date: Wed Mar 06 2024 - 13:39:04 EST


On Wed, Mar 06, 2024 at 04:19:33PM +0000, Conor Dooley wrote:
> Hey,
>
> On Fri, Mar 01, 2024 at 05:45:35PM -0800, Charlie Jenkins wrote:
> > Introduce Kconfig options to set the kernel unaligned access support.
> > These options provide a non-portable alternative to the runtime
> > unaligned access probe.
> >
> > To support this, the unaligned access probing code is moved into it's
> > own file and gated behind a new RISCV_PROBE_UNALIGNED_ACCESS_SUPPORT
> > option.
> >
> > Signed-off-by: Charlie Jenkins <charlie@xxxxxxxxxxxx>
> > ---
> > arch/riscv/Kconfig | 58 ++++--
> > arch/riscv/include/asm/cpufeature.h | 26 +--
> > arch/riscv/kernel/Makefile | 4 +-
> > arch/riscv/kernel/cpufeature.c | 272 ----------------------------
> > arch/riscv/kernel/sys_hwprobe.c | 21 +++
> > arch/riscv/kernel/traps_misaligned.c | 2 +
> > arch/riscv/kernel/unaligned_access_speed.c | 282 +++++++++++++++++++++++++++++
> > 7 files changed, 369 insertions(+), 296 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index bffbd869a068..60b6de35599d 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -688,27 +688,61 @@ config THREAD_SIZE_ORDER
> > affects irq stack size, which is equal to thread stack size.
> >
> > config RISCV_MISALIGNED
> > - bool "Support misaligned load/store traps for kernel and userspace"
> > + bool
> > select SYSCTL_ARCH_UNALIGN_ALLOW
> > - default y
> > help
> > - Say Y here if you want the kernel to embed support for misaligned
> > - load/store for both kernel and userspace. When disable, misaligned
> > - accesses will generate SIGBUS in userspace and panic in kernel.
> > + Embed support for misaligned load/store for both kernel and userspace.
> > + When disabled, misaligned accesses will generate SIGBUS in userspace
> > + and panic in kernel.
>
> "in the kernel".
>
> > +
> > +choice
> > + prompt "Unaligned Accesses Support"
> > + default RISCV_PROBE_UNALIGNED_ACCESS
> > + help
>
> > + This selects the hardware support for unaligned accesses. This
>
> "This determines what level of support for..."
>
> > + information is used by the kernel to perform optimizations. It is also
> > + exposed to user space via the hwprobe syscall. The hardware will be
> > + probed at boot by default.
> > +
> > +config RISCV_PROBE_UNALIGNED_ACCESS
> > + bool "Probe for hardware unaligned access support"
> > + select RISCV_MISALIGNED
> > + help
> > + During boot, the kernel will run a series of tests to determine the
> > + speed of unaligned accesses. This probing will dynamically determine
> > + the speed of unaligned accesses on the boot hardware.
>
> "on the underlying system"?
>
> > The kernel will
> > + also check if unaligned memory accesses will trap into the kernel and
> > + handle such traps accordingly.
>
> I think I would phrase this to be more understandable to users. I think
> we need to explain why it would trap and what we will do. Maybe
> something like: "if unaligned memory accesses trap into the kernel as
> they are not supported by the system, the kernel will emulate the
> unaligned accesses to preserve the UABI".
>
> > +config RISCV_EMULATED_UNALIGNED_ACCESS
> > + bool "Assume the system expects emulated unaligned memory accesses"
> > + select RISCV_MISALIGNED
> > + help
> > + Assume that the system expects unaligned memory accesses to be
> > + emulated. The kernel will check if unaligned memory accesses will
> > + trap into the kernel and handle such traps accordingly.
>
> I guess the same suggestion applies here, but I think the description
> here isn't quite accurate. This option is basically the same as above,
> but without the speed test, right? It doesn't actually assume emulation
> is required at all, in fact the assumption we make is that if the
> hardware supports unaligned access that access is slow.
>
> I think I'd do:
> ```
> boot "Emulate unaligned access where system support is missing"
> help
> If unaligned accesses trap into the kernel as they are not supported
> by the system, the kernel will emulate the unaligned accesses to
> preserve the UABI. When the underlying system does support unaligned
> accesses, probing at boot is not done and unaligned accesses are
> assumed to be slow.

Great suggestions thank you. I think I will change up the second
sentence a little bit to be "When the underlying system does support
unaligned accesses, the unaligned accesses are assumed to be slow."

>
> > +config RISCV_SLOW_UNALIGNED_ACCESS
> > + bool "Assume the system supports slow unaligned memory accesses"
> > + depends on NONPORTABLE
> > + help
> > + Assume that the system supports slow unaligned memory accesses. The
> > + kernel may not be able to run at all on systems that do not support
> > + unaligned memory accesses.
>
> ...and userspace programs cannot use unaligned access either, I think
> that is worth mentioning.
>
> >
> > config RISCV_EFFICIENT_UNALIGNED_ACCESS
> > - bool "Assume the CPU supports fast unaligned memory accesses"
> > + bool "Assume the system supports fast unaligned memory accesses"
> > depends on NONPORTABLE
> > select DCACHE_WORD_ACCESS if MMU
> > select HAVE_EFFICIENT_UNALIGNED_ACCESS
> > help
> > - Say Y here if you want the kernel to assume that the CPU supports
> > - efficient unaligned memory accesses. When enabled, this option
> > - improves the performance of the kernel on such CPUs. However, the
> > - kernel will run much more slowly, or will not be able to run at all,
> > - on CPUs that do not support efficient unaligned memory accesses.
> > + Assume that the system supports fast unaligned memory accesses. When
> > + enabled, this option improves the performance of the kernel on such
> > + systems. However, the kernel will run much more slowly, or will not
> > + be able to run at all, on systems that do not support efficient
> > + unaligned memory accesses.
> >
> > - If unsure what to do here, say N.
> > +endchoice
> >
> > endmenu # "Platform type"
>
> > +#if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
> > DECLARE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);
> >
> > static __always_inline bool has_fast_unaligned_accesses(void)
> > {
> > return static_branch_likely(&fast_unaligned_access_speed_key);
> > }
> > +#elif defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
> > +static __always_inline bool has_fast_unaligned_accesses(void)
> > +{
> > + return true;
> > +}
> > +#else
> > +static __always_inline bool has_fast_unaligned_accesses(void)
> > +{
> > + return false;
> > +}
> > +#endif
>
> These tree could just be one function with if(IS_ENABLED), whatever code
> gets made dead should be optimised out.
>

Sure, will do.

> > diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
> > index a7c56b41efd2..dad02f5faec3 100644
> > --- a/arch/riscv/kernel/sys_hwprobe.c
> > +++ b/arch/riscv/kernel/sys_hwprobe.c
> > @@ -147,8 +147,10 @@ static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext)
> > return (pair.value & ext);
> > }
> >
> > +#if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
> > static u64 hwprobe_misaligned(const struct cpumask *cpus)
> > {
> > + return RISCV_HWPROBE_MISALIGNED_FAST;
>
> This hack is still here.

Oh no! I removed it locally but it snuck back in...

>
> > int cpu;
> > u64 perf = -1ULL;
> >
> > @@ -169,6 +171,25 @@ static u64 hwprobe_misaligned(const struct cpumask *cpus)
> >
> > return perf;
> > }
>
> > +#elif defined(CONFIG_RISCV_EMULATED_UNALIGNED_ACCESS)
> > +static u64 hwprobe_misaligned(const struct cpumask *cpus)
> > +{
> > + if (unaligned_ctl_available())
> > + return RISCV_HWPROBE_MISALIGNED_EMULATED;
> > + else
> > + return RISCV_HWPROBE_MISALIGNED_SLOW;
> > +}
> > +#elif defined(CONFIG_RISCV_SLOW_UNALIGNED_ACCESS)
> > +static u64 hwprobe_misaligned(const struct cpumask *cpus)
> > +{
> > + return RISCV_HWPROBE_MISALIGNED_SLOW;
> > +}
> > +#elif defined(CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS)
> > +static u64 hwprobe_misaligned(const struct cpumask *cpus)
> > +{
> > + return RISCV_HWPROBE_MISALIGNED_FAST;
> > +}
> > +#endif
>
> Same applies to these three functions.
>
> Thanks,
> Conor.

Thank you, I will send out a new version shortly.

- Charlie