Re: Re: [PATCH V2 2/3] riscv: Add ARCH_HAS_PRETCHW support with Zibop

From: Andrew Jones
Date: Fri Jan 05 2024 - 08:31:47 EST


On Tue, Jan 02, 2024 at 11:45:08AM +0100, Andrew Jones wrote:
>
> s/Zibop/Zicbop/ <<<$SUBJECT
>
> On Sun, Dec 31, 2023 at 03:29:52AM -0500, guoren@xxxxxxxxxx wrote:
> > From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> >
> > Enable Linux prefetchw primitive with Zibop cpufeature, which preloads
>
> Also s/Zibop/Zicbop/ here
>
> > cache line into L1 cache for the next write operation.
> >
> > Signed-off-by: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Guo Ren <guoren@xxxxxxxxxx>
> > ---
> > arch/riscv/include/asm/processor.h | 16 ++++++++++++++++
> > 1 file changed, 16 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> > index f19f861cda54..8d3a2ab37678 100644
> > --- a/arch/riscv/include/asm/processor.h
> > +++ b/arch/riscv/include/asm/processor.h
> > @@ -13,6 +13,9 @@
> > #include <vdso/processor.h>
> >
> > #include <asm/ptrace.h>
> > +#include <asm/insn-def.h>
> > +#include <asm/alternative-macros.h>
> > +#include <asm/hwcap.h>
> >
> > #ifdef CONFIG_64BIT
> > #define DEFAULT_MAP_WINDOW (UL(1) << (MMAP_VA_BITS - 1))
> > @@ -106,6 +109,19 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
> > #define KSTK_EIP(tsk) (task_pt_regs(tsk)->epc)
> > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->sp)
> >
> > +#ifdef CONFIG_RISCV_ISA_ZICBOP
> > +#define ARCH_HAS_PREFETCHW
> > +
> > +#define PREFETCHW_ASM(x) \
> > + ALTERNATIVE(__nops(1), CBO_PREFETCH_W(x, 0), 0, \
> > + RISCV_ISA_EXT_ZICBOP, CONFIG_RISCV_ISA_ZICBOP)
> > +
> > +
> > +static inline void prefetchw(const void *x)
> > +{
> > + __asm__ __volatile__(PREFETCHW_ASM(%0) : : "r" (x) : "memory");
> > +}
>
> Shouldn't we create an interface which exposes the offset input of
> the instruction, allowing a sequence of calls to be unrolled? But
> I guess that could be put off until there's a need for it.

If we did expose offset, then, because it must be constant and also must
only have bits 5-11 set, then we could add a static assert. Something like

#define prefetchw_offset(base, offset) \
({ \
static_assert(__builtin_constant_p(offset) && !(offset & ~GENMASK(11, 5))); \
__asm__ __volatile__(PREFETCHW_ASM(%0, %1) : : "r" (x), "I" (offset) : "memory"); \
})

Probably overkill though...

Thanks,
drew