Re: [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code

From: Conor Dooley
Date: Thu Feb 02 2023 - 05:17:17 EST


Hey Chen, Liao, Bjorn, Heiko,

Heiko certainly has a more complete understanding of the newly added
stuff in insn*.h, but I've attempted to have a look at the insn stuff
that you have added here...

On Fri, Jan 27, 2023 at 09:05:32PM +0800, Chen Guokai wrote:
> From: Liao Chang <liaochang1@xxxxxxxxxx>
>
> These RVI and RVC instruction decoder are used in the free register
> searching algorithm, each instruction of instrumented function needs to
> decode and test if it contains a free register to form AUIPC/JALR.
>
> For RVI instruction format, the position and length of rs1/rs2/rd/opcode
> parts are uniform [1], but RVC instruction formats are complicated, so
> it addresses a series of functions to decode rs1/rs2/rd for RVC [1].
>
> [1] https://github.com/riscv/riscv-isa-manual/releases

Please make these regular link tags, so:
Link: https://github.com/riscv/riscv-isa-manual/releases [1]

> Signed-off-by: Liao Chang <liaochang1@xxxxxxxxxx>
> Co-developed-by: Chen Guokai <chenguokai17@xxxxxxxxxxxxxxxx>
> Signed-off-by: Chen Guokai <chenguokai17@xxxxxxxxxxxxxxxx>
> ---
> arch/riscv/include/asm/bug.h | 5 +-
> arch/riscv/kernel/probes/decode-insn.h | 148 +++++++++++++++++++++++
> arch/riscv/kernel/probes/simulate-insn.h | 42 +++++++
> 3 files changed, 194 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
> index 1aaea81fb141..9c33d3b58225 100644
> --- a/arch/riscv/include/asm/bug.h
> +++ b/arch/riscv/include/asm/bug.h
> @@ -19,11 +19,14 @@
> #define __BUG_INSN_32 _UL(0x00100073) /* ebreak */
> #define __BUG_INSN_16 _UL(0x9002) /* c.ebreak */
>
> +#define RVI_INSN_LEN 4UL
> +#define RVC_INSN_LEN 2UL
> +
> #define GET_INSN_LENGTH(insn) \
> ({ \
> unsigned long __len; \
> __len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ? \
> - 4UL : 2UL; \
> + RVI_INSN_LEN : RVC_INSN_LEN; \
> __len; \
> })
>
> diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
> index 42269a7d676d..785b023a62ea 100644
> --- a/arch/riscv/kernel/probes/decode-insn.h
> +++ b/arch/riscv/kernel/probes/decode-insn.h
> @@ -3,6 +3,7 @@
> #ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H
> #define _RISCV_KERNEL_KPROBES_DECODE_INSN_H
>
> +#include <linux/bitops.h>
> #include <asm/sections.h>
> #include <asm/kprobes.h>
>
> @@ -15,4 +16,151 @@ enum probe_insn {
> enum probe_insn __kprobes
> riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);
>
> +#ifdef CONFIG_KPROBES
> +
> +static inline u16 rvi_rs1(kprobe_opcode_t opcode)
> +{
> + return (u16)((opcode >> 15) & 0x1f);

insn.h has a bunch of defines for this kind of thing, that have all been
reviewed. We definitely should be using those here, at the very least,
rather than having to review all of these numbers for a second time.
eg:
#define RVG_RS1_OPOFF 15

IMO, anything you need here should either be in that file, or added to
that file by this patch.

> +}
> +
> +static inline u16 rvi_rs2(kprobe_opcode_t opcode)

Also a note, these functions look really odd in their callsites:
+ if (riscv_insn_is_c_jr(insn)) {
+ READ_ON(rvc_r_rs1(insn));
+ break;
+ }

Sticking with the existing naming scheme would be great, thanks.
I think these should be moved to insn.h and renamed to:
riscv_insn_extract_rs1(), and ditto for the other things you are newly
adding here.
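
A rough sketch of the shape I have in mind, written as standalone C so it
can be compiled outside the tree. RVG_RS1_OPOFF is the define quoted above;
RVG_RS2_OPOFF, RVG_RD_OPOFF and RVG_REG_MASK are assumed names used here for
illustration only, so check them against what insn.h actually provides:

```c
#include <stdint.h>

/* RVG_RS1_OPOFF is quoted in the review; the rest are assumed names. */
#define RVG_RS1_OPOFF	15
#define RVG_RS2_OPOFF	20	/* assumption: rs2 lives in bits 24:20 */
#define RVG_RD_OPOFF	7	/* assumption: rd lives in bits 11:7 */
#define RVG_REG_MASK	0x1fU	/* hypothetical: 5-bit register field */

/* Extract the rs1 register number from a 32-bit instruction. */
static inline uint32_t riscv_insn_extract_rs1(uint32_t insn)
{
	return (insn >> RVG_RS1_OPOFF) & RVG_REG_MASK;
}

/* Extract the rs2 register number from a 32-bit instruction. */
static inline uint32_t riscv_insn_extract_rs2(uint32_t insn)
{
	return (insn >> RVG_RS2_OPOFF) & RVG_REG_MASK;
}

/* Extract the rd register number from a 32-bit instruction. */
static inline uint32_t riscv_insn_extract_rd(uint32_t insn)
{
	return (insn >> RVG_RD_OPOFF) & RVG_REG_MASK;
}
```

With helpers named like this, a callsite reads as
riscv_insn_extract_rs1(insn) next to riscv_insn_is_c_jr(insn), which matches
the existing scheme.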

> +{
> + return (u16)((opcode >> 20) & 0x1f);
> +}
> +
> +static inline u16 rvi_rd(kprobe_opcode_t opcode)
> +{
> + return (u16)((opcode >> 7) & 0x1f);
> +}
> +
> +static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)

RV_EXTRACT_BTYPE_IMM() already exists and provides the same capability,
no? I think the whole patch here should be moved to insn.h, reuse the
defines there and have the function names changed to match the existing,
similar functions.
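
To illustrate what a define-based B-type decode looks like, here is a
standalone sketch. The SKETCH_* names and sign_extend_from() are hypothetical
and exist only for this example; insn.h has its own defines and the kernel
has sign_extend32(), whose second argument is the 0-based index of the sign
bit. For the 13-bit B-type offset that sign bit is bit 12:

```c
#include <stdint.h>

/* Hypothetical names for the B-type immediate fields (illustration only). */
#define SKETCH_B_IMM_4_1_OPOFF	8	/* insn[11:8]  -> imm[4:1]  */
#define SKETCH_B_IMM_10_5_OPOFF	25	/* insn[30:25] -> imm[10:5] */
#define SKETCH_B_IMM_11_OPOFF	7	/* insn[7]     -> imm[11]   */
#define SKETCH_B_IMM_12_OPOFF	31	/* insn[31]    -> imm[12]   */
#define SKETCH_B_IMM_SIGN_BIT	12	/* top bit of the 13-bit offset */

/* Sign-extend 'value' from the given 0-based sign bit (sign_bit <= 30). */
static inline int32_t sign_extend_from(uint32_t value, unsigned int sign_bit)
{
	uint32_t m = 1U << sign_bit;

	value &= (m << 1) - 1;		/* keep bits sign_bit..0 only */
	return (int32_t)(value ^ m) - (int32_t)m;
}

/* Reassemble the scattered B-type immediate and sign-extend it. */
static inline int32_t sketch_extract_btype_imm(uint32_t insn)
{
	uint32_t imm;

	imm = (((insn >> SKETCH_B_IMM_4_1_OPOFF) & 0xf) << 1) |
	      (((insn >> SKETCH_B_IMM_10_5_OPOFF) & 0x3f) << 5) |
	      (((insn >> SKETCH_B_IMM_11_OPOFF) & 0x1) << 11) |
	      (((insn >> SKETCH_B_IMM_12_OPOFF) & 0x1) << 12);

	return sign_extend_from(imm, SKETCH_B_IMM_SIGN_BIT);
}
```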

> +{
> + u32 imme = 0;
> +
> + imme |= (((opcode >> 8) & 0xf) << 1) |
> + (((opcode >> 25) & 0x3f) << 5) |
> + (((opcode >> 7) & 0x1) << 11) |
> + (((opcode >> 31) & 0x1) << 12);
> +
> + return sign_extend32(imme, 13);
> +}
> +
> +static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)

This is a re-implementation of riscv_insn_extract_jtype_imm() except
without the nice defines etc used there.

> +{
> + u32 imme = 0;
> +
> + imme |= (((opcode >> 21) & 0x3ff) << 1) |
> + (((opcode >> 20) & 0x1) << 11) |
> + (((opcode >> 12) & 0xff) << 12) |
> + (((opcode >> 31) & 0x1) << 20);
> +
> + return sign_extend32(imme, 21);
> +}
> +
> +#ifdef CONFIG_RISCV_ISA_C

As Bjorn pointed out, this guard can go.

> +static inline u16 rvc_r_rs1(kprobe_opcode_t opcode)
> +{
> + return (u16)((opcode >> 2) & 0x1f);

Again, defines exist for all of this stuff already that you can go and
use.
rvc_r_rs1() should be renamed to riscv_insn_extract_csstype_rs1() or
something like that to match the existing users IMO.

Also, perhaps I've missed something, but how does a shift of 2 work for
a CR format rs1? Shouldn't it be a shift of 7?
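
For reference, a small standalone sketch of the CR layout as I read the
spec: rd/rs1 sits in bits 11:7 and rs2 in bits 6:2. The define and function
names below are hypothetical, used only to make the two offsets explicit:

```c
#include <stdint.h>

/* Hypothetical names; CR format has rd/rs1 in bits 11:7, rs2 in bits 6:2. */
#define SKETCH_RVC_CR_RS1_OPOFF	7
#define SKETCH_RVC_CR_RS2_OPOFF	2
#define SKETCH_RVC_REG_MASK	0x1fU

/* rs1 (also rd) of a CR-format compressed instruction. */
static inline uint16_t sketch_rvc_cr_rs1(uint16_t insn)
{
	return (insn >> SKETCH_RVC_CR_RS1_OPOFF) & SKETCH_RVC_REG_MASK;
}

/* rs2 of a CR-format compressed instruction. */
static inline uint16_t sketch_rvc_cr_rs2(uint16_t insn)
{
	return (insn >> SKETCH_RVC_CR_RS2_OPOFF) & SKETCH_RVC_REG_MASK;
}
```

For c.jr ra (0x8082), a shift of 7 yields x1, while a shift of 2 yields the
rs2 field, which is zero for c.jr.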

> +}
> +
> +static inline u16 rvc_r_rs2(kprobe_opcode_t opcode)
> +{
> + return (u16)((opcode >> 2) & 0x1f);
> +}

(snip)

> +static inline u16 rvc_b_rd(kprobe_opcode_t opcode)
> +{
> + return (u16)((opcode >> 7) & 0x7);
> +}

All of these are so common, that I feel you'd be very well served by
defines and some macros.

> +static inline s32 rvc_branch_imme(kprobe_opcode_t opcode)

Similar comments apply here as in the G case, in particular you can use
RVC_EXTRACT_BTYPE_IMM() for the compressed branch immediate, no?
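
For reference, a standalone sketch of the compressed branch (CB format)
offset decode with the field positions named. All SKETCH_* names are made up
for this example, and the layout is taken from the quoted function; the
9-bit offset is sign-extended from bit 8:

```c
#include <stdint.h>

/* Hypothetical names for the CB-format offset fields (illustration only). */
#define SKETCH_CB_IMM_2_1_OPOFF	3	/* insn[4:3]   -> imm[2:1] */
#define SKETCH_CB_IMM_4_3_OPOFF	10	/* insn[11:10] -> imm[4:3] */
#define SKETCH_CB_IMM_5_OPOFF	2	/* insn[2]     -> imm[5]   */
#define SKETCH_CB_IMM_7_6_OPOFF	5	/* insn[6:5]   -> imm[7:6] */
#define SKETCH_CB_IMM_8_OPOFF	12	/* insn[12]    -> imm[8]   */
#define SKETCH_CB_IMM_SIGN	(1U << 8)

/* Decode the branch offset of c.beqz/c.bnez. */
static inline int32_t sketch_rvc_extract_btype_imm(uint16_t insn)
{
	uint32_t imm;

	imm = (((insn >> SKETCH_CB_IMM_2_1_OPOFF) & 0x3) << 1) |
	      (((insn >> SKETCH_CB_IMM_4_3_OPOFF) & 0x3) << 3) |
	      (((insn >> SKETCH_CB_IMM_5_OPOFF) & 0x1) << 5) |
	      (((insn >> SKETCH_CB_IMM_7_6_OPOFF) & 0x3) << 6) |
	      (((insn >> SKETCH_CB_IMM_8_OPOFF) & 0x1) << 8);

	/* Sign-extend from bit 8 without relying on signed shifts. */
	return (int32_t)(imm ^ SKETCH_CB_IMM_SIGN) - (int32_t)SKETCH_CB_IMM_SIGN;
}
```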

> +{
> + u32 imme = 0;
> +
> + imme |= (((opcode >> 3) & 0x3) << 1) |
> + (((opcode >> 10) & 0x3) << 3) |
> + (((opcode >> 2) & 0x1) << 5) |
> + (((opcode >> 5) & 0x3) << 6) |
> + (((opcode >> 12) & 0x1) << 8);
> +
> + return sign_extend32(imme, 9);
> +}
> +
> +static inline s32 rvc_jal_imme(kprobe_opcode_t opcode)

Ditto here, but with RVC_EXTRACT_JTYPE_IMM(), since this one is the
compressed jump immediate?

> +{
> + u32 imme = 0;
> +
> + imme |= (((opcode >> 3) & 0x3) << 1) |
> + (((opcode >> 11) & 0x1) << 4) |
> + (((opcode >> 2) & 0x1) << 5) |
> + (((opcode >> 7) & 0x1) << 6) |
> + (((opcode >> 6) & 0x1) << 7) |
> + (((opcode >> 9) & 0x3) << 8) |
> + (((opcode >> 8) & 0x1) << 10) |
> + (((opcode >> 12) & 0x1) << 11);
> +
> + return sign_extend32(imme, 12);
> +}
> +#endif /* CONFIG_KPROBES */
> +#endif /* CONFIG_RISCV_ISA_C */
> #endif /* _RISCV_KERNEL_KPROBES_DECODE_INSN_H */
> diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
> index a19aaa0feb44..e89747dfabbb 100644
> --- a/arch/riscv/kernel/probes/simulate-insn.h
> +++ b/arch/riscv/kernel/probes/simulate-insn.h
> @@ -28,4 +28,46 @@ bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs);
> bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs);
> bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
>
> +/* RVC(S) instructions contain rs1 and rs2 */
> +__RISCV_INSN_FUNCS(c_sq, 0xe003, 0xa000);
> +__RISCV_INSN_FUNCS(c_sw, 0xe003, 0xc000);
> +__RISCV_INSN_FUNCS(c_sd, 0xe003, 0xe000);

I think all of these should move to insn.h too, and have defines to
match the existing __RISCV_INSN_FUNCS there.
Perhaps Heiko has a more nuanced opinion on this.
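
Something along these lines, as a standalone sketch. SKETCH_INSN_FUNCS() is
a simplified stand-in for __RISCV_INSN_FUNCS() so that the example compiles
outside the kernel, and the RVC_MASK_C_SW/RVC_MATCH_C_SW style names are
hypothetical, modelled on the MASK/MATCH pairs already in insn.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical define names; values are the ones from the quoted hunk. */
#define RVC_MASK_C_SW	0xe003
#define RVC_MATCH_C_SW	0xc000
#define RVC_MASK_C_SD	0xe003
#define RVC_MATCH_C_SD	0xe000

/* Simplified stand-in for the kernel's __RISCV_INSN_FUNCS() helper. */
#define SKETCH_INSN_FUNCS(name, mask, val)			\
static inline bool riscv_insn_is_##name(uint32_t code)		\
{								\
	/* An instruction matches if its fixed bits equal val. */\
	return (code & (mask)) == (val);			\
}

SKETCH_INSN_FUNCS(c_sw, RVC_MASK_C_SW, RVC_MATCH_C_SW)
SKETCH_INSN_FUNCS(c_sd, RVC_MASK_C_SD, RVC_MATCH_C_SD)
```

Naming the mask and match values keeps the raw numbers in one reviewed place
rather than repeated at each user.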

Thanks,
Conor.
