Re: [PATCH] checkpatch: Check for .byte-spelled insn opcodes documentation on x86

From: Joe Perches
Date: Sat Oct 10 2020 - 19:14:46 EST


On Sat, 2020-10-10 at 18:11 +0200, Borislav Petkov wrote:
> On Sat, Oct 10, 2020 at 08:27:20AM -0700, Joe Perches wrote:
> > Then this could use:
> >
> > /"\s*\.byte\s+(?:0x[0-9a-fA-F]{1,2}\s*,\s*){2,4}/
>
> Yes, this is getting close.
>
> I've tweaked it a bit to:
>
> '/\s*\.byte\s+(?:0x[0-9a-f]{1,2}[\s,]*){2,}/i'
    ^^^                                       ^
    now useless without the "                 matches .BYTE

You probably want (?i:0x[etc...])

I'd prefer to add an upper bound to the {m,n} use.
Unbounded multiple matches {m,} can cause perl aborts.

This regex would also match

.byte 0x020x02

(which admittedly wouldn't compile, but I've seen really
bad patches submitted too)
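
The two objections above can be checked with a small sketch. It uses
Python's re module as a stand-in for checkpatch's Perl, on the assumption
that these particular constructs behave the same in both. The names
TWEAKED, SCOPED and matches() are illustrative only and do not appear in
checkpatch.

```python
import re

# The tweaked pattern from the quoted mail; the trailing /i becomes
# re.IGNORECASE, so it applies to the whole pattern including ".byte".
TWEAKED = re.compile(r'\s*\.byte\s+(?:0x[0-9a-f]{1,2}[\s,]*){2,}',
                     re.IGNORECASE)

# Same pattern with case-insensitivity scoped to the hex literal only,
# as suggested by the (?i:...) form.
SCOPED = re.compile(r'\s*\.byte\s+(?:(?i:0x[0-9a-f]{1,2})[\s,]*){2,}')


def matches(pattern, line):
    """Return True if the pattern matches anywhere in the line."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for text in ('.byte 0x020x02', '.BYTE 0x0F, 0x0B', '.byte 0x0F, 0x0B'):
        print(text, matches(TWEAKED, text), matches(SCOPED, text))
```

Because [\s,]* may match nothing, two hex literals with no separator
still count as two repetitions, which is why ".byte 0x020x02" is accepted.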

> which assumes at least 2 opcode bytes; upper limit can be more than 4.
> It still has some false positives in crypto but I'd say that's good
> enough. I'll play more with it later

A readability convenience would be to add and use:

our $Hex_byte = qr{(?i)0x[0-9a-f]{1,2}\b};

So if the minimum length of the insn .byte block is 2,
with a separating comma then the regex could be:

/\.byte\s+$Hex_byte\s*,\s*$Hex_byte\b/

which I think is pretty readable.
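
A rough way to see what this composed pattern accepts and rejects is the
sketch below, again in Python rather than Perl. HEX_BYTE, BYTE_PAIR and
is_byte_insn() are hypothetical names for illustration. The (?i) inside
the Perl qr{} stays local to $Hex_byte when interpolated, so the Python
version uses the scoped (?i:...) form to keep ".byte" case-sensitive.

```python
import re

# Stand-in for: our $Hex_byte = qr{(?i)0x[0-9a-f]{1,2}\b};
HEX_BYTE = r'(?i:0x[0-9a-f]{1,2})\b'

# Stand-in for: /\.byte\s+$Hex_byte\s*,\s*$Hex_byte\b/
# (the final \b is already supplied by HEX_BYTE itself)
BYTE_PAIR = re.compile(r'\.byte\s+' + HEX_BYTE + r'\s*,\s*' + HEX_BYTE)


def is_byte_insn(line):
    """Return True if the line has .byte followed by two or more
    comma-separated hex bytes."""
    return BYTE_PAIR.search(line) is not None


if __name__ == "__main__":
    samples = [
        '#define ASM_UD2 ".byte 0x0f, 0x0b"',  # two bytes, comma separated
        '.byte 0xf3,0x0f,0xc7,0xf8',           # no spaces after commas
        '.byte 0x020x02',                      # missing comma
        '.byte 0x0f',                          # only one byte
    ]
    for s in samples:
        print(is_byte_insn(s), s)
```

The \b after each hex literal is what rejects the run-together
"0x020x02" case, and the mandatory comma rejects single-byte uses.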

$ git grep -P '\.byte\s+(?i:0x[0-9a-f]{1,2}\s*,\s*0x[0-9a-f]{1,2})\b' -- 'arch/x86/*.[ch]'
arch/x86/include/asm/bug.h:#define ASM_UD0 ".byte 0x0f, 0xff" /* + ModRM (for Intel) */
arch/x86/include/asm/bug.h:#define ASM_UD1 ".byte 0x0f, 0xb9" /* + ModRM */
arch/x86/include/asm/bug.h:#define ASM_UD2 ".byte 0x0f, 0x0b"
arch/x86/include/asm/inst.h: .byte 0x0f, 0xc7
arch/x86/include/asm/intel_pconfig.h:#define PCONFIG ".byte 0x0f, 0x01, 0xc5"
arch/x86/include/asm/mwait.h: asm volatile(".byte 0x0f, 0x01, 0xc8;"
arch/x86/include/asm/mwait.h: asm volatile(".byte 0x0f, 0x01, 0xfa;"
arch/x86/include/asm/mwait.h: asm volatile(".byte 0x0f, 0x01, 0xc9;"
arch/x86/include/asm/mwait.h: asm volatile(".byte 0x0f, 0x01, 0xfb;"
arch/x86/include/asm/mwait.h: asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
arch/x86/include/asm/mwait.h: asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
arch/x86/include/asm/segment.h: ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
arch/x86/include/asm/smap.h:#define __ASM_CLAC ".byte 0x0f,0x01,0xca"
arch/x86/include/asm/smap.h:#define __ASM_STAC ".byte 0x0f,0x01,0xcb"
arch/x86/include/asm/special_insns.h: asm volatile(".byte 0x0f,0x01,0xee\n\t"
arch/x86/include/asm/special_insns.h: asm volatile(".byte 0x0f,0x01,0xef\n\t"
arch/x86/include/asm/special_insns.h: ".byte 0x66, 0x0f, 0xae, 0x30", /* clwb (%%rax) */
arch/x86/include/asm/special_insns.h: asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
arch/x86/include/asm/special_insns.h: asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90"
arch/x86/include/asm/special_insns.h: asm volatile(".byte 0xf, 0x1, 0xe8" ::: "memory");