Re: [PATCH v2] powerpc: fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX

From: Balbir Singh
Date: Sat Nov 25 2017 - 19:08:12 EST


On Thu, Nov 23, 2017 at 11:04 PM, Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
> Christophe LEROY <christophe.leroy@xxxxxx> writes:
>> Le 22/11/2017 Ã 12:48, Michael Ellerman a Ãcrit :
>>> Christophe LEROY <christophe.leroy@xxxxxx> writes:
>>>> Le 22/11/2017 Ã 00:07, Balbir Singh a Ãcrit :
>>>>> On Wed, Nov 22, 2017 at 1:28 AM, Christophe Leroy
>>>>> <christophe.leroy@xxxxxx> wrote:
>>>>>> On powerpc32, patch_instruction() is called by apply_feature_fixups()
>>>>>> which is called from early_init()
> ...
>>>>>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>>>>>> index c9de03e0c1f1..d469224c4ada 100644
>>>>>> --- a/arch/powerpc/lib/code-patching.c
>>>>>> +++ b/arch/powerpc/lib/code-patching.c
>>>>>> @@ -146,11 +147,8 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>>>>>> * During early early boot patch_instruction is called
>>>>>> * when text_poke_area is not ready, but we still need
>>>>>> * to allow patching. We just do the plain old patching
>>>>>> - * We use slab_is_available and per cpu read * via this_cpu_read
>>>>>> - * of text_poke_area. Per-CPU areas might not be up early
>>>>>> - * this can create problems with just using this_cpu_read()
>>>>>> */
>>>>>> - if (!slab_is_available() || !this_cpu_read(text_poke_area))
>>>>>> + if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
>>>>>> return __patch_instruction(addr, instr);
>>>>>
>>>>> On ppc64, we call apply_feature_fixups() in early_setup() after we've
>>>>> relocated ourselves. Sorry for missing the ppc32 case. I would like to
>>>>> avoid PTRRELOC when unnecessary.
>>>>
>>>> What do you suggest then ?
>>>>
>>>> Some #ifdef PPC32 around that ?
>>>
>>> No I don't think that improves anything.
>>>
>>> I think the comment about per-cpu not being up is wrong, you'll just get
>>> the static version of text_poke_area, which should be NULL. So we don't
>>> need the slab_available() check anyway.
>>>
>>> So I'll take this as-is.
>>>
>>> Having said that I absolutely hate PTRRELOC, so if it starts spreading
>>> we will have to come up with something less bug prone.
>>
>> Would something like that be the solution ?
>
> I don't love that actual patch, there's a lot of churn just for one
> flag.
>
> But the idea is not so bad.
>
> In fact I don't think we ever need to use the text_poke_area when we
> call do_feature_fixups().
>
> Most of the calls are in early boot.
>
> The exception is for modules, but when we do the fixups *of the module*,
> the module text is not mapped read only yet.
>
> So I think we can just do something like below.
>
> cheers
>
>
> diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
> index abef812de7f8..1090024e8519 100644
> --- a/arch/powerpc/include/asm/code-patching.h
> +++ b/arch/powerpc/include/asm/code-patching.h
> @@ -31,6 +31,7 @@ unsigned int create_cond_branch(const unsigned int *addr,
> unsigned long target, int flags);
> int patch_branch(unsigned int *addr, unsigned long target, int flags);
> int patch_instruction(unsigned int *addr, unsigned int instr);
> +int raw_patch_instruction(unsigned int *addr, unsigned int instr);
>
> int instr_is_relative_branch(unsigned int instr);
> int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr);
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index d469224c4ada..d1eb24cbef58 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -23,7 +23,7 @@
> #include <asm/code-patching.h>
> #include <asm/setup.h>
>
> -static int __patch_instruction(unsigned int *addr, unsigned int instr)
> +int raw_patch_instruction(unsigned int *addr, unsigned int instr)
> {
> int err;
>
> @@ -148,8 +148,8 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
> * when text_poke_area is not ready, but we still need
> * to allow patching. We just do the plain old patching
> */
> - if (!this_cpu_read(*PTRRELOC(&text_poke_area)))
> - return __patch_instruction(addr, instr);
> + if (!this_cpu_read(text_poke_area))
> + return raw_patch_instruction(addr, instr);
>
> local_irq_save(flags);
>
> @@ -184,7 +184,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>
> int patch_instruction(unsigned int *addr, unsigned int instr)
> {
> - return __patch_instruction(addr, instr);
> + return raw_patch_instruction(addr, instr);
> }
>
> #endif /* CONFIG_STRICT_KERNEL_RWX */
> diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
> index 41cf5ae273cf..0872d60ede10 100644
> --- a/arch/powerpc/lib/feature-fixups.c
> +++ b/arch/powerpc/lib/feature-fixups.c
> @@ -62,7 +62,7 @@ static int patch_alt_instruction(unsigned int *src, unsigned int *dest,
> }
> }
>
> - patch_instruction(dest, instr);
> + raw_patch_instruction(dest, instr);
>
> return 0;
> }
> @@ -91,7 +91,7 @@ static int patch_feature_section(unsigned long value, struct fixup_entry *fcur)
> }
>
> for (; dest < end; dest++)
> - patch_instruction(dest, PPC_INST_NOP);
> + raw_patch_instruction(dest, PPC_INST_NOP);
>
> return 0;
> }
> @@ -129,7 +129,7 @@ void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
>
> for (; start < end; start++) {
> dest = (void *)start + *start;
> - patch_instruction(dest, PPC_INST_LWSYNC);
> + raw_patch_instruction(dest, PPC_INST_LWSYNC);
> }
> }
>
> @@ -147,7 +147,7 @@ static void do_final_fixups(void)
> length = (__end_interrupts - _stext) / sizeof(int);
>
> while (length--) {
> - patch_instruction(dest, *src);
> + raw_patch_instruction(dest, *src);
> src++;
> dest++;
> }

This looks more promising, but there is a subtle dependence between
marking areas as R/O/X and the raw_patch_ins* bits

I saw that Michael has merged that patch as is, I guess we get to
continue to optimise :)

Balbir