Re: [patch 2/3] x86/process: Optimize TIF_BLOCKSTEP switch

From: Andy Lutomirski
Date: Thu Dec 15 2016 - 12:29:35 EST


On Thu, Dec 15, 2016 at 8:44 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> Provide and use a separate helper for toggling the DEBUGCTLMSR_BTF bit
> instead of doing it open coded with a branch and eventually evaluating
> boot_cpu_data twice.
>
> x86_64:
>    text    data     bss     dec     hex
>    3694    8505      16   12215    2fb7 Before
>    3662    8505      16   12183    2f97 After
>
> i386:
>    text    data     bss     dec     hex
>    5986    9388    1804   17178    431a Before
>    5906    9388    1804   17098    42ca After
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/processor.h | 12 ++++++++++++
> arch/x86/kernel/process.c | 10 ++--------
> 2 files changed, 14 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -676,6 +676,18 @@ static inline void update_debugctlmsr(un
>  	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
>  }
>
> +static inline void toggle_debugctlmsr(unsigned long mask)
> +{
> +	unsigned long msrval;
> +
> +#ifndef CONFIG_X86_DEBUGCTLMSR
> +	if (boot_cpu_data.x86 < 6)
> +		return;
> +#endif
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, msrval);
> +	wrmsrl(MSR_IA32_DEBUGCTLMSR, msrval ^ mask);
> +}
> +
> +

This scares me. If the MSR ever gets out of sync with the TIF flag,
the XOR will flip the bit the wrong way and this will malfunction.
And IIRC the MSR is highly magical, and the CPU clears it all by
itself under a variety of not-so-well-documented circumstances.

How about adding a real feature bit and doing:

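	if (!static_cpu_has(X86_FEATURE_BLOCKSTEP))
		return;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, msrval);
	/* Drop the stale BTF bit, then set it from the incoming task's flag. */
	msrval &= ~DEBUGCTLMSR_BTF;
	msrval |= ((tifn >> TIF_BLOCKSTEP) & 1UL) << DEBUGCTLMSR_BIT;
	wrmsrl(MSR_IA32_DEBUGCTLMSR, msrval);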
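
To make the difference concrete, here is a rough user-space sketch of the
two styles. It is not kernel code: fake_debugctl is a plain variable
standing in for the MSR, and the helper names are made up for
illustration. The only hardware fact assumed is that BTF is bit 1 of
IA32_DEBUGCTL. Each sequence turns BTF on, simulates the CPU clearing it
on its own, and then tries to turn it off again.

```c
#include <stdbool.h>
#include <stdint.h>

/* BTF is bit 1 of IA32_DEBUGCTL. */
#define BTF_SHIFT 1
#define BTF_MASK  (UINT64_C(1) << BTF_SHIFT)

/* Hypothetical stand-in for the MSR; user space cannot rdmsr/wrmsr. */
static uint64_t fake_debugctl;

/* Toggle style, as in the quoted patch: flips whatever is there now. */
static void btf_toggle(void)
{
	fake_debugctl ^= BTF_MASK;
}

/* Explicit style: clear the bit, then set it from the wanted state. */
static void btf_set(bool want)
{
	uint64_t val = fake_debugctl;

	val &= ~BTF_MASK;
	val |= (uint64_t)want << BTF_SHIFT;
	fake_debugctl = val;
}

/* Simulated side effect: the CPU clears BTF behind the kernel's back. */
static void cpu_clears_btf(void)
{
	fake_debugctl &= ~BTF_MASK;
}

/* Returns the final BTF state when the caller wanted it off. */
static bool toggle_after_spontaneous_clear(void)
{
	fake_debugctl = 0;
	btf_toggle();		/* switch in a task with the flag set */
	cpu_clears_btf();
	btf_toggle();		/* switch to a task without the flag */
	return (fake_debugctl & BTF_MASK) != 0;
}

/* Same sequence, but writing the wanted state explicitly. */
static bool explicit_after_spontaneous_clear(void)
{
	fake_debugctl = 0;
	btf_set(true);
	cpu_clears_btf();
	btf_set(false);
	return (fake_debugctl & BTF_MASK) != 0;
}
```

With the toggle, the second flip lands on a bit that is already clear,
so BTF ends up set for a task that never asked for it. The explicit
version ends with the bit clear no matter what the CPU did in between.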

--Andy