Re: [PATCH 4/7] x86/intel_rdt: Implement scheduling support for Intel RDT

From: Peter Zijlstra
Date: Sat May 02 2015 - 14:51:38 EST


On Fri, May 01, 2015 at 06:36:38PM -0700, Vikas Shivappa wrote:
> Adds support for IA32_PQR_ASSOC MSR writes during task scheduling.
>
> The high 32 bits in the per processor MSR IA32_PQR_ASSOC represents the
> CLOSid. During context switch kernel implements this by writing the
> CLOSid of the cgroup to which the task belongs to the CPU's
> IA32_PQR_ASSOC MSR.
>
> For Cache Allocation, this would let the task fill in the cache 'subset'
> represented by the cgroup's Cache bit mask(CBM).
>

Are you guys for real? Have you even looked at the trainwreck this
makes?

> +static inline bool rdt_enabled(void)
> +{
> + return static_key_false(&rdt_enable_key);
> +}

> +/*
> + * rdt_sched_in() - Writes the task's CLOSid to IA32_PQR_MSR
> + * if the current Closid is different than the new one.
> + */
> +static inline void rdt_sched_in(struct task_struct *task)
> +{
> + struct intel_rdt *ir;
> + unsigned int clos;
> +
> + if (!rdt_enabled())
> + return;
> +
> + /*
> + * This needs to be fixed
> + * to cache the whole PQR instead of just CLOSid.
> + * PQR has closid in high 32 bits and CQM-RMID in low 10 bits.
> + * Should not write a 0 to the low 10 bits of PQR
> + * and corrupt RMID.
> + */
> + clos = this_cpu_read(x86_cpu_clos);
> +
> + rcu_read_lock();
> + ir = task_rdt(task);
> + if (ir->clos == clos) {
> + rcu_read_unlock();
> + return;
> + }
> +
> + wrmsr(MSR_IA32_PQR_ASSOC, 0, ir->clos);
> + this_cpu_write(x86_cpu_clos, ir->clos);
> + rcu_read_unlock();
> +}

You inject _ALL_ that into the scheduler hot path. Insane much?

> +
> +#else
> +
> +static inline void rdt_sched_in(struct task_struct *task) {}
> +
> #endif
> #endif
> diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
> index 751bf4b..82ef4b3 100644
> --- a/arch/x86/include/asm/switch_to.h
> +++ b/arch/x86/include/asm/switch_to.h
> @@ -8,6 +8,9 @@ struct tss_struct;
> void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
> struct tss_struct *tss);
>
> +#include <asm/intel_rdt.h>
> +#define post_arch_switch(current) rdt_sched_in(current)
> +
> #ifdef CONFIG_X86_32
>
> #ifdef CONFIG_CC_STACKPROTECTOR

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f9123a8..cacb490 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2241,6 +2241,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> prev_state = prev->state;
> vtime_task_switch(prev);
> finish_arch_switch(prev);
> + post_arch_switch(current);
> perf_event_task_sched_in(prev, current);
> finish_lock_switch(rq, prev);
> finish_arch_post_lock_switch();

Not a word in the Changelog on this hook; that's double fail.

Please _THINK_ before writing code.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/