Re: [PATCH RFC v4 2/3] sched: Avoid placing RT threads on cores handling long softirqs

From: Alexander Gordeev
Date: Mon Oct 17 2022 - 08:41:02 EST


On Mon, Oct 03, 2022 at 11:20:32PM +0000, John Stultz wrote:
> From: Connor O'Brien <connoro@xxxxxxxxxx>

Hi John, Connor,

I took a cursory look and have a couple of hopefully meaningful
comments, but mostly questions.

[...]

> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 55f39c8f4203..3c628db807c8 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1599,6 +1599,44 @@ static void yield_task_rt(struct rq *rq)
> #ifdef CONFIG_SMP
> static int find_lowest_rq(struct task_struct *task);
>
> +#ifdef CONFIG_RT_SOFTIRQ_OPTIMIZATION
> +#define __use_softirq_opt 1
> +/*
> + * Return whether the given cpu is currently non-preemptible
> + * while handling a potentially long softirq, or if the current
> + * task is likely to block preemptions soon because it is a
> + * ksoftirq thread that is handling slow softirq.

What are "slow" softirqs in this context, compared to "long" ones?

> + */
> +static bool cpu_busy_with_softirqs(int cpu)
> +{
> + u32 softirqs = per_cpu(active_softirqs, cpu) |
> + __cpu_softirq_pending(cpu);
> + struct task_struct *cpu_ksoftirqd = per_cpu(ksoftirqd, cpu);
> + struct task_struct *curr;
> + struct rq *rq = cpu_rq(cpu);
> + int ret;
> +
> + rcu_read_lock();
> + curr = READ_ONCE(rq->curr); /* unlocked access */

select_task_rq_rt() takes the lock and reads curr already,
before calling this function. I think this could be
decomposed in a better way.
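E.g. (a rough, untested sketch, only to illustrate the idea and
keeping your existing check as-is) the caller could hand curr over,
since it already did the rcu_read_lock() + READ_ONCE(rq->curr) part:

static bool cpu_busy_with_softirqs(int cpu, struct task_struct *curr)
{
	/* approximate mask of softirqs active or pending on @cpu */
	u32 softirqs = per_cpu(active_softirqs, cpu) |
		       __cpu_softirq_pending(cpu);

	return (softirqs & LONG_SOFTIRQ_MASK) &&
	       (curr == per_cpu(ksoftirqd, cpu) ||
		preempt_count() & SOFTIRQ_MASK);
}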

> + ret = (softirqs & LONG_SOFTIRQ_MASK) &&
> + (curr == cpu_ksoftirqd ||

This extra line break seems unnecessary.

> + preempt_count() & SOFTIRQ_MASK);

Could you please clarify this whole check in more detail?

preempt_count() here reads the preempt count of the local CPU,
so what is the point in checking whether a remote CPU is handling
a "long" softirq while the local one is handling any softirq?

> + rcu_read_unlock();

Why does ret need to be calculated under the lock?
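I.e. curr is only compared by pointer here and never dereferenced,
so (just a sketch) the computation could be moved past the unlock:

	rcu_read_lock();
	curr = READ_ONCE(rq->curr); /* unlocked access */
	rcu_read_unlock();

	return (softirqs & LONG_SOFTIRQ_MASK) &&
	       (curr == cpu_ksoftirqd ||
		preempt_count() & SOFTIRQ_MASK);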

> + return ret;
> +}
> +#else
> +#define __use_softirq_opt 0
> +static bool cpu_busy_with_softirqs(int cpu)
> +{
> + return false;
> +}
> +#endif /* CONFIG_RT_SOFTIRQ_OPTIMIZATION */
> +
> +static bool rt_task_fits_cpu(struct task_struct *p, int cpu)

To me, the new name is unfortunate, since it strips away the notion
of the reason. Instead of "CPU un/fits, because of capacity" it now
reads as "CPU un/fits, because of ..." what?

> +{
> + return !cpu_busy_with_softirqs(cpu) && rt_task_fits_capacity(p, cpu);

I guess the order needs to be swapped, as rt_task_fits_capacity()
is rather "quick" while cpu_busy_with_softirqs() is rather "slow".
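I.e. something like:

	return rt_task_fits_capacity(p, cpu) &&
	       !cpu_busy_with_softirqs(cpu);

so the cheap capacity check can short-circuit the more expensive one.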

> +}
> +
> static int
> select_task_rq_rt(struct task_struct *p, int cpu, int flags)
> {
> @@ -1894,14 +1934,17 @@ static int find_lowest_rq(struct task_struct *task)
> return -1; /* No other targets possible */
>
> /*
> - * If we're on asym system ensure we consider the different capacities
> - * of the CPUs when searching for the lowest_mask.
> + * If we're using the softirq optimization or if we are
> + * on asym system, ensure we consider the softirq processing
> + * or different capacities of the CPUs when searching for the
> + * lowest_mask.
> */
> - if (static_branch_unlikely(&sched_asym_cpucapacity)) {
> + if (__use_softirq_opt ||

Why use __use_softirq_opt and not IS_ENABLED(CONFIG_RT_SOFTIRQ_OPTIMIZATION)?
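I.e. (just a suggestion):

	if (IS_ENABLED(CONFIG_RT_SOFTIRQ_OPTIMIZATION) ||
	    static_branch_unlikely(&sched_asym_cpucapacity)) {

which could avoid introducing the extra __use_softirq_opt define.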

> + static_branch_unlikely(&sched_asym_cpucapacity)) {
>
> ret = cpupri_find_fitness(&task_rq(task)->rd->cpupri,
> task, lowest_mask,
> - rt_task_fits_capacity);
> + rt_task_fits_cpu);
> } else {
>
> ret = cpupri_find(&task_rq(task)->rd->cpupri,
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index c8a6913c067d..35ee79dd8786 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -60,6 +60,13 @@ static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp
>
> DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
>
> +/*
> + * active_softirqs -- per cpu, a mask of softirqs that are being handled,
> + * with the expectation that approximate answers are acceptable and therefore
> + * no synchronization.
> + */
> +DEFINE_PER_CPU(u32, active_softirqs);

I guess all active_softirqs uses need to be coupled with an
IS_ENABLED(CONFIG_RT_SOFTIRQ_OPTIMIZATION) check.
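E.g. something like (untested):

	if (IS_ENABLED(CONFIG_RT_SOFTIRQ_OPTIMIZATION))
		__this_cpu_write(active_softirqs, pending);

so the tracking is compiled out (or at least optimized away) when the
option is disabled.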

> const char * const softirq_to_name[NR_SOFTIRQS] = {
> "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
> "TASKLET", "SCHED", "HRTIMER", "RCU"
> @@ -551,6 +558,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> restart:
> /* Reset the pending bitmask before enabling irqs */
> set_softirq_pending(0);
> + __this_cpu_write(active_softirqs, pending);
>
> local_irq_enable();
>
> @@ -580,6 +588,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> pending >>= softirq_bit;
> }
>
> + __this_cpu_write(active_softirqs, 0);
> if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> __this_cpu_read(ksoftirqd) == current)
> rcu_softirq_qs();

Thanks!