Re: [PATCH v9 3/8] cpuset: Rebuild root domain deadline accounting information

From: Vincent Guittot
Date: Mon Dec 19 2022 - 03:07:43 EST


On Sat, 17 Dec 2022 at 00:35, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> Hi
>
> On 07/19/19 15:59, Juri Lelli wrote:
> > When the topology of root domains is modified by CPUset or CPUhotplug
> > operations information about the current deadline bandwidth held in the
> > root domain is lost.
> >
> > This patch addresses the issue by recalculating the lost deadline
> > bandwidth information by circling through the deadline tasks held in
> > CPUsets and adding their current load to the root domain they are
> > associated with.
> >
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
> > Signed-off-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
> > ---
>
> We see that rebuild_root_domain() can take 10+ ms (I get a max of 20ms quite
> consistently) on suspend/resume.
>
> Do we actually need to rebuild_root_domain() if we're going through
> a suspend/resume cycle?

During suspend to ram, there are cpus hotplug operation but If you use
suspend to idle, you will skip cpus hotplug operation and its
associated rebuild.

>
> ie: would something like the below make sense? We'd skip this logic if
> cpuhp_tasks_frozen is set which indicates it's not a real hotplug operation but
> we're suspending/resuming.
>
>
> Cheers
>
> --
> Qais Yousef
>
>
> --->8---
>
>
> From 4cfd50960ad872c5eb810ad3038eaf840bab5182 Mon Sep 17 00:00:00 2001
> From: Qais Yousef <qyousef@xxxxxxxxxxx>
> Date: Tue, 29 Nov 2022 19:01:52 +0000
> Subject: [PATCH] sched: cpuset: Don't rebuild sched domains on suspend-resume
>
> Commit f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting information")
> enabled rebuilding sched domain on cpuset and hotplug operations to
> correct deadline accounting.
>
> Rebuilding sched domain is a slow operation and we see 10+ ms delays
> on suspend-resume because of that.
>
> Since nothing is expected to change on suspend-resume operation; skip
> rebuilding the sched domains to regain some of the time lost.
>
> Debugged-by: Rick Yiu <rickyiu@xxxxxxxxxx>
> Signed-off-by: Qais Yousef (Google) <qyousef@xxxxxxxxxxx>
> ---
> kernel/cgroup/cpuset.c | 6 ++++++
> kernel/sched/deadline.c | 3 +++
> 2 files changed, 9 insertions(+)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index b474289c15b8..2ff68d625b7b 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1067,6 +1067,9 @@ static void update_tasks_root_domain(struct cpuset *cs)
> struct css_task_iter it;
> struct task_struct *task;
>
> + if (cpuhp_tasks_frozen)
> + return;
> +
> css_task_iter_start(&cs->css, 0, &it);
>
> while ((task = css_task_iter_next(&it)))
> @@ -1084,6 +1087,9 @@ static void rebuild_root_domains(void)
> lockdep_assert_cpus_held();
> lockdep_assert_held(&sched_domains_mutex);
>
> + if (cpuhp_tasks_frozen)
> + return;
> +
> rcu_read_lock();
>
> /*
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 0d97d54276cc..42c1143a3956 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2575,6 +2575,9 @@ void dl_clear_root_domain(struct root_domain *rd)
> {
> unsigned long flags;
>
> + if (cpuhp_tasks_frozen)
> + return;
> +
> raw_spin_lock_irqsave(&rd->dl_bw.lock, flags);
> rd->dl_bw.total_bw = 0;
> raw_spin_unlock_irqrestore(&rd->dl_bw.lock, flags);
> --
> 2.25.1
>