Re: [tip:locking/core] sched/cputime: Fix invalid gtime in proc

From: Frederic Weisbecker
Date: Mon Dec 07 2015 - 11:21:48 EST


On Fri, Dec 04, 2015 at 03:53:13AM -0800, tip-bot for Hiroshi Shimamoto wrote:
> Commit-ID: 2541117b0cf79977fa11a0d6e17d61010677bd7b
> Gitweb: http://git.kernel.org/tip/2541117b0cf79977fa11a0d6e17d61010677bd7b
> Author: Hiroshi Shimamoto <h-shimamoto@xxxxxxxxxxxxx>
> AuthorDate: Thu, 19 Nov 2015 16:47:28 +0100
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Fri, 4 Dec 2015 10:18:49 +0100
>
> sched/cputime: Fix invalid gtime in proc
>
> /proc/<pid>/stat shows an invalid gtime when the thread is running in
> guest mode. When vtime accounting is not enabled, we cannot get a valid
> delta. The delta is calculated as now - tsk->vtime_snap, but
> tsk->vtime_snap is only updated when vtime accounting is enabled at runtime.
>
> This patch makes task_gtime() just return gtime, without computing the
> buggy, nonexistent tickless delta, when vtime accounting is not enabled.
>
> Use context_tracking_is_enabled() to check whether vtime accounting is
> active on some CPU; only in that case do we need to check the tickless
> delta. This fixes the gtime regression on machines not running nohz full.
>
> The test kernel was built with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and
> CONFIG_NO_HZ_FULL_ALL=n, and booted without nohz_full.
>
> I started and then stopped a busy loop in a VM and watched the gtime on
> the host, dumping the 43rd field (gtime) once per second:
>
> # while :; do awk '{print $3" "$43}' /proc/3955/task/4014/stat; sleep 1; done
> S 4348
> R 7064566
> R 7064766
> R 7064967
> R 7065168
> S 4759
> S 4759
>
> While the busy loop is running, it returns a bogus large value.
>
> After applying this patch, we see the correct gtime.
>
> # while :; do awk '{print $3" "$43}' /proc/10913/task/10956/stat; sleep 1; done
> S 5338
> R 5365
> R 5465
> R 5566
> R 5666
> S 5726
> S 5726
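
For anyone reproducing the measurement: the awk loops above pick $43 by
splitting on whitespace, which shifts if the task's comm contains a space.
Below is a minimal sketch in C of a more robust extraction, assuming the
field numbering from proc(5), where field 43 is guest_time in clock ticks.
parse_gtime() is a hypothetical helper name, not an existing API.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return field 43 (guest_time, per proc(5) numbering) from one line of
 * /proc/<pid>/task/<tid>/stat, or -1 if the line is too short or malformed.
 * parse_gtime() is a hypothetical helper, not a kernel or libc function.
 */
long long parse_gtime(const char *line)
{
	/* comm (field 2) may contain spaces and ')': resume after the last ')'. */
	const char *p = strrchr(line, ')');
	int field;

	if (!p)
		return -1;
	p++;

	/* The first token after comm is field 3 (state). */
	for (field = 3; ; field++) {
		while (*p == ' ')
			p++;
		if (*p == '\0' || *p == '\n')
			return -1;
		if (field == 43)
			return strtoll(p, NULL, 10);
		/* Skip over the current field. */
		while (*p != '\0' && *p != ' ' && *p != '\n')
			p++;
	}
}
```

Read the stat file into a buffer with fgets() once per second and pass the
buffer to parse_gtime() to get the same column as the awk loops.
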
>
> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@xxxxxxxxxxxxx>
> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Chris Metcalf <cmetcalf@xxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
> Cc: Mike Galbraith <efault@xxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Link: http://lkml.kernel.org/r/1447948054-28668-2-git-send-email-fweisbec@xxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> ---

Thanks for applying the patchset!

However, we may want to backport this one: it's a regression fix affecting
kernels built with CONFIG_NO_HZ_FULL=y but booted without nohz_full, which
is the default setup on the vast majority of distros.

Thanks.

> kernel/sched/cputime.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 26a5446..05de80b 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -788,6 +788,9 @@ cputime_t task_gtime(struct task_struct *t)
>  	unsigned int seq;
>  	cputime_t gtime;
>
> +	if (!context_tracking_is_enabled())
> +		return t->gtime;
> +
>  	do {
>  		seq = read_seqbegin(&t->vtime_seqlock);
>
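
To illustrate why the early return matters, here is a simplified user-space
model of the logic described in the changelog. It is only a sketch, not the
kernel code: struct task_model, gtime_unpatched() and gtime_patched() are
hypothetical names, the seqlock retry loop is left out, and the
tracking_enabled flag stands in for context_tracking_is_enabled().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few task_struct fields involved. */
struct task_model {
	uint64_t gtime;      /* guest time accounted so far */
	uint64_t vtime_snap; /* last snapshot; stale when vtime accounting is off */
	bool in_guest;       /* thread is currently running in guest mode */
};

/* Before the fix: the delta is always added for a thread running in guest. */
uint64_t gtime_unpatched(const struct task_model *t, uint64_t now)
{
	uint64_t gtime = t->gtime;

	if (t->in_guest)
		gtime += now - t->vtime_snap; /* bogus if vtime_snap was never updated */
	return gtime;
}

/* After the fix: skip the delta when vtime accounting is not active anywhere. */
uint64_t gtime_patched(const struct task_model *t, uint64_t now,
		       bool tracking_enabled)
{
	if (!tracking_enabled)
		return t->gtime;
	return gtime_unpatched(t, now);
}
```

With a stale vtime_snap, the unpatched variant adds a huge delta for as long
as the thread sits in guest mode, and drops back to plain gtime once it
leaves. That matches the R vs. S pattern in the first trace of the changelog.
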