Re: [PATCH v7] PM: sleep: Expose last succeeded resumed timestamp in sysfs

From: Rafael J. Wysocki
Date: Thu Jan 25 2024 - 15:19:31 EST


On Thu, Jan 25, 2024 at 1:43 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> On Mon, 22 Jan 2024 18:08:22 -0800
> Brian Norris <briannorris@xxxxxxxxxxxx> wrote:
>
> > On Fri, Jan 19, 2024 at 1:08 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > On Wed, Jan 17, 2024 at 1:07 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > >
> > > > Gentle ping.
> > > >
> > > > I would like to know whether this is enough or whether I should add more info/update.
> > >
> > > I still am not sure what this is going to be useful for.
> > >
> > > Do you have a specific example?
> >
> > Since there seems to be some communication gap here, I'll give it a try.
> >
> > First, I'll paste the key phrase of its use case from the cover letter:
> >
> > "we would like to know how long the resume processes are taken in kernel
> > and in user-space"
> >
> > This is a "system measurement" question, for use in tests (e.g., in a
> > test lab for CI or for pre-release testing, where we suspend
> > Chromebooks, wake them back up, and measure how long the wakeup took)
> > or for user-reported metrics (e.g., similar statistics from real
> > users' systems, if they've agreed to automatically report usage
> > statistics, back to Google). We'd like to know how long it takes for a
> > system to wake up, so we can detect when there are problems that lead
> > to a slow system-resume experience. The user experience includes both
> > time spent in the kernel and time spent after user space has thawed
> > (and is spending time in potentially complex power and display manager
> > stacks) before a Chromebook's display lights back up.
>
> Thanks, Brian, for explaining. This correctly describes how we are
> using this to measure the resume process duration.
>
> > If I understand the whole of Masami's work correctly, I believe we're
> > taking "timestamps parsed out of dmesg" (or potentially out of ftrace,
> > trace events, etc.) to measure the kernel side, plus "timestamp
> > provided here in CLOCK_MONOTONIC" and "timestamp determined in our
> > power/display managers" to measure user space.
>
> Yes, I decided to decouple the kernel and user space because the clock
> subsystem is adjusted when resuming. So for the kernel, we will use
> the local clock (which is not exposed to user space), and use
> CLOCK_MONOTONIC for user space.

The problem with this split is that you cannot know how much time
elapses between the "successful kernel resume time" and the time when
user space gets to resume.

As of this patch, the kernel timestamp is taken when the kernel is
about to thaw user space and some user space tasks may start running
right away.

Some other tasks, however, will wait for what happens next in the
kernel (because it is not done with resuming yet) and some of them
will wait until explicitly asked to resume by the resume process IIUC.

Your results depend on which tasks participate in the "user
experience", so to speak.  If they are the tasks that wait to be
kicked by the resume process, the kernel timestamp taken as described
above is useless for them, because quite a lot still happens in the
kernel before they get kicked.

Moreover, some tasks will wait for certain device drivers to get ready
after the rest of the system resumes and that may still take some more
time after the kernel has returned to the process driving the system
suspend-resume.

I'm not sure if there is a single point which can be used as a "user
space resume start" time for every task, which is why I'm not
convinced about this patch.
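
To illustrate the concern, here is a minimal sketch in Python. It is not
taken from the patch: the "seconds.nanoseconds" timestamp format, the
task names and all the numbers are assumptions made up for illustration.
It only shows that the same kernel timestamp yields a different "user
space resume" interval for each task, depending on when that task
actually gets to run.

```python
import time
from decimal import Decimal


def parse_timestamp(text: str) -> Decimal:
    """Parse a 'seconds.nanoseconds' string into seconds.

    The format is an assumption for this sketch, not the patch's ABI.
    """
    return Decimal(text.strip())


def now_monotonic() -> Decimal:
    """Return the current CLOCK_MONOTONIC time in seconds."""
    return Decimal(time.clock_gettime_ns(time.CLOCK_MONOTONIC)) / Decimal(10**9)


def gaps_after_kernel_resume(kernel_ts: Decimal, task_ready: dict) -> dict:
    """For each task, compute the time from the kernel timestamp to the
    point where that task reported itself ready (both CLOCK_MONOTONIC)."""
    return {name: ts - kernel_ts for name, ts in task_ready.items()}


# Hypothetical values: one task runs right after the thaw, another one
# waits to be kicked by the resume process and for a driver to get ready.
kernel_ts = parse_timestamp("1234.500000000")
task_ready = {
    "runs_right_after_thaw": Decimal("1234.520"),
    "waits_for_resume_process": Decimal("1235.100"),
}
gaps = gaps_after_kernel_resume(kernel_ts, task_ready)
for name, gap in gaps.items():
    print(f"{name}: {gap} s after the kernel timestamp")
```

For the second task in this made-up example, part of the computed
interval is still time spent in the kernel, not in user space, and the
timestamp alone cannot tell how much.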

BTW, there is a utility called sleepgraph that measures the kernel
part of the system suspend-resume. It does its best to measure it
very precisely and uses different techniques for that. Also, it is
included in the kernel source tree. Can you please have a look at it
and see how much there is in common between it and your tools? Maybe
there are some interfaces that can be used in common, or maybe it
could benefit from some interfaces that you are planning to add.