Re: [PATCH 1/2] x86: notify hypervisor about guest entering s2idle state

From: Limonciello, Mario
Date: Mon Jun 20 2022 - 12:32:45 EST


On 6/20/2022 10:43, Grzegorz Jaszczyk wrote:
On Thu, 16 Jun 2022 at 18:58, Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:

On 6/16/2022 11:48, Sean Christopherson wrote:
On Wed, Jun 15, 2022, Grzegorz Jaszczyk wrote:
On Fri, 10 Jun 2022 at 16:30, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
MMIO or PIO for the actual exit, there's nothing special about hypercalls. As for
enumerating to the guest that it should do something, why not add a new ACPI_LPS0_*
function? E.g. something like

static void s2idle_hypervisor_notify(void)
{
	if (lps0_dsm_func_mask > 0)
		acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_HYPERVISOR_NOTIFY,
					lps0_dsm_func_mask, lps0_dsm_guid);
}

Great, thank you for your suggestion! I will try this approach and
come back. Since this will be the main change in the next version,
would it be OK with you if I add a Suggested-by: Sean Christopherson
<seanjc@xxxxxxxxxx> tag?

If you want, but there's certainly no need to do so. But I assume you or someone
at Intel will need to get formal approval for adding another ACPI LPS0 function?
I.e. isn't there work to be done outside of the kernel before any patches can be
merged?

There are 3 different LPS0 GUIDs in use. An Intel one, an AMD (legacy)
one, and a Microsoft one. They all have their own specs, and so if this
was to be added I think all 3 need to be updated.

Yes, I think this will not be easy to achieve.


As this is Linux-specific hypervisor behavior, I don't know that you would
be able to convince Microsoft to update theirs either.

How about using acpi_s2idle_dev_ops? There is a prepare() call and a
restore() call that is set for each handler. The only consumer of this
ATM I'm aware of is the amd-pmc driver, but it's done like a
notification chain so that a bunch of drivers can hook in if they need to.

Then you can have this notification path, and the associated ACPI device
it calls out to, be its own driver.

Thank you for your suggestion. Just to be sure that I've understood
your idea correctly:
1) It will require extending acpi_s2idle_dev_ops with something like a
hypervisor_notify() call, since the existing prepare() is called from the
end of acpi_s2idle_prepare_late(), which is too early, as described in
one of the previous messages (between acpi_s2idle_prepare_late() and the
place where we use the hypercall there are several places where the
suspend could be canceled; otherwise we could probably try to trap on
another acpi_sleep_run_lps0_dsm() occurrence from
acpi_s2idle_prepare_late()).


The idea for prepare() was it would be the absolute last thing before the s2idle loop was run. You're sure that's too early? It's basically the same thing as having a last stage new _DSM call.

What about adding a new abort() extension to acpi_s2idle_dev_ops? Then you could catch the cancelled suspend case still and take corrective action (if that action is different than what restore() would do).
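
To make the abort() idea concrete, here is a rough userspace sketch, not kernel code. The struct and function names are made up for illustration, and the abort member is the hypothetical extension; it does not exist in acpi_s2idle_dev_ops today. It only models the ordering: prepare() runs, and if the suspend is cancelled before the idle loop, abort() runs instead of the normal restore().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; abort() is the proposed extension. */
struct s2idle_ops_sketch {
	void (*prepare)(void);
	void (*restore)(void);
	void (*abort)(void);	/* assumption: not an existing kernel callback */
};

/* What the hypervisor believes about the guest (mock state). */
enum hv_state { HV_GUEST_RUNNING, HV_GUEST_S2IDLE };
static enum hv_state hv_view = HV_GUEST_RUNNING;
static int abort_calls;

static void notify_prepare(void) { hv_view = HV_GUEST_S2IDLE; }
static void notify_restore(void) { hv_view = HV_GUEST_RUNNING; }
static void notify_abort(void)   { hv_view = HV_GUEST_RUNNING; abort_calls++; }

/*
 * Model one suspend attempt. Returns true if the idle loop was reached,
 * false if the suspend was cancelled after prepare() had already run.
 */
static bool suspend_cycle_sketch(const struct s2idle_ops_sketch *ops,
				 bool cancelled)
{
	assert(ops != NULL);

	if (ops->prepare)
		ops->prepare();

	if (cancelled) {
		/* Corrective action; fall back to restore() if no abort(). */
		if (ops->abort)
			ops->abort();
		else if (ops->restore)
			ops->restore();
		return false;
	}

	/* The s2idle loop would run here. */

	if (ops->restore)
		ops->restore();
	return true;
}
```

Either way the hypervisor's view ends up back at "running"; the difference is only that the cancelled path is visible to the handler.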

2) Using the newly introduced acpi_s2idle_dev_ops hypervisor_notify() call
will allow registering a handler from the Intel x86/intel/pmc/core.c driver
and/or the AMD x86/amd-pmc.c driver. Therefore we will only need to get
Intel and/or AMD approval for extending the ACPI LPS0 _DSM method,
correct?


Right now the only thing that hooks prepare()/restore() is the amd-pmc driver (unless Intel's PMC had a change I didn't catch yet).

I don't think you should be changing any existing drivers but rather introduce another platform driver for this specific case.

So it would be something like this:

acpi_s2idle_prepare_late
-> prepare()
--> AMD: amd_pmc handler for prepare()
--> Intel: intel_pmc handler for prepare() (conceptual)
--> HYPE0001 device: new driver's prepare() routine

So the platform driver would match the HYPE0001 device to load, and it wouldn't do anything other than provide a prepare()/restore() handler for your case.
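
The flow above can be mocked up in userspace roughly as follows. This is only a sketch under assumptions: the struct is a stand-in for acpi_s2idle_dev_ops, the mock_* helpers are invented names that mimic the registration and call sites, and the "register" the handler writes is a plain variable standing in for whatever MMIO/PIO location the VMM would trap on. The values written are made up as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's acpi_s2idle_dev_ops. */
struct s2idle_dev_ops {
	struct s2idle_dev_ops *next;
	void (*prepare)(void);
	void (*restore)(void);
};

static struct s2idle_dev_ops *s2idle_handlers;	/* chain of handlers */

/* Mimics handler registration: push the ops onto the chain. */
static void mock_register_lps0_dev(struct s2idle_dev_ops *ops)
{
	assert(ops != NULL);
	ops->next = s2idle_handlers;
	s2idle_handlers = ops;
}

/* Mimics the tail of acpi_s2idle_prepare_late(): call every prepare(). */
static void mock_s2idle_prepare_late(void)
{
	for (struct s2idle_dev_ops *h = s2idle_handlers; h; h = h->next)
		if (h->prepare)
			h->prepare();
}

/* Mimics the resume side: call every restore(). */
static void mock_s2idle_restore_early(void)
{
	for (struct s2idle_dev_ops *h = s2idle_handlers; h; h = h->next)
		if (h->restore)
			h->restore();
}

/* Hypothetical trap location for the HYPE0001 device (assumption). */
static volatile uint32_t hype_notify_reg;

#define HYPE_S2IDLE_ENTRY	1u	/* made-up value */
#define HYPE_S2IDLE_EXIT	2u	/* made-up value */

/* The new driver's handlers: nothing but the notification itself. */
static void hype_prepare(void) { hype_notify_reg = HYPE_S2IDLE_ENTRY; }
static void hype_restore(void) { hype_notify_reg = HYPE_S2IDLE_EXIT; }

static struct s2idle_dev_ops hype_s2idle_ops = {
	.prepare = hype_prepare,
	.restore = hype_restore,
};
```

In a real driver the registration would happen in probe() once the HYPE0001 device is matched; here a caller would simply do mock_register_lps0_dev(&hype_s2idle_ops) and then run the two mock call sites.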

You don't need to change any existing specs. If anything a new spec to go with this new ACPI device would be made. Someone would need to reserve the ID and such for it, but I think you can mock it up in advance.

I wonder if this will be feasible, so I am just thinking out loud:
is there no other mechanism that could be suggested and used upstream
so that we could notify the hypervisor/VMM about the guest entering the
s2idle state? Especially since such a _DSM function would be introduced
only to trap on some fake MMIO/PIO access and would be useful only for
guest ACPI tables?


Do you need to worry about Microsoft guests using Modern Standby too or is that out of the scope of your problem set? I think you'll be a lot more limited in how this can behave and where you can modify things if so.