Re: [PATCH v14 21/23] x86/virt/tdx: Handle TDX interaction with ACPI S3 and deeper states

From: Rafael J. Wysocki
Date: Wed Oct 18 2023 - 06:16:21 EST


On Wed, Oct 18, 2023 at 5:22 AM Huang, Kai <kai.huang@xxxxxxxxx> wrote:
>
> Hi Rafael,
> Thanks for feedback!
> >
>
>
> > > @@ -1427,6 +1429,22 @@ static int __init tdx_init(void)
> > > return -ENODEV;
> > > }
> > >
> > > +#define HIBERNATION_MSG \
> > > + "Disable TDX due to hibernation is available. Use 'nohibernate'
> command line to disable hibernation."
> >
> > I'm not sure if this new symbol is really necessary.
> >
> > The message could be as simple as "Initialization failed: Hibernation
> > support is enabled" (assuming a properly defined pr_fmt()), because
> > that carries enough information about the reason for the failure IMO.
> >
> > How to address it can be documented elsewhere.
>
>
> The last patch of this series is the documentation patch to add TDX host. We
> can add a sentence to suggest the user to use 'nohibernate' kernel command line
> when one sees TDX gets disabled because of hibernation being available.
>
> But isn't better to just provide such information together in the dmesg so the
> user can immediately know how to resolve this issue?
>
> If user only sees "... failed: Hibernation support is enabled", then the user
> will need additional knowledge to know where to look for the solution first, and
> only after that, the user can know how to resolve this.

I would expect anyone interested in a given feature to get familiar
with its documentation in the first place. If they neglect to do that
and then find this message, it is absolutely fair to expect them to go
and look into the documentation after all.

> >
> > > + /*
> > > + * Note hibernation_available() can vary when it is called at
> > > + * runtime as it checks secretmem_active() and cxl_mem_active()
> > > + * which can both vary at runtime. But here at early_init() they
> > > + * both cannot return true, thus when hibernation_available()
> > > + * returns false here, hibernation is disabled by either
> > > + * 'nohibernate' or LOCKDOWN_HIBERNATION security lockdown,
> > > + * which are both permanent.
> > > + */
> >
> > IIUC, the role of the comment is to document the fact that it is OK to
> > use hiberation_available() here, because it cannot return "false"
> > intermittently at this point, so I would just say "At this point,
> > hibernation_available() indicates whether or not hibernation support
> > has been permanently disabled", without going into all of the details
> > (which are irrelevant IMO and may change in the future).
>
>
> Agreed. Will do. Thanks.
>
> >
> > > + if (hibernation_available()) {
> > > + pr_err("initialization failed: %s\n", HIBERNATION_MSG);
> > > + return -ENODEV;
> > > + }
> > > +
> > > err = register_memory_notifier(&tdx_memory_nb);
> > > if (err) {
> > > pr_err("initialization failed: register_memory_notifier()
> failed (%d)\n",
> > > @@ -1442,6 +1460,11 @@ static int __init tdx_init(void)
> > > return -ENODEV;
> > > }
> > >
> > > +#ifdef CONFIG_ACPI
> > > + pr_info("Disable ACPI S3 suspend. Turn off TDX in the BIOS to use
> ACPI S3.\n");
> > > + acpi_suspend_lowlevel = NULL;
> > > +#endif
> >
> > It would be somewhat nicer to have a helper for setting this pointer.
> >
>
>
> OK. Currently Xen PV dom0 also overrides the acpi_suspend_lowlevel.
>
> Do you want the helper introduced now together with this series, or it is
> acceptable to have a patch later after TDX gets merged to add a helper and
> change both Xen and TDX code to use the helper?
>
> Anyway, I suppose you mean we provide a helper in the ACPI code, and call that
> helper here in TDX code.

Yes, I do.

> Just in case you want the helper now, then I think it's better to have two
> patches to do below ?
>
> 1) A patch to introduce the helper, and change the Xen code to use it.
> 2) The current TDX patch here, but change to use the new helper to set the
> acpi_suspend_lowlevel

Yes, this makes sense to me.

> I made the incremental diff to cover above based on this patch (see below,
> compile tested only). And the TDX part change will be split out as mentioned
> above.
>
> Do you have any comments?
>
> diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
> index c8a7fc23f63c..e71bff60d647 100644
> --- a/arch/x86/include/asm/acpi.h
> +++ b/arch/x86/include/asm/acpi.h
> @@ -60,8 +60,10 @@ static inline void acpi_disable_pci(void)
> acpi_noirq_set();
> }
>
> -/* Low-level suspend routine. */
> -extern int (*acpi_suspend_lowlevel)(void);
> +typedef int (*acpi_suspend_lowlevel_t)(void);
> +
> +/* Set up low-level suspend routine. */
> +void acpi_set_suspend_lowlevel(acpi_suspend_lowlevel_t func);

I'm not sure about the typededf, but I have no strong opinion against it either.

>
> /* Physical address to resume after wakeup */
> unsigned long acpi_get_wakeup_address(void);
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 2a0ea38955df..95be371305c6 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -779,11 +779,17 @@ int (*__acpi_register_gsi)(struct device *dev, u32 gsi,
> void (*__acpi_unregister_gsi)(u32 gsi) = NULL;
>
> #ifdef CONFIG_ACPI_SLEEP
> -int (*acpi_suspend_lowlevel)(void) = x86_acpi_suspend_lowlevel;
> +static int (*acpi_suspend_lowlevel)(void) = x86_acpi_suspend_lowlevel;
> #else
> -int (*acpi_suspend_lowlevel)(void);
> +static int (*acpi_suspend_lowlevel)(void);

For the sake of consistency, either use the typedef here, or don't use
it at all.

> #endif
>
> +/* To override the default acpi_suspend_lowlevel */
> +void acpi_set_suspend_lowlevel(acpi_suspend_lowlevel_t func)
> +{
> + acpi_suspend_lowlevel = func;
> +}
> +
> /*
> * success: return IRQ number (>=0)
> * failure: return < 0
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index 38ec6815a42a..c8586bee4650 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1565,7 +1565,7 @@ static int __init tdx_init(void)
>
> #ifdef CONFIG_ACPI
> pr_info("Disable ACPI S3 suspend. Turn off TDX in the BIOS to use ACPI
> S3.\n");
> - acpi_suspend_lowlevel = NULL;
> + acpi_set_suspend_lowlevel(NULL);
> #endif
>
> /*
> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
> index b1e11863144d..81a1b6ee8fc2 100644
> --- a/include/xen/acpi.h
> +++ b/include/xen/acpi.h
> @@ -64,7 +64,7 @@ static inline void xen_acpi_sleep_register(void)
> acpi_os_set_prepare_extended_sleep(
> &xen_acpi_notify_hypervisor_extended_sleep);
>
> - acpi_suspend_lowlevel = xen_acpi_suspend_lowlevel;
> + acpi_set_suspend_lowlevel(xen_acpi_suspend_lowlevel);
> }
> }
> #else

Otherwise LGTM.