Re: [PATCH v2] x86/kexec: Add EFI config table identity mapping for kexec kernel

From: Tao Liu
Date: Thu Jul 06 2023 - 23:40:35 EST


Hi Borislav,

Thanks for the patch review!

On Thu, Jul 6, 2023 at 1:34 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Thu, Jun 01, 2023 at 03:20:44PM +0800, Tao Liu wrote:
> > A kexec kernel bootup hang is observed on Intel Atom cpu due to unmapped
>
> s/cpu/CPU/g
>
> > EFI config table.
> >
> > Currently EFI system table is identity-mapped for the kexec kernel, but EFI
> > config table is not mapped explicitly:
>
> Why does the EFI config table *need* to be mapped explicitly?
>
> > commit 6bbeb276b71f ("x86/kexec: Add the EFI system tables and ACPI
> > tables to the ident map")
> >
> > Later in the following 2 commits, EFI config table will be accessed when
> > enabling sev at kernel startup.
>
> What does SEV have to do with an Intel problem?

To answer the two questions above, the call stack is as follows:

head_64.S:.Lon_kernel_cs (which runs before extract_kernel) -> sev_enable
-> snp_init -> find_cc_blob -> find_cc_blob_efi. With
CONFIG_AMD_MEM_ENCRYPT enabled, every CPU enters sev_enable() and goes
through these functions, regardless of vendor. I assume this is harmless
on an Intel CPU, which normally just exits when the conditions for
enabling SEV are not met. However, find_cc_blob_efi() iterates over the
EFI config table to check whether the vendor tables contain
EFI_CC_BLOB_GUID (the Confidential Computing blob).
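
To illustrate why the lookup touches the config table even on a CPU
without SEV, here is a minimal user-space sketch of a GUID search over a
config-table-like array. It is not the kernel's find_cc_blob_efi(); the
type and function names (guid_t as defined here, cfg_table_entry,
find_vendor_table) are hypothetical and only model the idea that every
loop iteration reads config table memory, so that memory has to be
mapped before the loop runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte GUID, standing in for the EFI GUID type. */
typedef struct {
	uint8_t b[16];
} guid_t;

/* Hypothetical model of one config table entry: GUID plus table address. */
struct cfg_table_entry {
	guid_t guid;
	uint64_t table;	/* physical address of the vendor table */
};

/*
 * Walk the array and return the vendor table address for 'wanted',
 * or 0 if no entry matches. Each tbl[i] access dereferences config
 * table memory; with an identity mapping and no fault handler, an
 * unmapped table would fault on the first read.
 */
static uint64_t find_vendor_table(const struct cfg_table_entry *tbl,
				  size_t nr, const guid_t *wanted)
{
	for (size_t i = 0; i < nr; i++) {
		if (memcmp(&tbl[i].guid, wanted, sizeof(*wanted)) == 0)
			return tbl[i].table;
	}
	return 0;
}
```

The search returning "not found" on an Intel machine is the harmless
case; the fault happens earlier, on the read of the first entry.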

>
> > This may result in a page fault due to EFI
> > config table's unmapped address. Since the page fault occurs at an early
> > stage, it is unrecoverable and kernel hangs.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > In addition, the issue doesn't appear on all systems, because the kexec
> > kernel uses Page Size Extension (PSE) for identity mapping. In most cases,
> > EFI config table can end up to be mapped into due to 1 GB page size.
> > However if nogbpages is set, or cpu doesn't support pdpe1gb feature
> > (e.g Intel Atom x6425RE cpu), EFI config table may not be mapped into
> > due to 2 MB page size, thus a page fault hang is more likely to happen.
>
> This doesn't answer my question above.

Currently the EFI config table is not explicitly identity-mapped for the
2nd kernel. In a few "unlucky" cases, the 2nd kernel takes an
unrecoverable page fault in find_cc_blob_efi(). In most cases it is
"lucky" and has no problem, because with PSE and pdpe1gb the config table
ends up mapped as a side effect of identity-mapping something else.
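
The "lucky" versus "unlucky" cases can be sketched with a small
user-space model. It assumes that identity-mapping a range with large
pages maps every large page the range touches; the helper names and the
addresses in the usage note are made up for illustration and are not
taken from the affected machine.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* Round addr down to a multiple of page_size (must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}

/*
 * Return true if 'target' falls inside one of the large pages that get
 * mapped when [mstart, mend) is identity-mapped with 'page_size' pages.
 * Assumes mend > mstart.
 */
static bool covered_by_side_effect(uint64_t mstart, uint64_t mend,
				   uint64_t page_size, uint64_t target)
{
	uint64_t first = align_down(mstart, page_size);
	uint64_t last = align_down(mend - 1, page_size) + page_size;

	return target >= first && target < last;
}
```

For example, with a system table at the made-up address 0x7f000000 and a
config table at 0x7e5f0000, a 1 GB page covers both, while a 2 MB page
around the system table does not reach the config table.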

>
> > This patch will make sure the EFI config table is always mapped.
>
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
>
> Also, do
>
> $ git grep 'This patch' Documentation/process
>
> for more details.

Thanks, I will improve it in v3.

>
>
> >
> > Signed-off-by: Tao Liu <ltao@xxxxxxxxxx>
> > ---
> > Changes in v2:
> > - Rephrase the change log based on Baoquan's suggestion.
> > - Rename map_efi_sys_cfg_tab() to map_efi_tables().
> > - Link to v1: https://lore.kernel.org/kexec/20230525094914.23420-1-ltao@xxxxxxxxxx/
> > ---
> > arch/x86/kernel/machine_kexec_64.c | 35 ++++++++++++++++++++++++++----
> > 1 file changed, 31 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> > index 1a3e2c05a8a5..664aefa6e896 100644
> > --- a/arch/x86/kernel/machine_kexec_64.c
> > +++ b/arch/x86/kernel/machine_kexec_64.c
> > @@ -28,6 +28,7 @@
> > #include <asm/setup.h>
> > #include <asm/set_memory.h>
> > #include <asm/cpu.h>
> > +#include <asm/efi.h>
> >
> > #ifdef CONFIG_ACPI
> > /*
> > @@ -86,10 +87,12 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
> > #endif
> >
> > static int
> > -map_efi_systab(struct x86_mapping_info *info, pgd_t *level4p)
> > +map_efi_tables(struct x86_mapping_info *info, pgd_t *level4p)
> > {
> > #ifdef CONFIG_EFI
> > unsigned long mstart, mend;
> > + void *kaddr;
> > + int ret;
> >
> > if (!efi_enabled(EFI_BOOT))
> > return 0;
> > @@ -105,6 +108,30 @@ map_efi_systab(struct x86_mapping_info *info, pgd_t *level4p)
> > if (!mstart)
> > return 0;
> >
> > + ret = kernel_ident_mapping_init(info, level4p, mstart, mend);
> > + if (ret)
> > + return ret;
> > +
> > + kaddr = memremap(mstart, mend - mstart, MEMREMAP_WB);
> > + if (!kaddr) {
> > + pr_err("Could not map UEFI system table\n");
> > + return -ENOMEM;
> > + }
> > +
> > + mstart = efi_config_table;
>
> Yeah, about this, did you see efi_reuse_config() and the comment above
> it especially?
>
> Or is it that the EFI in that box wants the config table mapped 1:1 and
> accesses it during boot/kexec?

The call stack shows that the page fault occurs before the 2nd kernel is
extracted, which is earlier than anything efi_reuse_config() does during
kernel init.

Thanks,
Tao Liu
>
> In any case, this is all cloudy without a proper root cause.
>
> Also, I'd like for Ard to have a look at this too.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>