[tip:x86/boot] x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* and EFI_LOADER_* from KASLR's choice

From: tip-bot for Naoya Horiguchi
Date: Thu Aug 31 2017 - 16:06:47 EST


Commit-ID: 0982adc746736a313dac9cb8cc936ca51ca3741a
Gitweb: http://git.kernel.org/tip/0982adc746736a313dac9cb8cc936ca51ca3741a
Author: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
AuthorDate: Mon, 28 Aug 2017 16:30:59 +0900
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 31 Aug 2017 12:00:35 +0200

x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* and EFI_LOADER_* from KASLR's choice

There's a potential bug in how we select the KASLR kernel address n
the early boot code.

The KASLR boot code currently chooses the kernel image's physical memory
location from E820_TYPE_RAM regions by walking over all e820 entries.

E820_TYPE_RAM includes EFI_BOOT_SERVICES_CODE and EFI_BOOT_SERVICES_DATA
as well, so those regions can end up hosting the kernel image. According to
the UEFI spec, all memory regions marked as EfiBootServicesCode and
EfiBootServicesData are available as free memory after the first call
to ExitBootServices(). I.e. so such regions should be usable for the
kernel, per spec.

In real life however, we have workarounds for broken x86 firmware,
where we keep such regions reserved until SetVirtualAddressMap() is done.

See the following code in should_map_region():

static bool should_map_region(efi_memory_desc_t *md)
{
...
/*
* Map boot services regions as a workaround for buggy
* firmware that accesses them even when they shouldn't.
*
* See efi_{reserve,free}_boot_services().
*/
if (md->type =3D=3D EFI_BOOT_SERVICES_CODE ||
md->type =3D=3D EFI_BOOT_SERVICES_DATA)
return false;

This workaround suppressed a boot crash, but potential issues still
remain because no one prevents the regions from overlapping with kernel
image by KASLR.

So let's make sure that EFI_BOOT_SERVICES_{CODE|DATA} regions are never
chosen as kernel memory for the workaround to work fine.

Furthermore, EFI_LOADER_{CODE|DATA} regions are also excluded because
they can be used after ExitBootServices() as defined in EFI spec.

As a result, we choose kernel address only from EFI_CONVENTIONAL_MEMORY
which is the only memory type we know to be safely free.

Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
Cc: Baoquan He <bhe@xxxxxxxxxx>
Cc: Junichi Nomura <j-nomura@xxxxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: fanc.fnst@xxxxxxxxxxxxxx
Cc: izumi.taku@xxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/20170828074444.GC23181@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
[ Rewrote/fixed/clarified the changelog and the in code comments. ]
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/boot/compressed/kaslr.c | 40 +++++++++++++++++++++++++++++++---------
1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 7de23bb..17818ba 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -597,19 +597,41 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
for (i = 0; i < nr_desc; i++) {
md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
if (md->attribute & EFI_MEMORY_MORE_RELIABLE) {
- region.start = md->phys_addr;
- region.size = md->num_pages << EFI_PAGE_SHIFT;
- process_mem_region(&region, minimum, image_size);
efi_mirror_found = true;
-
- if (slot_area_index == MAX_SLOT_AREA) {
- debug_putstr("Aborted EFI scan (slot_areas full)!\n");
- break;
- }
+ break;
}
}

- return efi_mirror_found;
+ for (i = 0; i < nr_desc; i++) {
+ md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
+
+ /*
+ * Here we are more conservative in picking free memory than
+ * the EFI spec allows:
+ *
+ * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also
+ * free memory and thus available to place the kernel image into,
+ * but in practice there's firmware where using that memory leads
+ * to crashes.
+ *
+ * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free.
+ */
+ if (md->type != EFI_CONVENTIONAL_MEMORY)
+ continue;
+
+ if (efi_mirror_found &&
+ !(md->attribute & EFI_MEMORY_MORE_RELIABLE))
+ continue;
+
+ region.start = md->phys_addr;
+ region.size = md->num_pages << EFI_PAGE_SHIFT;
+ process_mem_region(&region, minimum, image_size);
+ if (slot_area_index == MAX_SLOT_AREA) {
+ debug_putstr("Aborted EFI scan (slot_areas full)!\n");
+ break;
+ }
+ }
+ return true;
}
#else
static inline bool