Re: 4.4.-rc5: lguest causes ugly warn on: 5 W+X pages found

From: Rusty Russell
Date: Tue Dec 15 2015 - 21:25:30 EST


Pavel Machek <pavel@xxxxxx> writes:
> Hi!
>
>> > or similar?
>> >
>> > The above is entirely untested. Maybe it doesn't compile. Or
>> > boot. Or work.
>>
>> Well, with two extra spaces at each line, it does not apply :-).
>>
>> I applied it by hand, and the output is:
>>
>> [ 0.000000] MTRR variable ranges enabled:
> ...> [ 0.000000] BRK [0x0566c000, 0x0566cfff] PGTABLE
>>
>> I'll take a look if I can figure out what it means...
>
> Wait, there's more in the log.
>
> [ 1.952146] Bluetooth: HCI UART protocol H4 registered
> [ 1.954335] Bluetooth: HCI UART protocol BCSP registered
> [ 1.956750] usbcore: registered new interface driver btusb
> [ 1.958953] ------------[ cut here ]------------
> [ 1.961149] WARNING: CPU: 1 PID: 1 at
> ./arch/x86/include/asm/pgtable.h:357
> vmap_page_range_noflush+0x1f0/0x280()
> [ 1.963511] Modules linked in:
> [ 1.965849] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W
> 4.4.0-rc5+ #137
> [ 1.968230] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW
> (2.19 ) 03/31/2011
> [ 1.970593] 00000001 00000000 f5cffe64 c42baaf8 00000000 f5cffe80
> c404066b 00000165
> [ 1.973103] c40fbe70 00000163 00000000 00000000 f5cffe90 c404070f
> 00000009 00000000
> [ 1.975670] f5cffee0 c40fbe70 c4f88348 00000000 ffe6dfff ffe6e000
> c4f8a018 ffe6dfff
> [ 1.978304] Call Trace:
> [ 1.980882] [<c42baaf8>] dump_stack+0x41/0x59
> [ 1.983464] [<c404066b>] warn_slowpath_common+0x6b/0xa0
> [ 1.986053] [<c40fbe70>] ? vmap_page_range_noflush+0x1f0/0x280
> [ 1.988625] [<c404070f>] warn_slowpath_null+0xf/0x20
> [ 1.991154] [<c40fbe70>] vmap_page_range_noflush+0x1f0/0x280
> [ 1.993676] [<c40fbf2b>] map_vm_area+0x2b/0x40
> [ 1.996153] [<c4f2c795>] init+0xf8/0x1a4
> [ 1.998591] [<c4f2c69d>] ? edac_init+0x67/0x67
> [ 2.001014] [<c4000442>] do_one_initcall+0xc2/0x1c0
> [ 2.003391] [<c4f044e3>] ? initcall_blacklist+0x97/0x97
> [ 2.005815] [<c4f044e3>] ? initcall_blacklist+0x97/0x97
> [ 2.008161] [<c4051546>] ?
> __usermodehelper_set_disable_depth+0x36/0x40
> [ 2.010518] [<c407d4a6>] ? up_write+0x16/0x40
> [ 2.012817] [<c4f04ba3>] kernel_init_freeable+0xf0/0x16d
> [ 2.015078] [<c4f04ba3>] ? kernel_init_freeable+0xf0/0x16d
> [ 2.017386] [<c4a4d9c8>] kernel_init+0x8/0xc0
> [ 2.019661] [<c4a54149>] ret_from_kernel_thread+0x21/0x38
> [ 2.021932] [<c4a4d9c0>] ? rest_init+0xa0/0xa0
> [ 2.024168] ---[ end trace e117245cd61feaf2 ]---
> [ 2.026383] lguest: mapped switcher at ffe69000
> [ 2.028958] sdhci: Secure Digital Host Controller Interface driver
>
> ...which I don't understand; did not we say warn on _once_?
> ... Um. But I think we have a winner: "lguest: mapped switcher at
> ffe69000".
>
> Rusty, does the switcher need to be W+X?
>
> And yes, I have lguest enabled, not sure why.

No. The layout is "<text page> <per-cpu-stack-pages>..." and I lazily
did that as a single
map_vm_area(switcher_vma, PAGE_KERNEL_EXEC, lg_switcher_pages);

This boots, does it solve the problem?

Thanks!
Rusty.

From: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Subject: lguest: map switcher text R/O.

Pavel noted that lguest maps the switcher code executable and
read-write. This is a bad idea for any kernel text, but particularly
for text mapped at a fixed address.

Create two vmas, one for the text (PAGE_KERNEL_RX) and another for the
stacks (PAGE_KERNEL). Use VM_NO_GUARD to map them adjacent (as
expected by the rest of the code).

Reported-by: Pavel Machek <pavel@xxxxxx>
Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>

diff --git a/arch/x86/include/asm/lguest.h b/arch/x86/include/asm/lguest.h
index 3bbc07a57a31..73d0c9b92087 100644
--- a/arch/x86/include/asm/lguest.h
+++ b/arch/x86/include/asm/lguest.h
@@ -12,7 +12,9 @@
#define GUEST_PL 1

/* Page for Switcher text itself, then two pages per cpu */
-#define TOTAL_SWITCHER_PAGES (1 + 2 * nr_cpu_ids)
+#define SWITCHER_TEXT_PAGES (1)
+#define SWITCHER_STACK_PAGES (2 * nr_cpu_ids)
+#define TOTAL_SWITCHER_PAGES (SWITCHER_TEXT_PAGES + SWITCHER_STACK_PAGES)

/* Where we map the Switcher, in both Host and Guest. */
extern unsigned long switcher_addr;
diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
index 312ffd3d0017..021915baef35 100644
--- a/drivers/lguest/core.c
+++ b/drivers/lguest/core.c
@@ -22,7 +22,8 @@

unsigned long switcher_addr;
struct page **lg_switcher_pages;
-static struct vm_struct *switcher_vma;
+static struct vm_struct *switcher_text_vma;
+static struct vm_struct *switcher_stacks_vma;

/* This One Big lock protects all inter-guest data structures. */
DEFINE_MUTEX(lguest_lock);
@@ -83,54 +84,80 @@ static __init int map_switcher(void)
}

/*
+ * Copy in the compiled-in Switcher code (from x86/switcher_32.S).
+ * It goes in the first page, which we map in momentarily.
+ */
+ memcpy(kmap(lg_switcher_pages[0]), start_switcher_text,
+ end_switcher_text - start_switcher_text);
+ kunmap(lg_switcher_pages[0]);
+
+ /*
* We place the Switcher underneath the fixmap area, which is the
* highest virtual address we can get. This is important, since we
* tell the Guest it can't access this memory, so we want its ceiling
* as high as possible.
*/
- switcher_addr = FIXADDR_START - (TOTAL_SWITCHER_PAGES+1)*PAGE_SIZE;
+ switcher_addr = FIXADDR_START - TOTAL_SWITCHER_PAGES*PAGE_SIZE;

/*
- * Now we reserve the "virtual memory area" we want. We might
- * not get it in theory, but in practice it's worked so far.
- * The end address needs +1 because __get_vm_area allocates an
- * extra guard page, so we need space for that.
+ * Now we reserve the "virtual memory area"s we want. We might
+ * not get them in theory, but in practice it's worked so far.
+ *
+ * We want the switcher text to be read-only and executable, and
+ * the stacks to be read-write and non-executable.
*/
- switcher_vma = __get_vm_area(TOTAL_SWITCHER_PAGES * PAGE_SIZE,
- VM_ALLOC, switcher_addr, switcher_addr
- + (TOTAL_SWITCHER_PAGES+1) * PAGE_SIZE);
- if (!switcher_vma) {
+ switcher_text_vma = __get_vm_area(PAGE_SIZE, VM_ALLOC|VM_NO_GUARD,
+ switcher_addr,
+ switcher_addr + PAGE_SIZE);
+
+ if (!switcher_text_vma) {
err = -ENOMEM;
printk("lguest: could not map switcher pages high\n");
goto free_pages;
}

+ switcher_stacks_vma = __get_vm_area(SWITCHER_STACK_PAGES * PAGE_SIZE,
+ VM_ALLOC|VM_NO_GUARD,
+ switcher_addr + PAGE_SIZE,
+ switcher_addr + TOTAL_SWITCHER_PAGES * PAGE_SIZE);
+ if (!switcher_stacks_vma) {
+ err = -ENOMEM;
+ printk("lguest: could not map switcher pages high\n");
+ goto free_text_vma;
+ }
+
/*
* This code actually sets up the pages we've allocated to appear at
* switcher_addr. map_vm_area() takes the vma we allocated above, the
- * kind of pages we're mapping (kernel pages), and a pointer to our
- * array of struct pages.
+ * kind of pages we're mapping (kernel text pages and kernel writable
+ * pages respectively), and a pointer to our array of struct pages.
*/
- err = map_vm_area(switcher_vma, PAGE_KERNEL_EXEC, lg_switcher_pages);
+ err = map_vm_area(switcher_text_vma, PAGE_KERNEL_RX, lg_switcher_pages);
+ if (err) {
+ printk("lguest: text map_vm_area failed: %i\n", err);
+ goto free_vmas;
+ }
+
+ err = map_vm_area(switcher_stacks_vma, PAGE_KERNEL,
+ lg_switcher_pages + SWITCHER_TEXT_PAGES);
if (err) {
- printk("lguest: map_vm_area failed: %i\n", err);
- goto free_vma;
+ printk("lguest: stacks map_vm_area failed: %i\n", err);
+ goto free_vmas;
}

/*
* Now the Switcher is mapped at the right address, we can't fail!
- * Copy in the compiled-in Switcher code (from x86/switcher_32.S).
*/
- memcpy(switcher_vma->addr, start_switcher_text,
- end_switcher_text - start_switcher_text);
-
printk(KERN_INFO "lguest: mapped switcher at %p\n",
- switcher_vma->addr);
+ switcher_text_vma->addr);
/* And we succeeded... */
return 0;

-free_vma:
- vunmap(switcher_vma->addr);
+free_vmas:
+ /* Undoes map_vm_area and __get_vm_area */
+ vunmap(switcher_stacks_vma->addr);
+free_text_vma:
+ vunmap(switcher_text_vma->addr);
free_pages:
i = TOTAL_SWITCHER_PAGES;
free_some_pages:
@@ -148,7 +175,8 @@ static void unmap_switcher(void)
unsigned int i;

/* vunmap() undoes *both* map_vm_area() and __get_vm_area(). */
- vunmap(switcher_vma->addr);
+ vunmap(switcher_text_vma->addr);
+ vunmap(switcher_stacks_vma->addr);
/* Now we just need to free the pages we copied the switcher into */
for (i = 0; i < TOTAL_SWITCHER_PAGES; i++)
__free_pages(lg_switcher_pages[i], 0);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/