RE: hv_hypercall_pg page permissions

From: Vitaly Kuznetsov
Date: Mon Jun 15 2020 - 04:35:29 EST


Dexuan Cui <decui@xxxxxxxxxxxxx> writes:

>> From: linux-hyperv-owner@xxxxxxxxxxxxxxx
>> <linux-hyperv-owner@xxxxxxxxxxxxxxx> On Behalf Of Andy Lutomirski
>> Sent: Tuesday, April 7, 2020 2:01 PM
>> To: Christoph Hellwig <hch@xxxxxx>
>> Cc: vkuznets <vkuznets@xxxxxxxxxx>; x86@xxxxxxxxxx;
>> linux-hyperv@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan
>> <kys@xxxxxxxxxxxxx>; Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx>;
>> Andy Lutomirski <luto@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Subject: Re: hv_hypercall_pg page permissions
>>
>>
>> > On Apr 7, 2020, at 12:38 AM, Christoph Hellwig <hch@xxxxxx> wrote:
>> >
>> > On Tue, Apr 07, 2020 at 09:28:01AM +0200, Vitaly Kuznetsov wrote:
>> >> Christoph Hellwig <hch@xxxxxx> writes:
>> >>
>> >>> Hi all,
>> >>>
>> >>> The x86 Hyper-V hypercall page (hv_hypercall_pg) is the only allocation
>> >>> in the kernel using __vmalloc with executable permissions, and the
>> >>> only user of PAGE_KERNEL_RX. Is there any good reason it needs to
>> >>> be readable? Otherwise we could use vmalloc_exec and kill off
>> >>> PAGE_KERNEL_RX. Note that before 372b1e91343e6 ("drivers: hv: Turn off
>> >>> write permission on the hypercall page") it was even mapped writable..
>> >>
>> >> [There is nothing secret in the hypercall page, by reading it you can
>> >> figure out if you're running on Intel or AMD (VMCALL/VMMCALL) but it's
>> >> likely not the only possible way :-)]
>> >>
>> >> I see no reason for hv_hypercall_pg to remain readable. I just
>> >> smoke-tested
>> >
>> > Thanks, I have the same in my WIP tree, but just wanted to confirm this
>> > makes sense.
>>
>> Just to make sure we're all on the same page: x86 doesn't normally have an
>> execute-only mode. Executable memory in the kernel is readable unless you
>> are using fancy hypervisor-based XO support.
>
> Hi hch,
> The patch was merged into mainline recently, but unfortunately we noticed
> a warning with CONFIG_DEBUG_WX=y (this config appears to be enabled by
> default in most Linux distros, at least in Ubuntu 18.04's
> /boot/config-4.18.0-11-generic).
>
> Should we revert this patch, or figure out a way to ask the DEBUG_WX code to
> ignore this page?
>

Are you sure it is hv_hypercall_pg? AFAIU it shouldn't be W+X as we
are allocating it with vmalloc_exec(). In other words, if you revert
78bb17f76edc, does the issue go away?
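
For reference, a minimal sketch of what "W+X" means at the page-table
level, assuming only the architectural x86-64 PTE bits (bit 0 = present,
bit 1 = writable, bit 63 = no-execute). The helper name pte_is_wx() is
hypothetical; this is not the kernel's note_page() code, just an
illustration of the condition being reported. Note there is no separate
"readable" bit, which is why a present executable mapping is also readable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86-64 page-table entry bits. */
#define PTE_PRESENT (UINT64_C(1) << 0)   /* mapping is valid */
#define PTE_RW      (UINT64_C(1) << 1)   /* mapping is writable */
#define PTE_NX      (UINT64_C(1) << 63)  /* no-execute: clear means executable */

/*
 * Hypothetical helper (not a kernel function): returns true when a
 * present mapping is both writable and executable, i.e. RW is set
 * and NX is clear.
 */
static bool pte_is_wx(uint64_t pte)
{
	if (!(pte & PTE_PRESENT))
		return false;
	return (pte & PTE_RW) && !(pte & PTE_NX);
}
```

Under these assumptions, a page mapped read+execute (present, RW clear,
NX clear) would not trip such a check, while any mapping with RW set and
NX clear would.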

> [ 19.387536] debug: unmapping init [mem 0xffffffff82713000-0xffffffff82886fff]
> [ 19.431766] Write protecting the kernel read-only data: 18432k
> [ 19.438662] debug: unmapping init [mem 0xffffffff81c02000-0xffffffff81dfffff]
> [ 19.446830] debug: unmapping init [mem 0xffffffff821d6000-0xffffffff821fffff]
> [ 19.522368] ------------[ cut here ]------------
> [ 19.527495] x86/mm: Found insecure W+X mapping at address 0xffffc90000012000
> [ 19.535066] WARNING: CPU: 26 PID: 1 at arch/x86/mm/dump_pagetables.c:248 note_page+0x639/0x690
> [ 19.539038] Modules linked in:
> [ 19.539038] CPU: 26 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #1
> [ 19.539038] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
> [ 19.539038] RIP: 0010:note_page+0x639/0x690
> [ 19.539038] Code: fe ff ff 31 c0 e9 a0 fe ff ff 80 3d 39 d1 31 01 00 0f 85 76 fa ff ff 48 c7 c7 98 55 0a 82 c6 05 25 d1 31 01 01 e8 f7 c9 00 00 <0f> 0b e9 5c fa ff ff 48 83 c0 18 48 c7 45 68 00 00 00 00 48 89 45
> [ 19.539038] RSP: 0000:ffffc90003137cb0 EFLAGS: 00010282
> [ 19.539038] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
> [ 19.539038] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff810fa9c4
> [ 19.539038] RBP: ffffc90003137ea0 R08: 0000000000000000 R09: 0000000000000000
> [ 19.539038] R10: 0000000000000001 R11: 0000000000000000 R12: ffffc90000013000
> [ 19.539038] R13: 0000000000000000 R14: ffffc900001ff000 R15: 0000000000000000
> [ 19.539038] FS: 0000000000000000(0000) GS:ffff8884dad00000(0000) knlGS:0000000000000000
> [ 19.539038] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 19.539038] CR2: 0000000000000000 CR3: 0000000002210001 CR4: 00000000003606e0
> [ 19.539038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 19.539038] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 19.539038] Call Trace:
> [ 19.539038] ptdump_pte_entry+0x39/0x40
> [ 19.539038] __walk_page_range+0x5b7/0x960
> [ 19.539038] walk_page_range_novma+0x7e/0xd0
> [ 19.539038] ptdump_walk_pgd+0x53/0x90
> [ 19.539038] ptdump_walk_pgd_level_core+0xdf/0x110
> [ 19.539038] ? ptdump_walk_pgd_level_debugfs+0x40/0x40
> [ 19.539038] ? hugetlb_get_unmapped_area+0x2f0/0x2f0
> [ 19.703692] ? rest_init+0x24d/0x24d
> [ 19.703692] ? rest_init+0x24d/0x24d
> [ 19.703692] kernel_init+0x2c/0x113
> [ 19.703692] ret_from_fork+0x24/0x30
> [ 19.703692] irq event stamp: 2840666
> [ 19.703692] hardirqs last enabled at (2840665): [<ffffffff810fa9c4>] console_unlock+0x444/0x5b0
> [ 19.703692] hardirqs last disabled at (2840666): [<ffffffff81001ec9>] trace_hardirqs_off_thunk+0x1a/0x1c
> [ 19.703692] softirqs last enabled at (2840662): [<ffffffff81c00366>] __do_softirq+0x366/0x490
> [ 19.703692] softirqs last disabled at (2840655): [<ffffffff8107dba8>] irq_exit+0xe8/0x100
> [ 19.703692] ---[ end trace 99ca90806a8e657c ]---
> [ 19.786235] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> [ 19.793298] rodata_test: all tests were successful
> [ 19.798508] x86/mm: Checking user space page tables
> [ 19.818007] x86/mm: Checked W+X mappings: passed, no W+X pages found.
>
> Thanks,
> -- Dexuan

--
Vitaly