Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory

From: Chao Fan
Date: Thu Jan 04 2018 - 21:58:58 EST


On Thu, Jan 04, 2018 at 06:30:57PM +0800, Baoquan He wrote:
>On 01/04/18 at 04:02pm, Chao Fan wrote:
>> In current code, kaslr may choose the memory region in movable
>> nodes to extract kernel, which will make the nodes can't be hot-removed.
>> To solve it, we can specify the memory region in immovable node.
>> Create immovable_mem to store the regions in immovable_mem, where should
>> be chosen by kaslr.
>>
>> Also change the "handle_mem_memmap" to "handle_mem_filter", since
>> it will not only handle memmap parameter now.
>> Since "immovable_mem=" only works with "movable_node", so "immovable_mem="
>> doesn't work alone. If specify "movable_node" without "immovable_mem=",
>> disable KASLR.
>>
>> Multiple regions can be specified, comma delimited.
>> Considering the usage of memory, only support for 4 regions.
>> 4 regions contains 2 nodes at least, enough for kernel to extract.
>>
>> Signed-off-by: Chao Fan <fanc.fnst@xxxxxxxxxxxxxx>
>
>Hi Chao,
>
>Thanks for your effort on this issue.
>
>Luiz told me they met a hugetlb issue when kaslr enabled on kvm guest.
>Please check the below bug information. There's only one available
>position which hugepage can use to allocate. In this case, if we have a
>generic parameter to tell kernel where we can randomize into, this
>hugepage issue can be solved. We can restrict kernel to randomize beyond
>[0x40000000, 0x7fffffff]. Not sure if your immovable_mem=nn[KMG]@ss[KMG]
>can be adjusted to do this. I am hesitating on whether we should change
>this or not.
>

Hi Baoquan, Luiz,

In my personal understanding, there is only one region,
[0x40000000, 0x7fffffff] suitable for the 1G page, so we should avoid
kaslr to choose this region, right?

If my understanding is right, I think it's more similar with mem_avoid.
Because we specify where KASLR *choose* in "immovable_mem=", we specify
where KASLR *avoid* in "mem_avoid".
So I wonder if it's OK to expand mem_avoid, and add a member like
MEM_AVOID_HUGEPAGE in "enum mem_avoid_index".
But there is a disadvantage, we can only specify the limited regions.

Luiz, I am not familiar with HUGE PAGE, I wonder how many 1G HUGE pages
does system need in general? We may need to limit it in 2, or 4.

Thanks,
Chao Fan

>Hi maintainers, Kees,
>
>1) Let's keep Chao's current code, just ask Luiz to use nokaslr to work
>around the hugepage allocation failure;
>2) Change immovable_mem=nn[KMG]@ss[KMG] to be a generic parameter,
>people can use it to restrict kernel to places they want.
>
>Which one is better? Or any other idea or suggestion?
>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>Luiz said on kvm guest, they add the following to the kernel command-line:
>default_hugepagesz=1G hugepagesz=1G hugepages=1
>
>Boot the guest and check number of 1GB pages reserved:
>
>grep HugePages_Total /proc/meminfo
>
>When booting with "nokaslr" HugePages_Total is always 1. When booting
>without "nokaslr" sometimes HugePages_Total is zero (that is, reserving
>the 1GB page fails).
>
>And 20 reboots, 6 failures to mount single 1G hugepage.
>
>I reproduced on kvm guest with Luiz's help, and found it's because there's
>only one available position for 1G hugepage allocation, [0x40000000, 0x7fffffff].
>That's why they saw 1/4 possibility of failure. If kernel randomized to
>[0x0, 0x3fffffff], [0x80000000, 0xbffdffff], or [0x100000000,
>0x13fffffff], hugepage can always succeed to allocate.
>
>dmesg output snippet of kvm guest:
>[ +0.000000] e820: BIOS-provided physical RAM map:
>[ +0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
>[ +0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
>[ +0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
>[ +0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
>[ +0.000000] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
>[ +0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
>[ +0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
>[ +0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
>
>Thanks
>Baoquan
>
>> ---
>> arch/x86/boot/compressed/kaslr.c | 112 +++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 109 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>> index 8199a6187251..60e5aa28b510 100644
>> --- a/arch/x86/boot/compressed/kaslr.c
>> +++ b/arch/x86/boot/compressed/kaslr.c
>> @@ -108,6 +108,19 @@ enum mem_avoid_index {
>>
>> static struct mem_vector mem_avoid[MEM_AVOID_MAX];
>>
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +/* Only supporting at most 4 immovable memory regions with kaslr */
>> +#define MAX_IMMOVABLE_MEM 4
>> +
>> +static bool lack_immovable_mem;
>> +
>> +/* Store the memory regions in immovable node */
>> +static struct mem_vector immovable_mem[MAX_IMMOVABLE_MEM];
>> +
>> +/* The immovable regions user specify, not more than 4 */
>> +static int num_immovable_region;
>> +#endif
>> +
>> static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>> {
>> /* Item one is entirely before item two. */
>> @@ -206,15 +219,97 @@ static void mem_avoid_memmap(char *str)
>> memmap_too_large = true;
>> }
>>
>> -static int handle_mem_memmap(void)
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static int parse_immovable_mem(char *p,
>> + unsigned long long *start,
>> + unsigned long long *size)
>> +{
>> + char *oldp;
>> +
>> + if (!p)
>> + return -EINVAL;
>> +
>> + oldp = p;
>> + *size = memparse(p, &p);
>> + if (p == oldp)
>> + return -EINVAL;
>> +
>> + switch (*p) {
>> + case '@':
>> + *start = memparse(p + 1, &p);
>> + return 0;
>> + default:
>> + /*
>> + * If w/o offset, only size specified, immovable_mem=nn[KMG]
>> + * has the same behaviour as immovable_mem=nn[KMG]@0. It means
>> + * the region starts from 0.
>> + */
>> + *start = 0;
>> + return 0;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +
>> +static void parse_immovable_mem_regions(char *str)
>> +{
>> + static int i;
>> +
>> + while (str && (i < MAX_IMMOVABLE_MEM)) {
>> + int rc;
>> + unsigned long long start, size;
>> + char *k = strchr(str, ',');
>> +
>> + if (k)
>> + *k++ = 0;
>> +
>> + rc = parse_immovable_mem(str, &start, &size);
>> + if (rc < 0)
>> + break;
>> + str = k;
>> +
>> + immovable_mem[i].start = start;
>> + immovable_mem[i].size = size;
>> + i++;
>> + }
>> + num_immovable_region = i;
>> +}
>> +#else
>> +static inline void parse_immovable_mem_regions(char *str)
>> +{
>> +}
>> +#endif
>> +
>> +static int handle_mem_filter(void)
>> {
>> char *args = (char *)get_cmd_line_ptr();
>> size_t len = strlen((char *)args);
>> + bool enable_movable_node = false;
>> char *tmp_cmdline;
>> char *param, *val;
>> u64 mem_size;
>>
>> - if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> + if (strstr(args, "movable_node")) {
>> + /*
>> + * Confirm "movable_node" specified, otherwise
>> + * "immovable_mem=" doesn't work.
>> + */
>> + enable_movable_node = true;
>> +
>> + /*
>> + * If only specify "movable_node" without "immovable_mem=",
>> + * disable KASLR.
>> + */
>> + if (!strstr(args, "immovable_mem=")) {
>> + lack_immovable_mem = true;
>> + return 0;
>> + }
>> + }
>> +#endif
>> +
>> + if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>> + !enable_movable_node)
>> return 0;
>>
>> tmp_cmdline = malloc(len + 1);
>> @@ -239,6 +334,9 @@ static int handle_mem_memmap(void)
>>
>> if (!strcmp(param, "memmap")) {
>> mem_avoid_memmap(val);
>> + } else if (!strcmp(param, "immovable_mem=") &&
>> + enable_movable_node) {
>> + parse_immovable_mem_regions(val);
>> } else if (!strcmp(param, "mem")) {
>> char *p = val;
>>
>> @@ -378,7 +476,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>> /* We don't need to set a mapping for setup_data. */
>>
>> /* Mark the memmap regions we need to avoid */
>> - handle_mem_memmap();
>> + handle_mem_filter();
>>
>> #ifdef CONFIG_X86_VERBOSE_BOOTUP
>> /* Make sure video RAM can be used. */
>> @@ -673,6 +771,14 @@ static unsigned long find_random_phys_addr(unsigned long minimum,
>> return 0;
>> }
>>
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> + /* Check if specify "movable_node" without "immovable_mem=". */
>> + if (lack_immovable_mem) {
>> + debug_putstr("Fail KASLR when movable_node specified without immovable_mem=.\n");
>> + return 0;
>> + }
>> +#endif
>> +
>> /* Make sure minimum is aligned. */
>> minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
>>
>> --
>> 2.14.3
>>
>>
>>
>
>