Re: [Patch 8/8] kexec: allow to shrink reserved memory

From: Amerigo Wang
Date: Wed Aug 12 2009 - 23:31:19 EST


Eric W. Biederman wrote:
Amerigo Wang <amwang@xxxxxxxxxx> writes:

This patch implements shrinking the reserved memory for crash kernel,
if it is more than enough.

For example, if you have already reserved 128M, now you just want 100M,
you can do:

# echo $((100*1024*1024)) > /sys/kernel/kexec_crash_size

Getting closer (comments inline)

Semantically this patch is non-controversial and pretty
simple, but it still needs a fair amount of review. Can
you put this patch at the front of your patch set?


Sure, I will do that when I resend the series.

I am adding the mm people to Cc.
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1083,6 +1083,76 @@ void crash_kexec(struct pt_regs *regs)
}
}
+int kexec_crash_kernel_loaded(void)
+{
+ int ret;
+ if (!mutex_trylock(&kexec_mutex))
+ return 1;

We don't need trylock on this code path

OK.

+ ret = kexec_crash_image != NULL;
+ mutex_unlock(&kexec_mutex);
+ return ret;
+}
+
+size_t get_crash_memory_size(void)
+{
+ size_t size;
+ if (!mutex_trylock(&kexec_mutex))
+ return 1;

We don't need trylock on this code path


Hmm, crashk_res is a global struct, so other code could also
change it... but currently nothing else actually does that, right?

+ size = crashk_res.end - crashk_res.start + 1;
+ mutex_unlock(&kexec_mutex);
+ return size;
+}
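
If the trylock really isn't needed here, I guess this one just becomes something like the following (untested sketch, assuming a plain mutex_lock is acceptable in this context):

size_t get_crash_memory_size(void)
{
	size_t size;

	mutex_lock(&kexec_mutex);
	size = crashk_res.end - crashk_res.start + 1;
	mutex_unlock(&kexec_mutex);
	return size;
}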
+
+int shrink_crash_memory(unsigned long new_size)
+{
+ struct page **pages;
+ int ret = 0;
+ int npages, i;
+ unsigned long addr;
+ unsigned long start, end;
+ void *vaddr;
+
+ if (!mutex_trylock(&kexec_mutex))
+ return -EBUSY;

We don't need trylock on this code path

We are missing the check to see if the crash kernel is loaded
under this instance of the lock. So please move the kexec_crash_image != NULL
test inline here and kill the kexec_crash_kernel_loaded function.

Ok, no problem.
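
Concretely, I guess the top of shrink_crash_memory() would then start like this instead of the trylock (just a sketch, and -EBUSY is only a guess at the right errno):

	mutex_lock(&kexec_mutex);

	if (kexec_crash_image) {
		/* a loaded crash kernel still owns the reserved memory */
		ret = -EBUSY;
		goto unlock;
	}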

+ start = crashk_res.start;
+ end = crashk_res.end;
+
+ if (new_size >= end - start + 1) {
+ ret = -EINVAL;
+ if (new_size == end - start + 1)
+ ret = 0;
+ goto unlock;
+ }
+
+ start = roundup(start, PAGE_SIZE);
+ end = roundup(start + new_size, PAGE_SIZE) - 1;
+ npages = (end + 1 - start ) / PAGE_SIZE;
+
+ pages = kmalloc(sizeof(struct page *) * npages, GFP_KERNEL);
+ if (!pages) {
+ ret = -ENOMEM;
+ goto unlock;
+ }
+ for (i = 0; i < npages; i++) {
+ addr = end + 1 + i * PAGE_SIZE;
+ pages[i] = virt_to_page(addr);
+ }
+
+ vaddr = vm_map_ram(pages, npages, 0, PAGE_KERNEL);

This is the wrong kernel call to use. I expect this needs to look
like a memory hotplug event. This does not put the pages into the
free page pool.

Well, I also wanted to use a memory-hotplug API, but that would make the code depend on memory hotplug, which is certainly not what we want...

I checked the mm code; what I actually need is an API similar to add_active_range(), but add_active_range() can't be used here since it is marked __init.

Do we have that kind of API in mm? I can't find one.
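
The closest thing I can come up with is open-coding it and handing the pages back to the allocator one by one, basically what free_initmem() does for the init sections, if I read it right. Something like this (untested sketch; free_crash_pages() is just a made-up name, and it assumes start/end are page-aligned physical addresses that were reserved at boot):

static void free_crash_pages(unsigned long start, unsigned long end)
{
	unsigned long addr;

	for (addr = start; addr < end; addr += PAGE_SIZE) {
		struct page *page = pfn_to_page(addr >> PAGE_SHIFT);

		/* the region was marked reserved at boot, undo that */
		ClearPageReserved(page);
		init_page_count(page);
		free_page((unsigned long)__va(addr));
		totalram_pages++;	/* accounting is a guess on my part */
	}
}

But I am not sure the mm people would be happy with kexec open-coding this, which is why I was hoping an exported helper already exists.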

Thanks!

