Re: [PATCH 12/16] UML - Memory hotplug

From: Andrew Morton
Date: Fri Mar 24 2006 - 17:41:03 EST


Jeff Dike <jdike@xxxxxxxxxxx> wrote:
>
> This adds hotplug memory support to UML. The mconsole syntax is
> config mem=[+-]n[KMG]
> In other words, add or subtract some number of kilobytes, megabytes, or
> gigabytes.
>
> Unplugged pages are allocated and then passed to madvise(MADV_REMOVE),
> which is a currently experimental madvise extension. These pages are
> tracked so they can be plugged back in later if the admin decides to give
> them back.
> The first page to be unplugged is used to keep track of about 4M of other
> pages. A list_head is the first thing on this page. The rest is filled
> with addresses of other unplugged pages. This first page is not madvised,
> obviously.
> When this page is filled, the next page is used in a similar way and linked
> onto a list with the first page. Etc.
> This whole process reverses when pages are plugged back in. When a tracking
> page no longer tracks any unplugged pages, then it is next in line for
> plugging, which is done by freeing pages back to the kernel.
>
> This patch also removes checking for /dev/anon on the host, which is obsoleted
> by MADV_REMOVE.
>
> ...
>
> +static unsigned long long unplugged_pages_count = 0;

The `= 0;' causes this to consume space in vmlinux's .data. If we put it
in bss and let crt0.o take care of zeroing it, we save a little disk space.
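To illustrate the point, here is a minimal userspace sketch (not the patch itself; the names are hypothetical). Both objects have static storage duration, so C guarantees they start at zero with or without the initializer. Whether the explicit `= 0` actually ends up in .data depends on the toolchain; some compilers place zero-initialized objects in .bss anyway.

```c
#include <assert.h>

/* Explicit zero initializer, as in the patch: may be emitted into .data. */
static unsigned long long count_explicit = 0;

/* No initializer: still guaranteed zero, and eligible for .bss. */
static unsigned long long count_implicit;

/* Hypothetical helper: both forms start out with the same value. */
static void check_counters(void)
{
	assert(count_explicit == 0);
	assert(count_implicit == 0);
}
```

Running `size` or `objdump -h` on the object file is one way to see which section each variable landed in.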


> + page = alloc_page(GFP_ATOMIC);

That's potentially quite a few atomically-allocated pages. I guess UML is
more resistant to oom than normal kernels (?) but it'd be nice to be able to
run page reclaim here.

> + char buf[sizeof("18446744073709551615\0")];

rofl. We really ought to have a #define for "this architecture's maximum
length of an asciified int/long/s32/s64". Generally people do
guess-and-giggle-plus-20%, or they just get it wrong.
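For what it's worth, sizeof("18446744073709551615\0") is 22: twenty digits, the explicit \0, and the implicit terminator. A rough sketch of the kind of #define meant here, in userspace C, might look like the following. DEC_BUF_SIZE and format_ull are hypothetical names, not existing kernel macros. The bound uses 302/1000 as a slight overestimate of log10(2), then adds one for the leading digit, one for a sign, and one for the NUL.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical macro: bytes needed to print any value of an integer
 * type in decimal. bits * 302 / 1000 + 1 bounds the digit count,
 * then one byte for a sign and one for the terminating NUL.
 */
#define DEC_BUF_SIZE(type) (sizeof(type) * CHAR_BIT * 302 / 1000 + 1 + 1 + 1)

/* Hypothetical helper: format v into buf, which must be large enough. */
static int format_ull(char *buf, size_t size, unsigned long long v)
{
	assert(size >= DEC_BUF_SIZE(unsigned long long));
	return snprintf(buf, size, "%llu", v);
}
```

With a 64-bit unsigned long long the macro also evaluates to 22, so `char buf[DEC_BUF_SIZE(unsigned long long)];` is the same size as the buffer in the patch.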

> +#ifndef MADV_REMOVE
> +#define MADV_REMOVE 0x5 /* remove these pages & resources */
> +#endif
> +
> +int os_drop_memory(void *addr, int length)
> +{
> +	int err;
> +
> +	err = madvise(addr, length, MADV_REMOVE);
> +	if(err < 0)
> +		err = -errno;
> +	return 0;
> +}

* NOTE: Currently, only shmfs/tmpfs is supported for this operation.
* Other filesystems return -ENOSYS.

Are you expecting that this memory is backed by tmpfs?
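Related to that: as quoted, os_drop_memory() computes err but then returns 0 unconditionally, so a caller could not tell if madvise() refused the request (for example on memory that is not tmpfs-backed). A minimal sketch of a variant that propagates the error, assuming that is what was intended, could look like this. drop_memory and check_error_is_reported are hypothetical names, and MADV_REMOVE is taken from the host's <sys/mman.h> rather than defined by hand.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Sketch only: returns 0 on success or a negative errno on failure. */
static int drop_memory(void *addr, size_t length)
{
	if (madvise(addr, length, MADV_REMOVE) < 0)
		return -errno;
	return 0;
}

/*
 * Usage check: Linux rejects a start address that is not page-aligned,
 * so this call has to fail, and the failure has to reach the caller.
 */
static void check_error_is_reported(void)
{
	assert(drop_memory((void *)1, 4096) < 0);
}
```

The caller can then decide whether to put the page back on the free list or keep tracking it when the host refuses to drop it.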
