Re: Overcommitable memory??

From: Jesse Pollard (pollard@tomcat.admin.navo.hpc.mil)
Date: Fri Mar 17 2000 - 11:52:55 EST


"Alan Curry" <pacman-kernel@cqc.com>:
> James Sutherland writes the following:
> >
> >On 15 Mar 2000, Rask Ingemann Lambertsen wrote:
> >> Not at all. COW is a performance optimisation which does not depend on
> >> overcommitment of memory in any way. Why would you want to turn it off?
> >
> >Because it *IS* overcommitment of memory. You can have two processes, each
> >with their 200Mb of data, in a machine with 256Mb RAM+swap, quite happily
> >- until they start writing to it, at which point you discover you have
> >overcommitted your memory, and things go wrong.
>
> Just because you can describe an example scenario in which COW and
> overcommit are both used, does not mean that they are inseparable. You can do
> COW by simply *reserving* RAM or swap space at fork() time and copying data
> into it later. That is COW without overcommit.
>
> Unfortunately nobody with the necessary skills seems interested in
> implementing it that way.

Doing just this forces the administrator to give the system a very large
amount of swap space. Currently, there is no way to tell fork() not
to make such reservations (but only sometimes...). If the desired sequence
is fork()/exec(), then the fork() doesn't have to reserve anything more than
some stack space (and only one or two pages at that). Anything else
causes/permits the OOM condition.
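
To put numbers on that cost, here is a minimal Python sketch of strict
reservation at fork() time, using the 200Mb / 256Mb figures quoted above.
The class and function names are made up for illustration and all sizes are
in Mb; it is a toy accounting model, not how any kernel does it.

```python
class StrictAccounting:
    """Toy model: reserve the full worst case at fork() (COW, no overcommit)."""

    def __init__(self, total_mb):
        self.total_mb = total_mb   # RAM + swap available for reservations
        self.reserved_mb = 0       # sum of all reservations made so far

    def reserve(self, mb):
        """Record the reservation and return True if it fits, else False."""
        if self.reserved_mb + mb > self.total_mb:
            return False
        self.reserved_mb += mb
        return True

    def fork(self, parent_writable_mb):
        # The child might eventually write to every writable page of the
        # parent, so the worst case is a full second copy.
        return self.reserve(parent_writable_mb)


def extra_backing_needed(total_mb, process_mb, n_children):
    """Extra RAM+swap (Mb) needed so that n forks of one process all succeed."""
    needed = process_mb * (1 + n_children)
    return max(0, needed - total_mb)
```

With 256Mb of RAM+swap, a 200Mb process can be started but its fork() is
refused, and extra_backing_needed(256, 200, 1) says 144Mb more would have to
be configured just so one fork()/exec() can go through.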

I wonder if it could be coded as
    fork() --- reserve one or two pages for anticipated fork.
    on next page fault or syscall --
        If page fault or non-exec syscall,
            reserve the entire worst case memory amount.
        If syscall is exec then
            allocate/reserve memory for the new image.

The pagefault must be a COW page that is not one of the already reserved
stack pages for the fork...
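
As a rough illustration of the sequence above, here is a small Python state
machine. Everything in it (the Child class, the method names, STACK_PAGES = 2)
is hypothetical; it only tracks how many pages would be reserved and does not
check whether the reservation can actually be satisfied.

```python
STACK_PAGES = 2  # assumed "one or two pages" reserved at fork() time


class Child:
    """Toy model of a freshly forked child with a deferred reservation."""

    def __init__(self, worst_case_pages):
        self.worst_case_pages = worst_case_pages  # full size of the parent
        self.reserved_pages = STACK_PAGES         # small reservation at fork()
        self.decided = False                      # full decision made yet?

    def on_exec(self, image_pages):
        # exec came first: reserve for the new image instead of the old one.
        if not self.decided:
            self.reserved_pages = image_pages
            self.decided = True

    def on_other_syscall(self):
        # Any non-exec syscall means the child intends to keep running.
        self._reserve_worst_case()

    def on_cow_fault(self, is_reserved_stack_page):
        # Faults on the already reserved stack pages do not count.
        if not is_reserved_stack_page:
            self._reserve_worst_case()

    def _reserve_worst_case(self):
        if not self.decided:
            self.reserved_pages = self.worst_case_pages
            self.decided = True
```

A child that only touches its stack and then calls exec never reserves more
than STACK_PAGES plus the new image; any other first action triggers the full
worst case reservation.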

An alternative would be to reserve (say) 10 pages. Any time the new process
exceeds this reserved amount (via COW), the entire process size must be reserved.
If this exceeds the user's resource limit, then the process is aborted and
the parent process receives the child termination signal for OOM (either
a per-user resource limit signal or a system signal).
If an exec system call is called first, then the new process starts with
the reserve allocation for the new image.
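
The alternative can be sketched the same way. This is a Python toy, not kernel
code: ForkedChild, OutOfMemory and the single user_limit_pages number are
assumptions made for the example, and the exception merely stands in for the
child being aborted and the parent being signalled.

```python
INITIAL_RESERVE_PAGES = 10  # the "(say) 10 pages" reserved at fork()


class OutOfMemory(Exception):
    """Stands in for aborting the child and signalling the parent."""


class ForkedChild:
    def __init__(self, process_pages, user_limit_pages):
        self.process_pages = process_pages        # full size of the process
        self.user_limit_pages = user_limit_pages  # simplified per-user limit
        self.reserved_pages = INITIAL_RESERVE_PAGES
        self.cow_copies = 0                       # pages copied so far via COW

    def cow_fault(self):
        """Account for one COW copy, upgrading the reservation if needed."""
        self.cow_copies += 1
        if self.cow_copies > self.reserved_pages:
            # The small reserve is used up: reserve the whole process or fail.
            if self.process_pages > self.user_limit_pages:
                raise OutOfMemory("reservation exceeds the user's limit")
            self.reserved_pages = self.process_pages

    def do_exec(self, image_pages):
        """exec replaces the address space: start over with the new image."""
        self.cow_copies = 0
        self.reserved_pages = image_pages
```

The first ten COW copies are covered by the small reserve; the eleventh either
upgrades the reservation to the full process size or aborts the child.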

Memory allocation (via sbrk or whatever) would always reserve the additional
memory allocated (or free reserved memory if deallocating, even though no one
does that).
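
For completeness, the allocation rule might look like this in the same toy
Python model. The Reservation class and its limit_pages field are invented for
the example; returning False here plays the role of a real sbrk() failing with
ENOMEM.

```python
class Reservation:
    """Toy model: the reservation always tracks the size of the heap."""

    def __init__(self, reserved_pages, limit_pages):
        self.reserved_pages = reserved_pages
        self.limit_pages = limit_pages  # assumed cap on what may be reserved

    def sbrk(self, delta_pages):
        """Grow (delta > 0) or shrink (delta < 0) the reservation.

        Returns True on success and False if growth would exceed the limit.
        """
        new_total = self.reserved_pages + delta_pages
        if new_total < 0:
            raise ValueError("cannot release more than is reserved")
        if delta_pages > 0 and new_total > self.limit_pages:
            return False  # refuse up front instead of overcommitting
        self.reserved_pages = new_total
        return True
```

The point is that a refused allocation is reported at the call that asked for
the memory, rather than later when a page is first touched.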

On second thought - I like the alternative better:
a. It requires no coding change in applications.
b. It should be a relatively straightforward implementation.
c. It is transparent to users, except when they run out of virtual memory.
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.



