Re: Overcommittable memory

From: James Sutherland (jas88@cam.ac.uk)
Date: Mon Mar 20 2000 - 15:04:35 EST


On Mon, 20 Mar 2000 19:04:06 +0000, you wrote:
>James Sutherland wrote:
>> On Sun, 19 Mar 2000 14:39:31 -0600, you wrote:
>> >On Sun, 19 Mar 2000, James Sutherland wrote:
>> >>On Sat, 18 Mar 2000 12:45:35 -0600, you wrote:
>> >>>On Fri, Mar 17, 2000 at 10:10:23AM +0000, James Sutherland wrote:
>> >>>> Yes, you COULD have COW without overcommitting - but you still lose one of
>> >>>> the major benefits of COW, namely huge savings on VM usage. If I fork()
>> >>>> 100 Apache processes of 20Mb each, I need perhaps 30Mb of VM total.
>> >>>> WITHOUT overcommitted COW, I end up needing 2Gb of swap space - 1.98Gb of
>> >>>> which I will never use! This is certainly not an efficient use of swap
>> >>>> space, IMO...
>> >>>
>> >>>This is why god gave us segments. Can share the code; just have room for
>> >>>the data.
>> >>>
>> >>>Actually, I suppose it would be possible to know how much is code not
>> >>>likely to change (runtime loadable modules), and not have to commit for
>> >>>that.
>> >>
>> >>Or just commit based on the memory which is really being used by the
>> >>process, which is nice and simple, and hasn't caused any problems I
>> >>know of yet.
>> >>
>> >>It works - why change it?
>> >
>> >Because it doesn't work. Systems crash. Systems do reboot. Reliability
>> >is just not there for production server capability.
>>
>> When have you seen a Linux box crash simply because all the
>> application memory was in use? I have never seen this, and it
>> certainly shouldn't happen...
>
>> What is really happening is that a BUG in the kernel is
>> rebooting/crashing the system under particular circumstances. Fix that
>> bug, not the circumstances under which it shows up.
>
>I've seen this happen on several PCs, where all you do is malloc lots of
>memory and fill it. It'll hard lock after a while. Yes, it's a bug,
>but...

So fix the bug, don't change the system to avoid showing it!

>> Disabling overcommit does NOT prevent the system running out of
>> memory. It exacerbates it.
>
>This does not in any way follow.

Disabling overcommit just means your process grabs more memory
earlier. Instead of being given memory as it actually uses it, it is
given enough up front to back the entire address space it reserved.
No benefit there.

>Blindingly simple example app:
>
>1) syscall - sys_overcommit(off) == this process will not overcommit

No, that's NOT what that syscall does (despite the name...)

In fact, setting it >0 simply disables ALL sanity checking on malloc().
Ask for 2GB on a 4MB 386 and malloc() succeeds. A fairly specialist
setting, really.

>2) Allocate 512MB of memory with brk(). May fail if this can't be 100%
>guaranteed. No problem, the user just ran it all of a second ago.

This does NOT depend on (and indeed is not affected by) overcommit,
provided you touch the memory.

>3) mmap() a result file, say 1GB size. Map flags are MAP_SHARED,
>PROT_WRITE. May fail if the kernel cannot allocate structures. Ditto, no
>problem.
>
>4) Main loop, never allocate any memory during runtime. Use the 512MB as
>a heap. Or better yet, don't do any dynamic allocation. In fact, don't
>do any syscalls at all. Run here for a few weeks...

Just run for a few weeks without logging or journalling anything. Hrm.
Bright design. Suppose there is a power failure or system crash
during this time?

>5) Write results into mapped result file. msync()
>
>Would you not agree that once stage 4 is reached the program CANNOT
>POSSIBLY DIE for any reason whatsoever?

Nope. I just poured coffee over the CPU. Only Nortel and Tandem
hardware handles that gracefully :-)

> That is, so long as the 512MB
>requested by brk() are reserved. No copy, fill or faulting required.
>Overcommit apps can happily allocate loads of memory on top of this, but
>they'll just die when memory is depleted to what was reserved.

This is true with or without overcommit - they will just die
earlier/more often without it.

>It's the ability to request non-overcommit that's essential, and I don't
>see what you have against it. With the current scheme of things this is
>impossible.

No it isn't. Just touch the memory on allocation.

>(Yes, msync() can fail, but that's "easily" fixable with reserved pages)

All in all, a very sloppy design. If you look at sensible number
crunching apps (Distributed.net and SETI@home are common examples)
they checkpoint themselves frequently. Any serious app should do this;
if it doesn't, that's just poor programming on your part.
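
For illustration, a minimal sketch of what such a checkpoint might look
like on a POSIX system. checkpoint_save() is a made-up helper name, and
the write-to-temp-file, fsync(), rename() sequence is just one common
way of doing it, not something taken from either of the apps above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: save `len` bytes of state to `path`.
 * The data goes to "<path>.tmp" first and is renamed into place only
 * after it has been flushed, so a crash leaves the old checkpoint intact.
 * Returns 0 on success, -1 on failure. */
int checkpoint_save(const char *path, const void *state, size_t len)
{
    char tmp[4096];
    const char *p = state;
    int n, fd;

    assert(path != NULL && state != NULL);

    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;                      /* path too long for the buffer */

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w <= 0) {                   /* treat a short-circuit as failure */
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }

    if (fsync(fd) != 0) {               /* push the data to disk */
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    return rename(tmp, path);           /* replace the old checkpoint */
}
```

Called every so often from the main loop, something like this means a
power failure costs you the work since the last checkpoint rather than
the whole run.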

If you want non-overcommitted memory, just touch it on allocation.
That's it. No need to mangle the kernel or anything else.
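
A minimal sketch of touching memory on allocation, assuming a POSIX
system. alloc_touched() is a made-up name, not an existing library call,
and the 4096-byte fallback page size is a guess for when sysconf() fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate `size` bytes and write to every page,
 * so the kernel has to back the whole region with real memory now
 * rather than on first use later. Returns NULL if malloc() refuses. */
void *alloc_touched(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p;
    size_t off;

    assert(size > 0);
    if (page <= 0)
        page = 4096;            /* assumed fallback page size */

    p = malloc(size);
    if (p == NULL)
        return NULL;

    /* One write per page is enough to fault that page in. */
    for (off = 0; off < size; off += (size_t)page)
        p[off] = 0;
    p[size - 1] = 0;            /* the block may not be page aligned */

    return (void *)p;
}
```

If the memory isn't really there, the failure shows up during this loop
at startup, not weeks into the run.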

James.
