Re: Overcommitable memory??

From: Jesse Pollard (pollard@tomcat.admin.navo.hpc.mil)
Date: Wed Mar 22 2000 - 08:21:44 EST


James Sutherland <jas88@cam.ac.uk>
> On Tue, 21 Mar 2000 12:58:37 -0600 (CST), you wrote:
> >James Sutherland <jas88@cam.ac.uk>:
> >> On 20 Mar 2000 15:19:23 -0800, you wrote:
> >>
> >> >In article <linux.kernel.fr1ddskpd1mnfr9gvjmnm8op9237gq61pd@4ax.com>,
> >> >James Sutherland <jas88@cam.ac.uk> wrote:
> >> >
> >> >>Unfortunately, this would break a lot of code which would depend on
> >> >>the current (perfectly reasonable) implementation of malloc() and
> >> >>stack space - namely, memory is only allocated when you use it.
> >> >
> >> > No, it wouldn't -- that code come pre-broken for your sysadminning
> >> > dispair.
> >>
> >> You are free to rewrite it all to fit your own replacement API if you
> >> like - that's the bit I'd try to avoid, though. There is NOTHING
> >> "broken" about that - it's a very reasonable behaviour.
> >>
> >> >>If you really want your code to occupy unused space, just touch the
> >> >>space when you allocate it. End of problem.
> >> >
> >> > Unless, of course, you want to do something other than have some
> >> > random process die when you run out of memory.
> >>
> >> They are not random. IF your process is too big to fit in the machine
> >> at the time you run it, it gets killed. Nothing wrong with that.
> >
> >But at the time the process started it there was space. Some time later
> >your buggy process used up the rest and killed it.
>
> No. If my process comes along after yours, then grabs a big chunk of
> memory, MY buggy app gets killed, to protect yours.

It may not be a big chunk. It only needs to take more than what is
remaining, and puts the system in the OOM condition.

> If your app started, immediately grabbed a chunk of memory, then slept
> for a while, while mine comes along, does a load of calculations and
> then allocates a buffer to put the results in, your process does get
> killed to make room for mine. Legitimately so, IMO - it's MY process
> which is doing something useful in this case, normally.

It may not be sleeping. It all depends on the application.

My process may have allocated a large buffer, does a load of calucations
and tries to put the results in the buffer. Many languages do this - FORTRAN
allocates large buffers on open, but they aren't used immediately because
the data hasn't been either computed or needed to be read in. Other situations
occur too - a large buffer may be allocated on file open, short reads are used
for some data, then a large read. In the current situation, the large read
may abort with an OOM, because someone else happend to do a large read first.

> >> > given
> >> >
> >> > char *foo = malloc(GIGABYTE(1));
> >> >
> >> > it's a lot easier to check to see if that memory is there by doing
> >> >
> >> > if (foo == 0) {
> >> > /* our out of memory processing */
> >> > }
> >> >
> >> > than to do the suggested
> >> >
> >> > long q;
> >> >
> >> > for (q = 0; q < GIGABYTE(1); q += magic_number_to_dirty_pages)
> >> > foo[q] = 0;
> >> >
> >> > /* if we get here, the malloc worked. If we're really lucky,
> >> > enough of the system survived the memory allocation so that
> >> > we can continue. */
> >> >
> >> > or the slower
> >> >
> >> > memset(foo, 0, GIGABYTE(1));
> >> >
> >> > /* if we get here, the malloc worked. If we're really lucky,
> >> > enough of the system survived the memory allocation so that
> >> > we can continue. */
> >> >
> >> > methods for really and truly allocating memory.
> >>
> >> Question: WTF do you want to hog a huge block of memory you don't use?
> >> THAT, IMO, is a TRULY broken process. You are trying to write a memory
> >> leak!
> >
> >Lets just say that it was going to be a large simulation and it was
> >initializing the data arrays.
>
> If just initialising the data arrays puts the system short of memory,
> your process shouldn't be running at the moment anyway - your quota
> should prevent it anyway, if you had a quota set properly.

If we were working without overcommitment, yes. The example was using
the current situation, no quotas. The simulation was going to use
80% of the system your job ended up wanting 30%. The wrong job got killed.
It doesn't matter which one. The system could not control it. If the
simulation was the valid job, then yours should have been terminated.
If yours was valid, then then the simulation. Without quotas there is no
way to determine which.

> Don't start huge simulations when the system is busy - that's
> precisely when your process SHOULD be killed!

At the time it started the system wasn't busy.
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:36 EST