Re: Avoiding OOM on overcommit...?

From: Jesse Pollard (pollard@tomcat.admin.navo.hpc.mil)
Date: Tue Mar 21 2000 - 14:05:15 EST


James Sutherland <jas88@cam.ac.uk>:
> On Mon, 20 Mar 2000 22:02:33 -0600, you wrote:
>
> >On Mon, 20 Mar 2000, David Whysong wrote:
> >>On Mon, 20 Mar 2000, Jesse Pollard wrote:
> >>
> >>>Without overcommit:
> >>> You can support 100 simultaneous connections, with full saturation of
> >>> each server, with no failures.
> >>>
> >>> Result: happy customers, happy management, maybe even a raise.
> >>>
> >>>If you overcommit:
> >>> You might support 100 simultaneous connections, but not full saturation
> >>> of each server, without aborting some connections or crashing the
> >>> server/system.
> >>>
> >>> Result: some unhappy customers, not so happy management, difficulty
> >>> in identifying problems, and definitely no raise.
> >>>
> >>>So you used 2G byte of swap - I know where you can buy 18GB of swap for about
> >>>$300 US...
> >>
> >>That's very misleading. In fact if you give the overcommitted system the
> >>same amount of VM, it will work just fine.
> >>
> >>In other words, turning off overcommit isn't what saves you. You added
> >>more memory!
> >
> >I guaranteed that the memory allocated could be used. I didn't just add
> >more memory. Just adding more memory will still allow the system to fail,
> >it may take longer, it may not happen as often. But it can still happen.
>
> In EVERY case where your non-overcommit system operates correctly, my
> overcommitted machine does too.

True.

> Under heavier load, there is a point where my system continues to
> operate correctly but yours fails with OOM errors.

NOPE. None. Under heavier load, the resource quotas prevent a process from
exceeding its permitted limit. The system does not fail.

> Finally, we reach a heavy overload situation, where both systems fail
> with OOM errors.

NOPE again. The only time both will fail is if both are overcommitted.
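
The three load levels argued over above can be put into a toy model. This is
only a sketch under stated assumptions, not a description of what the kernel
does: the Job type, the function names, and the megabyte figures are all
hypothetical, and the overcommit case simply kills whichever job touches
memory that is not there, which is simpler than any real victim selection.
What it shows is where each policy reports the problem: strict accounting
refuses the request at allocation time, while overcommit grants everything
and fails only when the pages are actually used.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    requested: int  # MB asked for at allocation time (hypothetical figure)
    touched: int    # MB the job actually writes to later


def run_strict(jobs, total_vm):
    """Strict accounting: reserve the full request up front or refuse it."""
    committed = 0
    refused = []
    for job in jobs:
        if committed + job.requested > total_vm:
            # The allocation call itself fails; the job sees an error.
            refused.append(job.name)
        else:
            committed += job.requested
    return {"refused": refused, "killed": []}


def run_overcommit(jobs, total_vm):
    """Overcommit: grant every request; fail only when pages are touched."""
    used = 0
    killed = []
    for job in jobs:
        if used + job.touched > total_vm:
            # Nothing backs the promise. Toy rule: the toucher is killed.
            killed.append(job.name)
        else:
            used += job.touched
    return {"refused": [], "killed": killed}


TOTAL_VM = 1000  # MB of RAM plus swap, an assumed figure

light = [Job("a", 400, 400), Job("b", 400, 100)]
heavier = [Job("a", 600, 200), Job("b", 600, 200)]
overload = [Job("a", 600, 600), Job("b", 600, 600)]

for label, jobs in (("light", light), ("heavier", heavier), ("overload", overload)):
    print(label, run_strict(jobs, TOTAL_VM), run_overcommit(jobs, TOTAL_VM))
```

In this model the "heavier" case is the one in dispute: the strict system
refuses job b at allocation time even though the memory would never have been
touched, while the overcommitted system runs both jobs. In the "overload"
case the strict system still answers with a refusal, and the overcommitted
system kills a running job.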

> Explain why you believe causing the second situation to fail
> unnecessarily is an improvement.

It is not unnecessary - the process has exceeded the limits allowed for that
user. Management has stated that that user is not allowed to interfere with
other users.
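
The per-user quotas asked for here are not something this thread says exists.
The nearest mechanism that does exist is the per-process limit set with
setrlimit(), so the sketch below uses that as a stand-in. It assumes Linux,
where RLIMIT_AS is enforced; the 1 GiB cap and the helper name
run_capped_child are illustrative choices. The point it shows is the "control
point": the process that exceeds its limit gets an allocation error, and
nothing else on the machine is touched.

```python
import subprocess
import sys

# Code run in a child process so the cap never applies to the caller.
CHILD = r"""
import resource

limit = 1 << 30  # 1 GiB address-space cap (illustrative value)
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)  # cannot raise the soft limit above the hard one
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

try:
    block = bytearray(2 * limit)  # ask for twice the cap
    print("granted")
except MemoryError:
    # The request is refused inside this process only.
    print("denied")
"""


def run_capped_child():
    """Run the capped allocation in a child and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_capped_child())
```

A per-process cap is weaker than a per-user quota, since a user can start
many processes, but the failure mode is the same kind: a refused request
rather than a killed bystander.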

> >The quotas prevent someone from exceeding what is really allotted to the
> >job. If the job exceeds that quota then something is wrong, it gives a
> >control point, AND it prevents the "something wrong" from affecting the
> >rest of the system.
>
> Yes, we KNOW you want user resource quotas. You can't have them yet,
> so stop talking about them - they are irrelevant here.

Why "you can't have them yet"? Does that mean it IS planned to put resource
quotas in? This is the first I've ever heard that it was planned.

And I don't think they are irrelevant. If this is not the location to
discuss kernel features, what is?
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.
