Re: Some questions about linux kernel.

From: James Sutherland (jas88@cam.ac.uk)
Date: Sun Mar 19 2000 - 13:24:47 EST


On Fri, 17 Mar 2000 12:43:26 -0600 (CST), you wrote:
>James Sutherland <jas88@cam.ac.uk>:
>> On Fri, 17 Mar 2000, Jesse Pollard wrote:
>> > Paul Jakma <paul.jakma@compaq.com>
>> > > On Thu, 16 Mar 2000, James Sutherland wrote:
>> > >
>> > > > No. *ANY* memory allocation system can run out of memory. Avoiding
>> > > > "overcommitting" would make the OOM situation arise SOONER (and more
>> > > > frequently), as well as killing performance.
>> > > >
>> > >
>> > > well, it's more a question of whether you make promises that you might
>> > > not be able to keep. If you do (i.e. overcommit) then it's your
>> > > (kernel) problem. If you don't, it's not.
>> > >
>> > > Without overcommit the /system/ can run out of memory, of course - it's
>> > > finite - but it's no longer a kernel problem.
>> > >
>> > > > Right... Now we'll try this on the university's central Unix system, shall
>> > > > we? Let's see... 6000 users, 2Gb RAM+swap. They get about 300K each.
>> > > > That's ALMOST enough to log in with!
>> > >
>> > > well then get more ram/swap. But at least it has become a hw issue.
>> >
>> > It's still not right - the assumption that you have 6000 concurrent users
>> > on a 2GB-based system is unreasonable. It is much more likely that there
>> > are only 40-50 concurrent users. If the limit is 40 then it becomes 2GB/40,
>> > or ~50MB per user, not 300K. After that what you have is a management decision
>> > on system expansion, AND the data to back that decision. I have worked in
>> > such an environment, and that's the way it is.
>>
>> I never said 6000 CONCURRENT users. BUT the system is perfectly happy
>> without any limits on the number of users. We could have 200 users using
>> an AVERAGE of 5Mb each quite happily - if they are all running the same
>> couple of programs (a typical workload) overcommit makes the difference
>> between needing 2Gb of swap and 20Gb, without *ANY* difference in
>> functionality.
>
>IF the limit established for 200 users is 5 MB then the system will NOT
>OOM. The additional load would not be permitted to start. No current processes
>would be aborted.

Yes they would - when they hit the user's VM quota. In fact, roughly
half the users would be unable to do whatever they are doing ATM - it
would put them over their quota. (I said an AVERAGE of 5Mb.)

>> The enormous overhead disabling overcommit would incur, without ANY
>> benefit, is simply not practical.
>
>Almost no overhead other than administration.

And the vast chunks of unused swap it reserves, "just in case".
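To make that accounting difference concrete, here's a small illustrative
C program (mine, not from this thread): it asks for 256MB of VM but only
ever touches 1MB. With overcommit, only the touched pages cost anything;
with strict accounting, the kernel would have to reserve the whole 256MB
of backing store up front, "just in case":

/* Illustration only: under overcommit, a large allocation succeeds
 * and only the pages actually touched consume RAM/swap; a strict-
 * accounting kernel would have to reserve the full 256MB up front,
 * even though most of it is never used. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t size = 256UL * 1024 * 1024;   /* ask for 256MB of VM */
    char *buf = malloc(size);

    if (buf == NULL) {
        perror("malloc");                /* strict accounting may fail here */
        return 1;
    }

    /* Touch only the first 1MB: with overcommit, only these pages
     * are ever backed by real memory; the other 255MB is a promise. */
    memset(buf, 0, 1024 * 1024);

    printf("allocated %lu bytes, touched 1MB\n", (unsigned long)size);
    free(buf);
    return 0;
}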

> The only overhead is subtracting
>from the available size of the user's allocation.

You seem to be confusing overcommit with resource limits.

> Once that limit is reached,
>the USER gets a reasonable signal, and only the USER's process would be
>terminated. This has worked quite well in the past, and works quite well on
>current systems with user resource limits. The only time OOM is valid is IF
>management directs that oversubscription of memory/swap is going to be
>tolerated; when the crash occurs, the responsible party is clearly
>identified. The kernel/OS is not at fault.

This is a resource quota system, which has NOTHING to do with
disabling overcommit.
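For contrast, this is what a per-process quota looks like from user
space - a minimal sketch using setrlimit(RLIMIT_AS), the same mechanism
pam_limits drives at login. It caps one process cleanly, regardless of
whether the kernel overcommits:

/* A per-process quota, as opposed to overcommit accounting:
 * setrlimit(RLIMIT_AS) caps this process's address space, so malloc()
 * fails cleanly at the quota no matter how the kernel accounts for VM
 * system-wide. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl = { 16UL * 1024 * 1024, 16UL * 1024 * 1024 }; /* 16MB cap */

    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    /* This request exceeds the quota, so this process - and only this
     * process - gets a clean failure instead of the whole box OOMing. */
    if (malloc(64UL * 1024 * 1024) == NULL)
        fprintf(stderr, "malloc beyond quota failed, as expected\n");

    return 0;
}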

>> > > > The alternative just isn't practical in any real-world situation, except
>> > > > more-or-less single user boxes.
>> > >
>> > > the alternative is very practical. boxes where oracle runs as one user /
>> > > apache as another for example. You might want to lock down something
>> > > that you know can cause problems.. etc..
>> > >
>> > > > Even then, that single user will still run
>> > > > out of memory - it's just his quota, rather than a physical limit.
>> > > >
>> > >
>> > > obviously.
>> > >
>> > > > So his processes STILL end up dying randomly, but they do it sooner rather
>> > > > than later. Hrm. Wow.
>> > > >
>> > >
>> > > but then it's the user's problem - not the OS..
>> > >
>> > > > On any realistically specced box, it is almost zero already.
>> > > >
>> > >
>> > > no it's not. A plain linux box today has a terrible tendency to die if
>> > > an app misbehaves wrt memory. It's that bad. (unless you think 1GB of
>> > > RAM + 4GB+ of swap is a reasonable spec for a desktop)
>> > >
>> > > per-process limits go a long long way to solving this (easy with
>> > > pam_limits) but not far enough. Per-user limits would be a huge
>> > > improvement...
>> > >
>> > > > No, it's a risk with *EVERY* OS.
>> > >
>> > > no it's not. A non-overcommitting OS doesn't run out of VM for
>> > > processes. It cleanly grants or denies memory requests. What the app
>> > > does after that is not a kernel problem.
>> >
>> > I agree that it is not a problem with the OS.
>> > It can also prevent additional logins if the resources for the new login
>> > are not available to the specific user attempting to log in. This is done
>> > in a LOT of UNIX systems, as long as they have functioning, per user,
>> > resource controls.
>>
>> Oh dear. The problem here is when the system runs out of VM and has to
>> start denying malloc() requests. *NOT* that it has already allocated more
>> VM than it has - it has allocated the maximum amount it is able to, and
>> now needs to claw some back for legitimate processes.
>
>The system wouldn't run out of VM. Only users would run out of their allocation
>of VM. The system would not have a problem with this.

The "system itself" doesn't have a problem anyway - the users do.
However, the system is put there to serve the users; it should protect
users from problems like rogue processes wasting resources.

>> Overcommit is, if anything, a partial SOLUTION here - not part of the
>> problem.
>
>Overcommit is the symptom of the problem, not a solution. OOM crashes and
>random program aborts are the problem. User resource controls are the best
>known solution.
>
>I have seen both happen on systems from an IBM 360 to a Cray T90. If swap is
>oversubscribed, then it can crash (or deadlock). The systems that best
>handled resource allocation have been DEC VAX/VMS and Cray UNICOS. The worst
>have been SGI IRIX and Linux (neither even tells you if you are OOM, other
>than by a random crash...).

Which is the problem Rik's patch is designed to address: rather than
letting random processes crash, it kills specific processes and logs
the fact.
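For the flavour of it, here's a rough user-space sketch of that
selection idea - score candidates, kill the worst, log the choice. The
fields and formula below are hypothetical stand-ins; the actual
heuristic in Rik's patch differs:

/* Hypothetical sketch of OOM victim selection: instead of letting
 * whichever process faults next die, score candidates and pick the
 * "worst" one, logging the choice. */
#include <stdio.h>

struct task {
    const char   *name;
    unsigned long vm_kb;     /* virtual memory used */
    unsigned long runtime_s; /* how long it has run */
    int           is_root;   /* root daemons are worth protecting */
};

/* Bigger and shorter-lived processes score higher; root halves the score. */
static unsigned long badness(const struct task *t)
{
    unsigned long score = t->vm_kb / (t->runtime_s + 1);
    if (t->is_root)
        score /= 2;
    return score;
}

int main(void)
{
    struct task tasks[] = {
        { "initdb-daemon", 20480, 86400, 1 },
        { "runaway-app",  819200,    30, 0 },
        { "editor",         4096,  3600, 0 },
    };
    const struct task *victim = &tasks[0];
    int i;

    for (i = 1; i < 3; i++)
        if (badness(&tasks[i]) > badness(victim))
            victim = &tasks[i];

    /* The real patch would send a signal and printk() the event. */
    printf("OOM: would kill '%s' (score %lu)\n",
           victim->name, badness(victim));
    return 0;
}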

>I want controllability and repeatability. If the system runs out of memory,
>I want a message stating that occurrence. Even more, I want resource controls
>that will allow me to eliminate it entirely (most draconian) or permit
>it within certain limits; and to know when it happens.

Rik's patch should do the first, and per-user resource limits do the
second. Where does disabling overcommit come into this?

James.
