Re: Overcommitable memory??

From: James Sutherland (jas88@cam.ac.uk)
Date: Tue Mar 21 2000 - 07:10:53 EST


On Mon, 20 Mar 2000 21:47:26 -0600, you wrote:
>On Mon, 20 Mar 2000, David Whysong wrote:
>>On Mon, 20 Mar 2000, James Sutherland wrote:
>>>On Mon, 20 Mar 2000 05:39:48 -0600, you wrote:
>>>>On Sun, 19 Mar 2000, David Whysong wrote:
>>>
>>>Just send SIGTERM. This gives them an opportunity to exit gracefully.
>>>If they ignore it, SIGKILL them.
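
Something like this minimal user-space sketch of that escalation (the
pid comes from wherever the victim was chosen; the five-second grace
period is an arbitrary choice for illustration, not anything mandated):

    #include <signal.h>
    #include <unistd.h>
    #include <sys/types.h>

    /* Ask a process to exit cleanly; force-kill it if it is still
     * around after a grace period. */
    static void terminate_process(pid_t pid)
    {
            kill(pid, SIGTERM);         /* polite: can be caught for cleanup */
            sleep(5);                   /* arbitrary grace period */
            if (kill(pid, 0) == 0)      /* still alive (or a zombie)? */
                    kill(pid, SIGKILL); /* cannot be caught or ignored */
    }
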
>>>
>>>>>Once we are OOM, you can't give user-space any choices.
>>>>
>>>>YOU AREN'T OOM - a specific user is out of resources, not a catastrophic
>>>>failure.
>>
>>A user being out of (memory) resources shares many of the same problems
>>as a system that is OOM. You still need to kill one or more user processes,
>>and you can't give the user the choice, because the user might decide to
>>keep on running.
>
>The user process (one of them) has been aborted. As long as the user
>remains below the limit, the user can keep on running.

This is what happens when the system is OOM, too. No advantage there.

>>>SOMETHING (the user, the process, the system, the cluster, whatever)
>>>is out of resources, so something has to give. Which unit has run out
>>>of resources doesn't matter - the issue is how we handle running out
>>>of resources in some category.
>>
>>Precisely.
>>
>>>>>I don't like resource limits. Using resource limits is similar to not
>>>>>having memory overcommit -- you waste a lot of system resources "just in
>>>>>case", the kernel needs to do a lot more accounting, and it's just
>>>>>horribly inefficient.
>>>>
>>>>Resource limits CAN prevent the OOM condition if
>>>> 1. the sum of all concurrent users' limits is <= total resources
>>>> 2. users are not allowed to exceed their quota
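
Condition 2 is the half that is already enforceable per process via
setrlimit(). A sketch, with an arbitrary 64MB cap chosen purely for
illustration (condition 1, summing limits across all users, would need
per-user accounting beyond what setrlimit() gives you):

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
            /* Cap this process's address space at 64MB. Once the cap
             * is hit, brk()/mmap() - and hence malloc() - fail with
             * ENOMEM instead of pushing the whole system toward OOM. */
            struct rlimit rl = { 64UL << 20, 64UL << 20 };

            if (setrlimit(RLIMIT_AS, &rl) != 0) {
                    perror("setrlimit");
                    return 1;
            }
            /* ... run the workload under the quota ... */
            return 0;
    }
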
>>>
>>>That's an extremely restrictive approach, but appropriate in some
>>>cases. We need those resource limits - but what's this got to do with
>>>overcommit??
>>
>>Preventing system OOM using resource limits is equivalent to disabling
>>overcommit. You have to restrict each of N users to 1/N of the total
>>system memory.
>
>NO. Different users can have different limits. The total of the concurrent
>users' limits must not exceed the available resources. If it is decided to
>exceed that, then you are overcommitting the resource,

Be careful; this is a TOTALLY different use of "overcommitting" than
we are using in this thread.

> which may be valid in many
>cases (single user workstations are one, but there you may not want to
>enable quotas anyway).
>
>1/N is the worst way to set quotas. It is one way to divide a user's
>quota over his process group, but I'm not exactly in favor of that either.
>
>1. Determine how large the individual processes have to be to work. Use
> the worst-case processes.
>2. Determine how many users use the worst-case process at the same time.
>3. Determine how many users can run at the same time.
>
>This cycle is done repeatedly to identify time intervals and attended vs.
>unattended operation. Then a scheduling policy is proposed that allows
>the best mix of operations acceptable to the user community.
>
>This may call for running large processes at night, with few interactive
>users. It may allow for many interactive users during the day, but none
>of them may be able to run the worst-case process then (it may take the
>entire system for that process plus its parent shell and the rest of the
>system processes: NO OTHER USERS).
>
>Whatever is finally decided, it should be decided by the users and
>management, not dictated by the OS.

True.
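
To make that concrete with invented numbers: suppose a 512MB machine,
a worst-case job needing 400MB, and interactive sessions averaging
about 15MB each. Then:

    1. worst-case process size:      400MB
    2. concurrent worst-case users:  1 (400MB + its shell + the rest
                                        of the system fills the box)
    3. concurrent interactive users: (512MB - ~60MB for the system)
                                     / 15MB = about 30

So the policy might be 30-odd interactive logins during the day, and
the 400MB job only at night with logins disabled - exactly the sort of
tradeoff the users and management, not the kernel, should settle.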

James.



