Re: Virtual vs. physical swap & shared memory forks (clone)

From: Linda Walsh (law@sgi.com)
Date: Sat Mar 25 2000 - 13:06:47 EST


	First -- forget the 'sproc' thing.  Someone turned me on to "clone",
which I was unfamiliar with, and which solves that problem.

> The "malloc() not returning NULL" argument is complete nonsense. It's not
> worth throwing away a useful feature (overcommit) with concrete benefits
> over this.
>
> In fact, malloc() not returning NULL isn't a bug, it's a feature. It lets
> you do things that you couldn't otherwise accomplish, for no cost.

---
	The man page says malloc will return a NULL on failure.  Define
failure.  Please explain what this failure could be if not 'insufficient memory'.
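
	To make that contract concrete, here's a quick user-space sketch of
checking the documented failure (the 64MB request size is just an example,
nothing SGI-specific):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	size_t nbytes = 64UL * 1024 * 1024;	/* example request size */
	char *buf = malloc(nbytes);

	if (buf == NULL) {
		/* The documented failure: insufficient memory. */
		fprintf(stderr, "malloc(%lu) failed: %s\n",
			(unsigned long)nbytes, strerror(errno));
		return 1;
	}
	/* With overcommit, a kill can land here on first touch instead
	 * of at the NULL check above. */
	memset(buf, 0, nbytes);
	free(buf);
	return 0;
}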

> > So it dynamically reserves 256 more pages than it is currently using,
> > so it will be 256 pages away from being out of memory when it realizes
> > there is a problem?  So kernel running out of memory shouldn't ever
> > happen -- as it should always have a 256-page buffer more than it is
> > currently using.
>
> In principle there is no limit to how much memory the kernel might want
> to allocate for itself. I suspect that people doing video capture or
> DRI/GLX stuff might want more than 256 pages.
---
	That would be the responsibility of a driver to be well-behaved.
Ideally, for arbitrary video capture buffer sizes, that space should be
allocated by the user-level program -- like buffered file I/O in libc --
that's all in user space, not kernel space.

> > Note, I'm serious about the above CAP's.  Again -- if you want to
> > protect your 'X', you can make sure it runs with the "don't kill cap".
>
> Ok, I see. I misinterpreted that as a "cap on user space", instead of a
> capability.
---
	Yep.  Sorry for my lazy typing.  We need to expand that CAP set to
64 bits "real soon".  I know IRIX has a fairly mature Capability system
that is only using about 45 or so bits out of 64.  64 should last us for a
while, and it could still be done with existing macros, as gcc seems to
support 64-bit data types on most (all?) modern CPUs.

> It's a pretty good heuristic. The algorithm is:
>
>	badness = memory used / sqrt(CPU time) / fourth root(run time)
>	then badness is doubled if the task is niced
>	if uid or euid are zero (or capability of sysadmin), badness is
>	divided by 4
>	if the process has direct hardware access, badness is divided by 2
>
> Then the process with the highest badness is killed.
---
	That is a good heuristic if you want to simulate determinism.  I
used to come up with such formulae when I wanted to simulate "human" type
behavior in a Dungeons and Dragons game I wrote.  But I suspect that while
there is a deterministic formula behind the actions, it would still appear
non-deterministic to the average user.
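
	Just to show that the quoted heuristic really is deterministic, here
is a rough user-space sketch of the scoring (the struct fields and sample
tasks are made up for illustration; the "+ 1.0" terms are only there to
avoid dividing by zero):

#include <math.h>
#include <stdio.h>

struct task_sample {
	double mem_pages;	/* memory used */
	double cpu_secs;	/* CPU time consumed */
	double run_secs;	/* total run time */
	int    niced;		/* task is niced */
	int    is_root;		/* uid/euid 0, or sysadmin capability */
	int    raw_io;		/* direct hardware access */
};

static double badness(const struct task_sample *t)
{
	double b = t->mem_pages / sqrt(t->cpu_secs + 1.0)
			       / pow(t->run_secs + 1.0, 0.25);

	if (t->niced)
		b *= 2.0;
	if (t->is_root)
		b /= 4.0;
	if (t->raw_io)
		b /= 2.0;
	return b;		/* highest badness gets killed */
}

int main(void)
{
	struct task_sample memhog  = { 50000.0, 2.0, 30.0, 0, 0, 0 };
	struct task_sample xserver = { 20000.0, 600.0, 86400.0, 0, 1, 1 };

	printf("memhog : %.1f\n", badness(&memhog));
	printf("xserver: %.1f\n", badness(&xserver));
	return 0;
}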

The expectation of two simple behaviors -- returning NULL when out of
memory, or, if the sysadmin decides to use virtual memory, a well-defined
behavior of either deadlock or killing processes (maybe using the above
algorithm) -- would allow per-system policy to be set to control
out-of-memory conditions.
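
	To illustrate what "per-system policy" could look like, here is a
sketch of a hypothetical tunable -- the names (vm_oom_policy and friends)
are invented purely for illustration; nothing like this exists today:

#include <stdio.h>

/* Hypothetical per-system OOM policy -- illustration only. */
enum oom_policy {
	OOM_RETURN_ENOMEM,	/* no overcommit: allocations fail with ENOMEM */
	OOM_SUSPEND,		/* IRIX-style: suspend requesters, risk deadlock */
	OOM_KILL_BY_BADNESS	/* overcommit: kill the highest-badness task */
};

static enum oom_policy vm_oom_policy = OOM_RETURN_ENOMEM;

int main(void)
{
	switch (vm_oom_policy) {
	case OOM_RETURN_ENOMEM:
		printf("policy: fail allocations with ENOMEM\n");
		break;
	case OOM_SUSPEND:
		printf("policy: suspend tasks asking for memory\n");
		break;
	case OOM_KILL_BY_BADNESS:
		printf("policy: kill the task with the highest badness\n");
		break;
	}
	return 0;
}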

> When you run out of memory, overcommit or not, tasks are going to die.
---
	No.  Current behavior on IRIX is to suspend the tasks asking for
more memory, resulting in possible deadlock if all processes are suspended
-- unlikely, but possible.  That's a caveat w/using virtual memory.  I'd
support a system configuration option to allow a choice between suspend and
kill behaviors.

> Removing overcommit might make malloc() return null, but that's only one
> of a host of ways to allocate memory. The other methods don't have a
> return value. So arguing that "overcommit is bad, because it breaks the
> malloc() return value" is pointless.
---
	What other methods?  calloc: ENOMEM; open <object>: ENOMEM; fork:
ENOMEM.  Etc.  All of which is exactly what you would expect when there is
no memory.
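
	fork() is the usual example of an allocation path that isn't
malloc() but still has a perfectly good failure return; a minimal sketch of
checking it:

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	pid_t pid = fork();

	if (pid < 0) {
		/* ENOMEM here means the address space couldn't be copied. */
		perror("fork");
		return 1;
	}
	if (pid == 0)
		_exit(0);		/* child: nothing to do */
	waitpid(pid, NULL, 0);		/* parent: reap the child */
	return 0;
}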

> The problem here is that the system is out of memory. This has nothing to
> do with overcommitment. The solution is to make the kernel more
> intelligent about killing processes when OOM. Anything else just hides
> the problem.
---
	You are assuming I want a system that kills anything.  Any kill
other than an explicit kill by a user of their own processes is a denial of
service.  It means you have a system with low integrity/reliability.  This
is just plain unacceptable for mission-critical systems.

For your system, since you aren't running mission-critical stuff, you can
choose to allocate 1G of Vmem and implement 'killing' based on the above
formula.  But the *simple* behavior should be the most predictable -- that
of "no mem" resulting in ENOMEM on calls asking for memory.

-linda

-- 
Linda A Walsh                    | Trust Technology, Core Linux, SGI
law@sgi.com                      | Voice: (650) 933-5338
