Re: Linux 2.2.15pre12 [VM fixes]

From: Andrea Arcangeli (andrea@suse.de)
Date: Thu Mar 02 2000 - 19:09:53 EST


On Fri, 3 Mar 2000, Alan Cox wrote:

> [..] You back out the stuff that tends to avoid killing the
>X server for example.

Such stuff looks messy (there are conditions that triggers after one
minute, that's obviously wrong code) and it's calling try_to_free_pages
lots of times in function of page_cluster that has anything to do with
that. page-cluster is the swapin read-ahead exponent of 2, it has anything
to do with oom killing.

>We also need to guarantee atomic memory pools are not being eaten away
>from under network drivers and are replenished promptly. Since that trips

My patch keeps waking up kswapd at the right time for atomic allocations
as I was doing as soon as I heard from you we had atomic allocation
troubles.

>bugs in a couple of drivers if we don't it is really important we keep
>that property.

The only case where I _may_ hide an oops due a missing check for NULL is
when the machine runs _out_of_memory_. That means when you start getting
task killed. The difference is that I allows MID allocation to access the
whole freelist. I seen zero report of people getting Oopses after the
machine runs out of memory thus I think I am not hiding _anything_. Do you
seen such oops reports during out of memory conditions? I certainly like
to know this.

In all cases I prefer that we take only the pre-patches with such change
and that we backout such change for production since backing it out it's
conceptually more correct IMHO. This because if the machine runs out of
memory there's no point to allow only the atomic allocations to access the
remaining memory.

And all such missing check may trigger also with my code if some atomic
allocation eats all the freelist. That can always happen unfortunately :(.

>So I think its 'differently broken'

I don't think I am hiding missing check for null during oom since I
haven't seen reports like that with 2.2.15pre.

>Other than that I have no fundamental problems with the suggestion. The
>trashing heuristic seems to work.

Ok ;).

>[..] keeping the atomic pool well (and promptly) fed [..]

I am doing that since you reported the atomic allocation problem (2.3.x is
just fixed too).

> [..] and also
>not shooting root/iopl processes unless we have no choice ?

What I'd like to do in this area is not obviously right and it needs some
more research. I disagree with Rik's heuristic since I strongly believe we
should kill the hog, and we shouldn't kill the netscape of an user if
inetd hits a bug, and runs out of memory while exploding. (ignoring the
current loop with page-cluster and the conditions that triggers after one
second). I prefer to stay on 2.3.x for these things.

What we just have in 2.2.14 is that if the task is running with iopl
privilegies, we are sending it sigterm instead of sigkill so that it can
restore the console while exiting.

But with 2.2.14 if a malicious X clients cause the X server to be the real
hog, then 2.2.14 may keep sending sigterm (not sigkill) to X, and if
there's not enough memory to run the sigterm signal handler the machine
deadlock (unfortunately this is a real life problem, excuse me).

So in the new patch we try to send sigterm only for 10 times (and we
sched_yield since the second sigterm in the hope the malicious X client
will be killed instead) and then we send sigkill at the 10th try.
Something like this try-sigterm-first is necessary also for any kind of
oom_killer and it's orthogonal with the task selection. The current
implementation I did is very dirty but it's also the less intrusive one (a
few lines of code that are obviously right). I suggest to do a more
generalized one (so that we can be careful also with ioperm privilegies
also on non IA32 archs) in 2.3.x.

Thanks,
Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:15 EST