Re: Some questions about linux kernel.

From: Stefan Monnier (monnier+lists/linux/kernel/news/@RUM.CS.YALE.EDU)
Date: Mon Mar 13 2000 - 02:53:09 EST


>>>>> "Andrea" == Andrea Arcangeli <andrea@suse.de> writes:
> The heuristic has _no way_ to find out which is the hog. This in turn means
> that you can kill several wrong tasks before you finally kill the right
> one and so it's useless and worse than what we have now as far as I can tell.

"The hog" cannot be found with certainty. It's simply not possible,
since no matter how you define the "hogity", I can come up with
an example of a perfectly sane process that matches it exactly.

What it means is that you're down to a heuristic where the only thing
you care about is "how often does it work right in practice?".

Rik's algorithm can probably be improved, but experience shows it
works rather well (more importantly: much better than the current
heuristic of killing whichever process happens to request memory
at the wrong time).

> Once we are able to find out which is the hog there will be no need to
> look at the information that you are using in your task-selection
> algorithm. We know we have to kill the hog, regardless of its
> euid/priority/lifetime etc...

That only applies if you're 100% sure it is indeed the hog.
Killing X might render the whole box unusable, so you'd better be
really really sure.
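
To make the trade-off concrete, here is a rough sketch of what a
selection heuristic built on euid/priority/lifetime could look like.
This is not Rik's actual code: the Task fields, the weights and the
function names are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Task:
    name: str
    mem_pages: int    # pages currently used by the task
    euid: int         # 0 means root
    nice: int         # > 0 means the task was deliberately deprioritized
    runtime_s: float  # seconds since the task started


def badness(task: Task) -> float:
    """Hypothetical score: higher means a better candidate to kill."""
    score = float(task.mem_pages)
    # Assumption: long-lived tasks are less likely to be a runaway hog.
    score /= sqrt(max(task.runtime_s, 1.0))
    # Assumption: niced tasks are considered less important.
    if task.nice > 0:
        score *= 2
    # Assumption: root-owned tasks (X, daemons) are costly to lose.
    if task.euid == 0:
        score /= 4
    return score


def pick_victim(tasks):
    """Return the task with the highest badness score."""
    return max(tasks, key=badness)
```

With these made-up weights, a long-running root-owned X server using
20000 pages scores far lower than a one-minute-old user process using
50000 pages, which is the kind of outcome one wants. It remains a
heuristic and can still pick the wrong task.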

> The idea I had a few weeks ago to solve the problem and so to find out the
> hog (and that I'll experiment in real life 2.3.x soon) is to add a
> per-task page fault rate (ala avg_slice).

If this info is useful for other things, great. Otherwise, I'd rather not
pay for such an overhead: OOM almost never happens, and the few times it's
happened, Rik's hack did the job brilliantly without any overhead during
normal operation (apart from a marginally larger kernel).
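
For reference, the kind of bookkeeping I understand the proposal to
mean is roughly the following. It is only a sketch: the decaying
average, the DECAY constant and all the names are my assumptions, not
anything from Andrea's patch.

```python
from dataclasses import dataclass

DECAY = 0.5  # hypothetical weight given to the previous average


@dataclass
class FaultStats:
    avg_faults: float = 0.0    # decayed average of faults per tick
    faults_this_tick: int = 0  # faults seen since the last tick ended
    last_fault_tick: int = -1  # tick of the most recent fault, -1 if none

    def record_fault(self, tick: int) -> None:
        """Called on every page fault: this is the per-fault overhead."""
        self.faults_this_tick += 1
        self.last_fault_tick = tick

    def end_of_tick(self) -> None:
        """Fold the finished tick's count into the decaying average."""
        self.avg_faults = (DECAY * self.avg_faults
                           + (1 - DECAY) * self.faults_this_tick)
        self.faults_this_tick = 0
```

The cost per fault is small, but record_fault() runs on every single
fault of every task, whether or not the box ever runs out of memory.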

> Once we know the page fault rate and the time of the last fault for each
> process, we'll be almost able to find out the memory hog without possible
> mistakes and we won't need anything else.

"Almost" is the operative word here.
I don't want to discourage you, but I think that we should start with
Rik's heuristic and then we can think about improving it (if it turns out
to be necessary) by providing it with more data such as past behavior,
or by simply using a fancier analysis of the current state. I'm sure
that we can improve the heuristic while still only using the already
available info. We could take into account the number of dirty pages,
the "age" of each page, the location inside the process-tree, the
age of parents/children and their memory usage, ...
But based on my experience, I'd say that improving Rik's patch won't
really be necessary.

        Stefan



