Re: [RFC PATCH] accouting: account if a task was killed by OOM killer

From: Xiaotian Feng
Date: Thu Jan 14 2010 - 06:00:17 EST


On 01/14/2010 06:54 AM, Andrew Morton wrote:
On Mon, 11 Jan 2010 14:40:34 +0800
Xiaotian Feng <dfeng@xxxxxxxxxx> wrote:

This patch introduces a new accounting flag which is set when a task
was killed by OOM killer. taskstats can tell users when a job has been
killed by the oomkiller.


Why is this useful? I'd be looking for a description of some
operational scenario where this feature is valuable to an operator?


Users of taskstats need to know whether a job was killed by the OOM killer, so that they can trigger automation jobs or notifications.
Currently taskstats only logs AXSIG when a job is killed by a signal, so users cannot tell whether the job died from SIGKILL, SIGTERM or the OOM killer.
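
To illustrate the ambiguity, here is a minimal sketch of how a taskstats consumer might classify an exit record from ac_flag. The EX_AXSIG value mirrors AXSIG in <linux/acct.h>; EX_AOOM_HYPOTHETICAL, its value and the exit_cause() helper are assumptions made up for this example and are not the names or values used by the patch.

```c
#include <stdio.h>
#include <string.h>

/* Same value as AXSIG in <linux/acct.h>: task was killed by a signal. */
#define EX_AXSIG             0x10
/* HYPOTHETICAL flag for "killed by OOM killer"; name and value are
 * placeholders for illustration, not taken from the patch. */
#define EX_AOOM_HYPOTHETICAL 0x40

/* Classify an exit record from its ac_flag byte (ac_flag is a __u8
 * in struct taskstats). Without an OOM bit, every signal death,
 * including an OOM kill, falls into the "killed by signal" case. */
static const char *exit_cause(unsigned char ac_flag)
{
    if (ac_flag & EX_AOOM_HYPOTHETICAL)
        return "killed by OOM killer";
    if (ac_flag & EX_AXSIG)
        return "killed by signal";
    return "not killed by a signal";
}

/* Print one line per record, as a monitoring tool might do. */
static void report(const char *comm, unsigned char ac_flag)
{
    printf("%s: %s (ac_flag=0x%02x)\n", comm, exit_cause(ac_flag), ac_flag);
}
```

With only AXSIG available, report("job", EX_AXSIG) gives the same line for a manual SIGKILL and for an OOM kill; the extra bit is what would let the consumer tell them apart.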

The description is incomplete. The patch also alters the contents of
the BSD accounting records. That's a change to an ancient interface
and needs a bit of exposure and thought. Is it good to put such a
highly linux-specific and somewhat linux-version-specific field into
such a venerable userspace interface?

If we _do_ decide to change the BSD accounting records in this manner
then presumably a manpage will need to be updated. A cc to
linux-api@xxxxxxxxxxxxxxx would be appropriate.

The BSD accounting part is not necessary; I only added it to keep BSD accounting consistent with taskstats. We can drop the BSD accounting part.


But I'm not very convinced about this whole idea at present, personally.

include/linux/acct.h | 1 +
include/linux/taskstats.h | 2 +-
kernel/acct.c | 2 ++
kernel/tsacct.c | 2 ++

I'm a bit surprised that getdelays.c doesn't print ac_flag.



