Re: [PATCHSET] printk, netconsole: implement reliable netconsole

From: Tetsuo Handa
Date: Sat Apr 18 2015 - 09:09:50 EST


Tejun Heo wrote:
> > If we can assume that scheduler is working, adding a kernel thread that
> > does
> >
> > while (1) {
> > read messages with metadata from /dev/kmsg
> > send them using UDP network
> > }
> >
> > might be easier than modifying netconsole module.
>
> But, I mean, if we are gonna do that in kernel, we better do it
> properly where it belongs. What's up with "easier than modifying
> netconsole module"? Why is netconsole special? And how would the
> above be any less complex than a single timer function? What am I
> missing?

A user space daemon can be disrupted unexpectedly by

(a) SIGKILL by OOM-killer
(b) spurious ptrace() by somebody
(c) spurious signals such as SIGSTOP / SIGINT
(d) stalls triggered by page faults under OOM conditions
(e) other problems, such as the scheduler not working

(a) has a built-in protection, /proc/$pid/oom_score_adj, but (b) and (c)
require configuring access control modules, and (d) has no protection at
all. Judging from the OOM stall discussion, (d) is fatal when trying to
obtain kernel messages under problematic conditions. My thinking was that
a kernel thread that does

  while (1) {
          read messages with metadata from /dev/kmsg
          send them using UDP network
  }

is automatically protected from (a), (b), (c) and (d), and that it could be
implemented outside of the netconsole module.
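
For comparison, here is a minimal user space sketch of the same loop in C.
It is not taken from the patchset. The names disable_oom_kill() and
forward_records() and the RECORD_MAX buffer size are assumptions made for
illustration. It assumes in_fd is an open /dev/kmsg descriptor, where each
read() returns one record, and out_sock is a datagram socket already
connect()ed to the log receiver. Only case (a) is addressed; (b), (c) and
(d) still apply to a process like this.

```c
/* Hypothetical user space sketch: forward /dev/kmsg records over UDP. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_MAX 8192	/* assumed upper bound for one /dev/kmsg record */

/*
 * Case (a) only: ask the OOM killer to skip this process.
 * Lowering the value normally requires CAP_SYS_RESOURCE.
 */
int disable_oom_kill(void)
{
	FILE *f = fopen("/proc/self/oom_score_adj", "w");

	if (!f)
		return -1;
	if (fputs("-1000", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Copy records from in_fd to out_sock, one datagram per read().
 * Returns 0 on end of input, -1 on an unrecoverable error.
 */
int forward_records(int in_fd, int out_sock)
{
	char buf[RECORD_MAX];

	assert(in_fd >= 0 && out_sock >= 0);

	for (;;) {
		ssize_t n = read(in_fd, buf, sizeof(buf));

		if (n == 0)
			return 0;
		if (n < 0) {
			/*
			 * EINTR: interrupted by a signal, retry.
			 * EPIPE: /dev/kmsg reports that records were
			 * overwritten; the next read() resumes with the
			 * oldest record still available.
			 */
			if (errno == EINTR || errno == EPIPE)
				continue;
			return -1;
		}
		/* UDP is best effort; this sketch gives up on a send error. */
		if (send(out_sock, buf, (size_t)n, 0) < 0)
			return -1;
	}
}
```

A caller would open /dev/kmsg with O_RDONLY, create a UDP socket, connect()
it to the receiver, call disable_oom_kill() and then forward_records(). Every
buffer and socket operation here can still fault or block under memory
pressure, which is the gap (d) describes.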