Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter

From: Marco Elver
Date: Sat Nov 20 2021 - 07:17:36 EST


On Mon, Nov 15, 2021 at 02:06PM -0800, Kees Cook wrote:
[...]
> However, that's a lot to implement when Marco's tracing suggestion might
> be sufficient and policy could be entirely implemented in userspace. It
> could be as simple as this (totally untested):
[...]
>
> Marco, is this the full version of monitoring this from the userspace
> side?

Sorry, I completely missed this email (I somehow wasn't Cc'd; I just
saw it by chance while re-reading this thread).

I've sent a patch to add WARN:

https://lkml.kernel.org/r/20211115085630.1756817-1-elver@xxxxxxxxxx

Not sure how useful BUG is, but I have no objection to it also being
traced if you think it's useful.

(I added it to kernel/panic.c, because lib/bug.c requires
CONFIG_GENERIC_BUG.)

> perf record -e error_report:error_report_end

Of course, I think userspace would want something other than the perf
tool to handle it. There are several options:

1. Open trace pipe to be notified (/sys/kernel/tracing/trace_pipe).
This already includes the pid.

2. As you suggest, use perf events globally (but the handling
would be done by some system process).

3. As of 5.13 there's actually a new perf feature to
synchronously SIGTRAP the exact task where an event occurred
(see perf_event_attr::sigtrap). This would very closely mimic
pkill_on_warn (because the SIGTRAP is synchronous), but lets the
process being SIGTRAP'd decide what to do. Not sure how to
deploy this though, because a) only the root user can create this
perf event (because exclude_kernel=0), and b) sigtrap perf
events deliberately won't propagate beyond an exec
(remove_on_exec=1 is required if sigtrap=1), because who knows if
the exec'd process has the right SIGTRAP handler.
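
For option 1, a rough, untested sketch of what a monitoring process might do
is below. It assumes the error_report_end event has been enabled
(events/error_report/error_report_end/enable under tracefs) and that
trace_pipe uses the default "comm-pid [cpu] flags timestamp: event:" line
layout. The function names are made up for illustration, and the parsing is
only a heuristic: it will break if the trace output format options are
changed.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the PID from one trace_pipe line that reports error_report_end.
 * Assumes the default layout "comm-pid   [cpu] ...: error_report_end: ...".
 * Returns -1 if the line is some other event or does not match the layout.
 */
long parse_error_report_pid(const char *line)
{
	const char *p;

	if (!strstr(line, ": error_report_end:"))
		return -1;

	/* Look for "-<digits><spaces>[": comm itself may contain '-'. */
	for (p = line; *p; p++) {
		const char *q;

		if (*p != '-' || !isdigit((unsigned char)p[1]))
			continue;
		q = p + 1;
		while (isdigit((unsigned char)*q))
			q++;
		if (*q != ' ')
			continue;
		while (*q == ' ')
			q++;
		if (*q == '[')
			return strtol(p + 1, NULL, 10);
	}
	return -1;
}

/*
 * Read lines from an already opened trace_pipe stream and call handler()
 * with the PID of every task that hit an error report. Reads on
 * trace_pipe block until new events arrive. Returns the number of
 * reports seen once the stream ends.
 */
int watch_error_reports(FILE *stream, void (*handler)(long pid))
{
	char line[1024];
	int count = 0;

	while (fgets(line, sizeof(line), stream)) {
		long pid = parse_error_report_pid(line);

		if (pid < 0)
			continue;
		if (handler)
			handler(pid);
		count++;
	}
	return count;
}
```

A privileged daemon would fopen("/sys/kernel/tracing/trace_pipe", "r"), pass
the stream to watch_error_reports(), and apply whatever policy it wants in
the handler (e.g. kill(pid, SIGKILL)). Unlike option 3 this is asynchronous,
so the task keeps running until the daemon reacts.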

I think #3 is hard to deploy right, but below is an example program I
played with.

Thanks,
-- Marco

------ >8 ------

#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static void sigtrap_handler(int signum, siginfo_t *info, void *ucontext)
{
	// FIXME: check event is error_report_end
	printf("Kernel error in this task!\n");
}

static void generate_warning(void)
{
	/* ... do something to generate a warning ... */
}

int main(void)
{
	struct perf_event_attr attr = {
		.type		= PERF_TYPE_TRACEPOINT,
		.size		= sizeof(attr),
		.config		= 189, // FIXME: error_report_end
		.sample_period	= 1,
		.inherit	= 1, /* Children inherit events ... */
		.remove_on_exec	= 1, /* Required by sigtrap. */
		.sigtrap	= 1, /* Request synchronous SIGTRAP on event. */
		.sig_data	= 189, /* FIXME: use to identify error_report_end */
	};
	struct sigaction action = {};
	struct sigaction oldact;
	int fd, ret;

	action.sa_flags = SA_SIGINFO | SA_NODEFER;
	action.sa_sigaction = sigtrap_handler;
	sigemptyset(&action.sa_mask);
	/* Keep the call outside assert() so it is not compiled out with NDEBUG. */
	ret = sigaction(SIGTRAP, &action, &oldact);
	assert(ret == 0);

	/* pid=0, cpu=-1: monitor the calling task on any CPU. */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
	assert(fd != -1);

	sleep(5); /* Try to generate a warning from elsewhere, nothing will be printed. */
	generate_warning(); /* Warning from this process. */

	sigaction(SIGTRAP, &oldact, NULL);
	close(fd);
	return 0;
}
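
The hard-coded 189 marked FIXME above is just the tracepoint ID on one
particular kernel build. A sketch of looking it up at runtime follows,
assuming tracefs is mounted at /sys/kernel/tracing (older setups use
/sys/kernel/debug/tracing). The helper names are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the contents of a tracefs "id" file: a single non-negative
 * decimal number, optionally followed by a newline.
 * Returns -1 if the text is malformed.
 */
long parse_tracepoint_id(const char *text)
{
	char *end;
	long id;

	id = strtol(text, &end, 10);
	if (end == text || id < 0)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return id;
}

/*
 * Read a tracepoint ID from the given path, e.g.
 * /sys/kernel/tracing/events/error_report/error_report_end/id
 * Returns -1 if the file cannot be read or parsed.
 */
long read_tracepoint_id(const char *path)
{
	char buf[32];
	long id = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		id = parse_tracepoint_id(buf);
	fclose(f);
	return id;
}
```

The result would then be assigned to attr.config (and could be reused for
attr.sig_data) before calling perf_event_open, bailing out if it is -1.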