Re: [PATCH] coredump debugging: add a tracepoint to report the coredumping

From: Wen Yang
Date: Sun Feb 18 2024 - 10:35:11 EST



On 2024/2/17 18:49, Oleg Nesterov wrote:
On 02/17, wenyang.linux@xxxxxxxxxxx wrote:
From: Wen Yang <wenyang.linux@xxxxxxxxxxx>

Currently coredump_task_exit() takes some time to wait for the generation
of the dump file. This can be inconvenient if user space wants to be
notified as soon as possible.

Add a new trace_sched_process_coredump() tracepoint to coredump_task_exit()
so that a user-space monitor can easily wait for these exits and
potentially make some preparations in advance.
Can't comment, I never know when the new tracepoint will make sense.

Stupid question.
Oleg.

Thanks for your help.

trace_sched_process_exit() is called after the PF_EXITING flag is set,
so it cannot be moved there.
Could we make the following modifications instead?

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index dbb01b4b7451..53e9420540dc 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -334,6 +334,13 @@ DEFINE_EVENT(sched_process_template, sched_process_exit,
             TP_PROTO(struct task_struct *p),
             TP_ARGS(p));

+/*
+ * Tracepoint for killing a task by a signal:
+ */
+DEFINE_EVENT(sched_process_template, sched_process_kill,
+            TP_PROTO(struct task_struct *p),
+            TP_ARGS(p));
+
 /*
  * Tracepoint for waiting on task to unschedule:
  */
diff --git a/kernel/signal.c b/kernel/signal.c
index 9b40109f0c56..571342799824 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2866,6 +2866,7 @@ bool get_signal(struct ksignal *ksig)
                 * Anything else is fatal, maybe with a core dump.
                 */
                current->flags |= PF_SIGNALED;
+               trace_sched_process_kill(current);

                if (sig_kernel_coredump(signr)) {
                        if (print_fatal_signals)
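
For illustration, a user-space monitor could consume such an event roughly as
follows. This is only a minimal sketch under some assumptions: tracefs is
mounted at /sys/kernel/tracing, the sched_process_kill event is the one
proposed above (it does not exist until this change is merged), and the
helper names are made up for this example. Events built on
sched_process_template print "comm=%s pid=%d prio=%d", so the pid can be
pulled out of each trace_pipe line.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: tracefs is mounted here (older setups use /sys/kernel/debug/tracing). */
#define TRACEFS_ROOT "/sys/kernel/tracing"

/*
 * Build the path of the "enable" file for a sched event, e.g. the
 * proposed sched_process_kill event (hypothetical until merged).
 * Returns 0 on success, -1 if the buffer is too small.
 */
int sched_event_enable_path(const char *event, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/events/sched/%s/enable",
			 TRACEFS_ROOT, event);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" to the event's enable file (needs root). Returns 0 on success. */
int sched_event_enable(const char *event)
{
	char path[256];
	FILE *f;
	int ret;

	if (sched_event_enable_path(event, path, sizeof(path)))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ret = (fputs("1", f) == EOF) ? -1 : 0;
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}

/*
 * Extract the pid from one trace_pipe line of a sched_process_template
 * event. Returns the pid, or -1 if the line has no " pid=" field.
 */
int sched_event_pid(const char *line)
{
	const char *p = strstr(line, " pid=");
	int pid;

	if (!p || sscanf(p, " pid=%d", &pid) != 1)
		return -1;
	return pid;
}
```

A monitor would call sched_event_enable("sched_process_kill") once, then read
/sys/kernel/tracing/trace_pipe line by line and pass each line to
sched_event_pid() to learn which task is about to die.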

--

Best wishes,

Wen



Signed-off-by: Wen Yang <wenyang.linux@xxxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
 include/trace/events/sched.h | 7 +++++++
 kernel/exit.c                | 1 +
 2 files changed, 8 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index dbb01b4b7451..ce7448065986 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -334,6 +334,13 @@ DEFINE_EVENT(sched_process_template, sched_process_exit,
             TP_PROTO(struct task_struct *p),
             TP_ARGS(p));

+/*
+ * Tracepoint for a task coredumping:
+ */
+DEFINE_EVENT(sched_process_template, sched_process_coredump,
+            TP_PROTO(struct task_struct *p),
+            TP_ARGS(p));
+
 /*
  * Tracepoint for waiting on task to unschedule:
  */
diff --git a/kernel/exit.c b/kernel/exit.c
index 493647fd7c07..c11e12d73f4e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -425,6 +425,7 @@ static void coredump_task_exit(struct task_struct *tsk)
                        self.next = xchg(&core_state->dumper.next, &self);
                else
                        self.task = NULL;
+               trace_sched_process_coredump(tsk);
                /*
                 * Implies mb(), the result of xchg() must be visible
                 * to core_state->dumper.
--
2.25.1