Re: [PATCH] core_pattern: add CPU specifier

From: Eric W. Biederman
Date: Wed Sep 07 2022 - 18:01:25 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 09/07, Oleksandr Natalenko wrote:
>>
>> The advantage of having CPU recorded in the file name is that
>> in case of multiple cores one can summarise them with a simple
>> ls+grep without invoking a fully-featured debugger to find out
>> whether the segfaults happened on the same CPU.
>
> Besides, if you only need to gather the statistics about the faulting
> CPU(s), you do not even need to actually dump the core. For example,
> something like
>
> #!/usr/bin/sh
>
> echo $* >> path/to/coredump-stat.txt
>
> and
> echo '| path-to-script-above %C' >/proc/sys/kernel/core_pattern
>
> can help.

So I am confused. I thought someone had modified print_fatal_signal
to print this information. Looking at the code now I don't see it,
but perhaps that is in linux-next somewhere.

That would seem to be the really obvious place to put this, and it is
much closer to the original fault, so we are more likely to record the
CPU on which things actually happened.

If we don't care about the core dump, just getting the information into
syslog where it can be analyzed seems like the thing to do.

For a developer's box, putting it in core_pattern makes sense and isn't
a hindrance to use. For anyone else's box, the information needs to come
out in a way that allows automated tools to look for a pattern.
Requiring someone to take an extra step to print the information seems
a hindrance to automated tools doing the looking.
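
To make "look for a pattern" concrete, here is a rough sketch of the
kind of summary an automated tool might produce. It assumes a file with
one CPU number per line, such as the script quoted above would write if
it is passed only %C. The function name summarise_cpus and the file
layout are made up for illustration; nothing like this exists in the
patch.

```shell
#!/bin/sh
# Hypothetical helper: count how many recorded faults hit each CPU.
# Assumption: "$1" is a text file with one CPU number per line.
# Prints "<cpu> <count>" lines, most frequently faulting CPU first.
summarise_cpus() {
    # Skip blank lines, tally occurrences per CPU, then order by
    # descending count and break ties by ascending CPU number.
    awk 'NF { n[$1]++ } END { for (c in n) print c, n[c] }' "$1" |
        sort -k2,2nr -k1,1n
}
```

Calling it as "summarise_cpus path/to/coredump-stat.txt" would show at a
glance whether the faults cluster on one CPU.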

Eric