Re: [PATCH] tracing: Include PPIN in mce_record tracepoint

From: Tony Luck
Date: Tue Jan 23 2024 - 19:12:30 EST


On Tue, Jan 23, 2024 at 05:51:50PM -0600, Avadhut Naik wrote:
> Machine Check Error information from struct mce is exported to userspace
> through the mce_record tracepoint.
>
> Currently, however, the PPIN (Protected Processor Inventory Number) field
> of struct mce is not exported through the tracepoint.
>
> Export PPIN through the tracepoint as it may provide useful information
> for debug and analysis.

Awesome. I've been meaning to update the tracepoint for ages, but
it never gets to the top of the queue.

But some questions:

1) Are tracepoints a user visible ABI? Adding a new field in the middle
feels like it might be problematic. I asked this question many years
ago and Steven Rostedt said there was some tracing library in the works
that would make this OK for applications using that library.

2) While you are adding to the tracepoint, should we batch up all
the useful changes that have been made to "struct mce"? I think the
new fields that might be of use are:

    __u64 synd;       /* MCA_SYND MSR: only valid on SMCA systems */
    __u64 ipid;       /* MCA_IPID MSR: only valid on SMCA systems */
    __u64 ppin;       /* Protected Processor Inventory Number */
    __u32 microcode;  /* Microcode revision */
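
To make the concern in (1) concrete: a consumer that hard-codes binary
offsets breaks when a field is inserted in the middle, while one that looks
fields up by name in the event's tracefs "format" file does not. Below is a
rough sketch of the by-name approach, assuming the usual
"field:<type> <name>; offset:N; size:N;" line layout. find_field() and the
two sample layouts are made up for illustration. They are not taken from any
tracing library, and the offsets are not the real mce_record layout.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative (not real) excerpts of a "format" file before and after
 * a u64 field is inserted between tsc and walltime. */
static const char old_layout[] =
    "\tfield:u64 tsc;\toffset:8;\tsize:8;\tsigned:0;\n"
    "\tfield:u64 walltime;\toffset:16;\tsize:8;\tsigned:0;\n";

static const char new_layout[] =
    "\tfield:u64 tsc;\toffset:8;\tsize:8;\tsigned:0;\n"
    "\tfield:u64 ppin;\toffset:16;\tsize:8;\tsigned:0;\n"
    "\tfield:u64 walltime;\toffset:24;\tsize:8;\tsigned:0;\n";

/*
 * Hypothetical helper: scan format text line by line for a field called
 * 'name' and report its offset and size. Returns 0 if found, -1 if not.
 * Array fields such as "comm[16]" are not handled in this sketch.
 */
static int find_field(const char *format, const char *name,
                      int *offset, int *size)
{
    const char *line = format;
    char buf[256];

    while (*line) {
        size_t len = strcspn(line, "\n");

        if (len < sizeof(buf)) {
            memcpy(buf, line, len);
            buf[len] = '\0';

            char *f = strstr(buf, "field:");
            char *semi = f ? strchr(f, ';') : NULL;

            if (semi) {
                /* The name is the last token before the first ';'. */
                *semi = '\0';
                char *sp = strrchr(f, ' ');
                const char *fname = sp ? sp + 1 : f + strlen("field:");

                if (strcmp(fname, name) == 0 &&
                    sscanf(semi + 1, " offset:%d; size:%d;",
                           offset, size) == 2)
                    return 0;
            }
        }

        line += len;
        if (*line == '\n')
            line++;
    }
    return -1;
}
```

With these samples, a reader that assumed walltime lives at offset 16 would
pick up ppin instead once the field is inserted. Looking up "walltime" by
name returns the new offset, and looking up "ppin" against the old layout
returns -1, so the consumer can tell the field is absent.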

>
> Signed-off-by: Avadhut Naik <avadhut.naik@xxxxxxx>
> ---
> include/trace/events/mce.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
> index 1391ada0da3b..657b93ec8176 100644
> --- a/include/trace/events/mce.h
> +++ b/include/trace/events/mce.h
> @@ -25,6 +25,7 @@ TRACE_EVENT(mce_record,
> __field( u64, ipid )
> __field( u64, ip )
> __field( u64, tsc )
> + __field( u64, ppin )
> __field( u64, walltime )
> __field( u32, cpu )
> __field( u32, cpuid )
> @@ -45,6 +46,7 @@ TRACE_EVENT(mce_record,
> __entry->ipid = m->ipid;
> __entry->ip = m->ip;
> __entry->tsc = m->tsc;
> + __entry->ppin = m->ppin;
> __entry->walltime = m->time;
> __entry->cpu = m->extcpu;
> __entry->cpuid = m->cpuid;
> @@ -55,7 +57,7 @@ TRACE_EVENT(mce_record,
> __entry->cpuvendor = m->cpuvendor;
> ),

.. rest of patch trimmed.

-Tony