Re: [PATCH 5/6] firmware/efi: Process CXL Component Events

From: Ira Weiny
Date: Fri Dec 08 2023 - 13:48:03 EST


Dan Williams wrote:
> Ira Weiny wrote:

[snip]

> > +
> > +int register_cxl_cper_notifier(struct notifier_block *nb)
> > +{
> > + return blocking_notifier_chain_register(&cxl_cper_chain_head, nb);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(register_cxl_cper_notifier, CXL);
> > +
> > +void unregister_cxl_cper_notifier(struct notifier_block *nb)
> > +{
> > + blocking_notifier_chain_unregister(&cxl_cper_chain_head, nb);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(unregister_cxl_cper_notifier, CXL);
>
> So I am struggling with why is this a notifier chain vs something
> simpler and more explicit, something like:
>
> typedef (int)(*cxl_cper_event_fn)(struct cper_cxl_event_rec *rec)
>
> int register_cxl_cper(cxl_cper_event_fn func)
> {
> guard(rwsem_write)(cxl_cper_rwsem);
> if (cxl_cper_event)
> return -EBUSY;
> cxl_cper_event = func;
> return 0;
> }

This is easier in the register code but then the CXL code must create a
loop over the available memdevs to match the incoming CPER record.

By allowing each memdev to register their own callback they each get
called and match the CPER record to themselves.

>
> ...do the reverse on unregister and hold the rwsem for read while
> invoking to hold off unregistration while event processing is in flight.
>
> There are a couple properties of a blocking notifier chain that are
> unwanted: chaining, only the CXL subsystem cares about seeing these
> records,

True but there are multiple memdev driver instances which care. It is not
just 1 entity which cares about these.

> and loss of type-safety, no need to redirect through a (void *)
> payload compared to a direct call. Overall makes the implementation more
> explicit.

Let me see how it works out with your comments on the final patch. But
the additional chain state of the notifier made this much easier in my
head. IOW chain up any memdev which wants these notifiers.

>
>
> > diff --git a/drivers/firmware/efi/cper_cxl.h b/drivers/firmware/efi/cper_cxl.h
> > index 86bfcf7909ec..aa3d36493586 100644
> > --- a/drivers/firmware/efi/cper_cxl.h
> > +++ b/drivers/firmware/efi/cper_cxl.h
> > @@ -10,11 +10,38 @@
> > #ifndef LINUX_CPER_CXL_H
> > #define LINUX_CPER_CXL_H
> >
> > +#include <linux/cxl-event.h>
> > +
> > /* CXL Protocol Error Section */
> > #define CPER_SEC_CXL_PROT_ERR \

FYI this is not added code.

> > GUID_INIT(0x80B9EFB4, 0x52B5, 0x4DE3, 0xA7, 0x77, 0x68, 0x78, \
> > 0x4B, 0x77, 0x10, 0x48)
>
> I like these defines, I notice that mbox.c uses "static const"
> defintions for something similar. Perhaps unify on the #define method?

The static const's are defined such that they can be passed to the trace
code as a reference without the creation of a temp variable. These only
need to be used as static data.

> I
> think this define also wants a _GUID suffix to reduce potential
> confusion between the _UUID variant and the cxl_event_log_type
> definitions?

The UUID's are never defined as a macro. I also followed the current
convention here of prefixing 'CPER_SEC_' as per CPER_SEC_CXL_PROT_ERR.

>
> >
> > +/* CXL Event record UUIDs are formated at GUIDs and reported in section type */
> > +/*
> > + * General Media Event Record
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + */
> > +#define CPER_SEC_CXL_GEN_MEDIA \
> > + GUID_INIT(0xfbcd0a77, 0xc260, 0x417f, \
> > + 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6)
> > +

'CPER_SEC_' is in my mind different from an actual '*CPER_EVENT*'.

But I can see how the macro/enums are similar enough to have confused
you.

I can append _GUID if you really want.

Ira