Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command

From: Dan Williams
Date: Sat Dec 03 2022 - 16:34:10 EST


Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 05:39:12PM -0800, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > >
>
> [snip]
>
> > >
> > > +#define CREATE_TRACE_POINTS
> > > +#include <trace/events/cxl.h>
> > > +
> > > #include "core.h"
> > >
> > > static bool cxl_raw_allow_all;
> > > @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> > > CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
> > > #endif
> > > CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> > > + CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> > > CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> > > CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> > > CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> >
> > Similar to this patch:
> >
> > https://lore.kernel.org/linux-cxl/166993221008.1995348.11651567302609703175.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/
> >
> > CXL_MEM_COMMAND_ID_GET_EVENT_RECORD, should be added to the "always
> > kernel" / cxlds->exclusive_cmds mask.
>
> Done for all the commands. I'll rebase as well before sending this out.
>
> >
> > > @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> > > }
> > > EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> > >
> > > +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > > + enum cxl_event_log_type type)
> > > +{
> > > + struct cxl_get_event_payload *payload;
> > > + u16 nr_rec;
> > > +
> > > + mutex_lock(&cxlds->event_buf_lock);
> > > +
> > > + payload = cxlds->event_buf;
> > > +
> > > + do {
> > > + u8 log_type = type;
> > > + int rc;
> > > +
> > > + rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> > > + &log_type, sizeof(log_type),
> > > + payload, cxlds->payload_size);
> > > + if (rc) {
> > > + dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> > > + cxl_event_log_type_str(type), rc);
> > > + goto unlock_buffer;
> > > + }
> > > +
> > > + nr_rec = le16_to_cpu(payload->record_count);
> > > + if (trace_cxl_generic_event_enabled()) {
> >
> > This feels like a premature micro-optimization as none of this code is
> > fast path and it is dwarfed by the cost of executing the mailbox
> > command. I started with trying to reduce the 80 column collision
> > pressure, but then stepped back even further and asked, why?
>
> Because Steven told me to. :-( I should have been smarter than that.

You did the right thing; I failed to jump in sooner on this set.

>
> >
> > > + int i;
> > > +
> > > + for (i = 0; i < nr_rec; i++)
> > > + trace_cxl_generic_event(dev_name(cxlds->dev),
> > > + type,
> > > + &payload->records[i]);
> >
> > As far as I can tell trace_cxl_generic_event() always expects a
> > device-name as its first argument. So why not enforce that with
> > type-safety? I.e. I think trace_cxl_generic_event() should take a
> > "struct device *", not a string unless it is really the case that any
> > old string will do as the first argument to the trace event. Otherwise
> > the trace point can do "__string(dev_name, dev_name(dev))", and mandate
> > that callers pass devices.
>
> From a trace point view 'any old string' will do. There was nothing else the
> trace needed from struct device so I skipped it.

I'd prefer more fine-grained type safety wherever possible.
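
To illustrate the type-safety point, here is a minimal userspace sketch, not the actual tracepoint. The struct device, dev_name(), struct trace_entry, and trace_generic_event() below are hypothetical stand-ins for the kernel definitions; the only idea carried over from the thread is that the name is resolved inside the trace helper, so callers must pass a device rather than an arbitrary string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	const char *init_name;
};

/* Stand-in for the kernel's dev_name() helper. */
static const char *dev_name(const struct device *dev)
{
	return dev->init_name;
}

/* Hypothetical record of what a trace event would capture. */
struct trace_entry {
	char dev_name[32];
	int log_type;
};

/*
 * Takes a device, not a string: the name lookup happens here, so the
 * compiler rejects a caller that passes an arbitrary "const char *".
 */
static void trace_generic_event(struct trace_entry *entry,
				const struct device *dev, int log_type)
{
	assert(entry != NULL && dev != NULL);
	snprintf(entry->dev_name, sizeof(entry->dev_name), "%s", dev_name(dev));
	entry->log_type = log_type;
}
```

With this shape, a call such as trace_generic_event(&entry, "mem0", type) fails to compile cleanly, which is the enforcement being asked for.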

>
> [snip]
>
> > > +
> > > +/**
> > > + * cxl_mem_get_event_records - Get Event Records from the device
> > > + * @cxlds: The device data for the operation
> > > + *
> > > + * Retrieve all event records available on the device and report them as trace
> > > + * events.
> > > + *
> > > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > > + */
> > > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > > +{
> > > + u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > +
> > > + dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> > > +
> > > + if (!cxlds->event_buf) {
> > > + cxlds->event_buf = alloc_event_buf(cxlds);
> > > + if (WARN_ON_ONCE(!cxlds->event_buf))
> > > + return;
> > > + }
> >
> > What's the point of having an event_buf_lock if event_buf is reallocated
> > every call?
>
> This is only called on start up.

cxl_mem_get_event_records() is called all the time. The place to
allocate buffers attached to 'struct cxl_dev_state' at start up is
cxl_dev_state_create(), or sometime after cxl_enumerate_cmds() if you
want to wait and see if the device supports events and CXL _OSC says the
driver can drive events.
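
A rough userspace sketch of that allocation pattern, under stated assumptions: struct dev_state is a trimmed-down, hypothetical stand-in for 'struct cxl_dev_state', the events_supported flag stands in for whatever the command enumeration and _OSC checks report, and calloc()/free() stand in for the kernel allocators. The buffer is allocated once at setup, setup fails if the allocation fails, and the runtime path never allocates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct cxl_dev_state. */
struct dev_state {
	size_t payload_size;
	bool events_supported;
	void *event_buf;	/* allocated once at setup, freed at teardown */
};

/*
 * Setup-time allocation: returns 0 on success, -1 if the buffer cannot
 * be allocated, in which case the caller is expected to fail the load.
 */
static int dev_state_setup_events(struct dev_state *ds)
{
	if (!ds->events_supported)
		return 0;	/* nothing to allocate */

	assert(ds->payload_size > 0);
	ds->event_buf = calloc(1, ds->payload_size);
	return ds->event_buf ? 0 : -1;
}

/* Runtime path: only uses the buffer, never allocates or warns. */
static bool get_event_records(struct dev_state *ds)
{
	if (!ds->event_buf)
		return false;	/* events were never enabled for this device */
	/* ... send the mailbox command with ds->event_buf as the payload ... */
	return true;
}

static void dev_state_teardown_events(struct dev_state *ds)
{
	free(ds->event_buf);
	ds->event_buf = NULL;
}
```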

> > Just allocate event_buf once at the beginning of time during init if the
> > device supports event log retrieval, and fail the driver load if that
> > allocation fails. No runtime WARN() for memory allocation.
>
> It was. I'll make that more clear in the next series.
>
> >
> > I notice this patch does not clear events, I trust that comes later in
> > the series, but I think it belongs here to make this patch a complete
> > standalone thought.
>
> Squashed. But it does make for a large patch. Which I'm not a fan of for
> review. Lucky that now we have a lot of review on the parts.
>
> >
> > > + if (status & CXLDEV_EVENT_STATUS_INFO)
> > > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > > + if (status & CXLDEV_EVENT_STATUS_WARN)
> > > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > > + if (status & CXLDEV_EVENT_STATUS_FAIL)
> > > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > + if (status & CXLDEV_EVENT_STATUS_FATAL)
> > > + cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> >
> > This retrieval order should be flipped. If there is a FATAL pending I
> > expect a monitor wants that first and would be happy to parse the INFO
> > later. I would go so far as to say that if the INFO logger is looping
> > and a FATAL comes in the driver should get that out first before going
> > back for more INFO logs. That would mean executing Clear Events and
> > looping through the logs by priority until all the status bits fall
> > silent inside cxl_mem_get_records_log().
>
> I'll flip them. And determine if this is really what we want to do for the
> irq.
>
> The issue with the irq handling calling a single function which checks all
> status is that we may end up with some odd interrupts doing nothing depending
> on racing etc.

If an event handler wakes and finds no status bits set because another
handler already serviced them, then just return IRQ_HANDLED. You'll have
this problem whether you have a central function or not, because there's
only one status register for multiple sources.
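
As a sketch of how the priority ordering and the zero-status case could fit together, here is a small userspace mock. Everything in it is an assumption made for illustration: the bit values, struct mock_dev, read_status(), drain_log(), and the local enum standing in for the kernel's irqreturn_t. It re-reads the single status register after each log is drained, always services the highest-severity pending log first, and reports IRQ_HANDLED even when another handler has already cleared everything. Unlike the finer-grained suggestion above, it only re-checks between logs, not in the middle of draining one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bits, one per event log. */
#define EVENT_STATUS_INFO	(1u << 0)
#define EVENT_STATUS_WARN	(1u << 1)
#define EVENT_STATUS_FAIL	(1u << 2)
#define EVENT_STATUS_FATAL	(1u << 3)
#define EVENT_STATUS_ALL	0xfu

/* Local stand-in for the kernel's irqreturn_t values. */
enum irq_result { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Mock device: one shared status register plus a record of drain order. */
struct mock_dev {
	uint32_t status;
	uint32_t drained_order[8];
	int nr_drained;
};

static uint32_t read_status(const struct mock_dev *dev)
{
	return dev->status & EVENT_STATUS_ALL;
}

/* Stand-in for retrieving and then clearing every record in one log. */
static void drain_log(struct mock_dev *dev, uint32_t bit)
{
	assert(dev->nr_drained < 8);
	dev->drained_order[dev->nr_drained++] = bit;
	dev->status &= ~bit;
}

static enum irq_result event_irq_handler(struct mock_dev *dev)
{
	static const uint32_t by_priority[] = {
		EVENT_STATUS_FATAL, EVENT_STATUS_FAIL,
		EVENT_STATUS_WARN, EVENT_STATUS_INFO,
	};
	uint32_t status;

	/* Re-read after every log so a newly raised FATAL goes first. */
	while ((status = read_status(dev)) != 0) {
		for (unsigned int i = 0; i < 4; i++) {
			if (status & by_priority[i]) {
				drain_log(dev, by_priority[i]);
				break;
			}
		}
	}

	/* Zero status just means another handler got there first. */
	return IRQ_HANDLED;
}
```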