Re: [PATCH V11 5/8] cxl/port: Read CDAT table

From: Ira Weiny
Date: Tue Jun 21 2022 - 17:24:07 EST


On Fri, Jun 17, 2022 at 05:43:34PM -0700, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> >

[snip]

> > +
> > +/**
> > + * read_cdat_data - Read the CDAT data on this port
> > + * @port: Port to read data from
> > + *
> > + * This call will sleep waiting for responses from the DOE mailbox.
> > + */
> > +void read_cdat_data(struct cxl_port *port)
> > +{
> > + static struct pci_doe_mb *cdat_mb;
> > + struct device *dev = &port->dev;
> > + struct device *uport = port->uport;
> > + size_t cdat_length;
> > + int ret;
> > +
> > + /*
> > + * Ensure a reference on the underlying uport device which has the
> > + * mailboxes in it
> > + */
> > + get_device(uport);
>
> I had written up a long description grumbling about "just in case"
> acquistions of device references, but then I lost that draft.
>
> So I'll do the shorter response, but give you more homework in the
> process. How / Why is that get_device() needed? What are the guarantees that
> ensure you that the last reference has not been dropped just before that
> call? Hint: what context is this code running?

I'll check it out. I suspect there is some reference on uport already taken
such that if port is valid uport must be valid.

>
> > +
> > + cdat_mb = find_cdat_mb(uport);
> > + if (!cdat_mb) {
> > + dev_dbg(dev, "No CDAT mailbox\n");
> > + goto out;
> > + }
> > +
> > + if (cxl_cdat_get_length(dev, cdat_mb, &cdat_length)) {
> > + dev_dbg(dev, "No CDAT length\n");
> > + goto out;
> > + }
> > +
> > + port->cdat.table = devm_kzalloc(dev, cdat_length, GFP_KERNEL);
> > + if (!port->cdat.table)
> > + goto out;
> > +
> > + port->cdat.length = cdat_length;
> > + ret = cxl_cdat_read_table(dev, cdat_mb, &port->cdat);
> > + if (ret) {
> > + /* Don't leave table data allocated on error */
> > + devm_kfree(dev, port->cdat.table);
> > + port->cdat.table = NULL;
> > + port->cdat.length = 0;
> > + dev_err(dev, "CDAT data read error\n");
>
> Rather than a chatty / ephemeral error message I think this wants some
> indication in userspace, likely the 0-length CDAT binary attribute, so
> that userspace can debug why the kernel is picking sub-optimal QTG ids
> for newly provisioned CXL regions.

I thought we agreed that 0-length or CDAT query failure would result in no
sysfs entry?

This message was to alert that a CDAT query was attempted but the read failed
vs finding no mailbox with CDAT capabilities for example.

[snip]

> >
> > +static ssize_t cdat_read(struct file *filp, struct kobject *kobj,
> > + struct bin_attribute *bin_attr, char *buf,
> > + loff_t offset, size_t count)
> > +{
> > + struct device *dev = kobj_to_dev(kobj);
> > + struct cxl_port *port = to_cxl_port(dev);
> > +
> > + if (!port->cdat.table)
> > + return 0;
> > +
> > + return memory_read_from_buffer(buf, count, &offset,
> > + port->cdat.table,
> > + port->cdat.length);
> > +}
> > +
> > +static BIN_ATTR_RO(cdat, 0);
>
> This should be BIN_ATTR_ADMIN_RO(), see:
>
> 3022c6a1b4b7 driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}

Are you suggesting I add BIN_ATTR_ADMIN_* macros?

>
> > +
> > +static umode_t cxl_port_bin_attr_is_visible(struct kobject *kobj,
> > + struct bin_attribute *attr, int i)
> > +{
> > + struct device *dev = kobj_to_dev(kobj);
> > + struct cxl_port *port = to_cxl_port(dev);
> > +
> > + if ((attr == &bin_attr_cdat) && port->cdat.table)
> > + return 0400;
>
> Per above change you only need to manage visibility and not permissions,

But the permissions indicate visibility (In the kdoc for struct
attribute_group).


* ... Must
* return 0 if a binary attribute is not visible. The returned
* value will replace static permissions defined in
* struct bin_attribute.

And the value returned overrides the mode.

fs/sysfs/group.c:

create_files()

82 if (grp->is_bin_visible) {
83 mode = grp->is_bin_visible(kobj, *bin_attr, i);
84 if (!mode)
85 continue;
86 }
87
88 WARN(mode & ~(SYSFS_PREALLOC | 0664),
89 "Attribute %s: Invalid permissions 0%o\n",
90 (*bin_attr)->attr.name, mode);
91
92 mode &= SYSFS_PREALLOC | 0664;


So I'm willing to add the macro but I'm not sure it is going to change anything
in this case. I think to make those _ADMIN_ macros work with is_visible()
create_files() needs to be changed. :-/ I'm not sure if the addition of
DEVICE_ATTR_ADMIN_{RO,RW} intended for is_visible() to be able to override the
mode?

> also per the comment about the error message in the CDAT table read case
> I think it makes sense to show an empty attribute. Only if the device
> does not claim to have a CDAT should the attribute not be visible.

Ok I can do that.

Thanks for the review,
Ira

>
> > +
> > + return 0;
> > +}
> > +
> > +static struct bin_attribute *cxl_cdat_bin_attributes[] = {
> > + &bin_attr_cdat,
> > + NULL,
> > +};
> > +
> > +static struct attribute_group cxl_cdat_attribute_group = {
> > + .name = "CDAT",
> > + .bin_attrs = cxl_cdat_bin_attributes,
> > + .is_bin_visible = cxl_port_bin_attr_is_visible,
> > +};
> > +
> > +static const struct attribute_group *cxl_port_attribute_groups[] = {
> > + &cxl_cdat_attribute_group,
> > + NULL,
> > +};
> > +
> > static struct cxl_driver cxl_port_driver = {
> > .name = "cxl_port",
> > .probe = cxl_port_probe,
> > .id = CXL_DEVICE_PORT,
> > + .drv = {
> > + .dev_groups = cxl_port_attribute_groups,
> > + },
> > };
> >
> > module_cxl_driver(cxl_port_driver);
> > --
> > 2.35.1
> >
>
>