Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism

From: Bjorn Helgaas
Date: Mon Jun 21 2021 - 09:01:43 EST


On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > Add reset_method sysfs attribute to enable user to
> > > query and set user preferred device reset methods and
> > > their ordering.

> > > + if (sysfs_streq(options, "default")) {
> > > + for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > + reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > + goto set_reset_methods;
> > > + }
> >
> > If you use pci_init_reset_methods() here, you can also get this case
> > out of the way early.
> >
> The problem with alternate encoding is we won't be able to know if
> one of the reset methods was disabled previously. For example,
>
> # cat reset_methods
> flr,bus # dev->reset_methods = [3, 5, 0, ...]
> # echo bus > reset_methods # dev->reset_methods = [5, 0, 0, ...]
> # cat reset_methods
> bus
>
> Now if an user wants to enable flr
>
> # echo flr > reset_methods # dev->reset_methods = [3, 0, 0, ...]
> OR
> # echo bus,flr > reset_methods # dev->reset_methods = [5, 3, 0, ...]
>
> either they need to write "default" first then flr or we will need to
> reprobe reset methods each time when user writes to reset_method attribute.

Not sure I completely understand the problem here. I think relying on
previous state that is invisible to the user is a little problematic
because it's hard for the user to predict what will happen.

If the user enables a method that was previously "disabled" because
the probe failed, won't the reset method itself just fail with
-ENOTTY? Is that a problem?

> > > + while ((name = strsep(&options, ",")) != NULL) {
> > > + if (sysfs_streq(name, ""))
> > > + continue;
> > > +
> > > + name = strim(name);
> > > +
> > > + for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > + if (reset_methods[i] &&
> > > + sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > + reset_methods[i] = prio--;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + if (i == PCI_RESET_METHODS_NUM) {
> > > + kfree(options);
> > > + return -EINVAL;
> > > + }
> > > + }
> > > +
> > > + if (reset_methods[0] &&
> > > + reset_methods[0] != PCI_RESET_METHODS_NUM)
> > > + pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
> >
> > Is there a specific reason for this warning? Is it just telling the
> > user that he might have shot himself in the foot? Not sure that's
> > necessary.
> >
> I think generally presence of device specific reset method means other
> methods are potentially broken. Is it okay to skip this?

We might want a warning at reset-time if all the methods failed,
because that means we may leak state between users. Maybe we also
want one here, if *all* reset methods are disabled. I don't really
like special treatment of device-specific methods here because it
depends on the assumption that "device-specific means all other resets
are broken." That's hard to maintain.

Bjorn