Re: [PATCH 5/5] thunderbolt: Add support for runtime PM

From: Mika Westerberg
Date: Sat Jul 07 2018 - 10:26:03 EST


On Sat, Jul 07, 2018 at 03:38:15PM +0200, Lukas Wunner wrote:
> On Mon, Jun 18, 2018 at 02:07:31PM +0300, Mika Westerberg wrote:
> > --- a/drivers/thunderbolt/domain.c
> > +++ b/drivers/thunderbolt/domain.c
> > @@ -132,6 +133,8 @@ static ssize_t boot_acl_show(struct device *dev, struct device_attribute *attr,
> > if (!uuids)
> > return -ENOMEM;
> >
> > + pm_runtime_get_sync(&tb->dev);
> > +
> > if (mutex_lock_interruptible(&tb->lock)) {
> > ret = -ERESTARTSYS;
> > goto out;
> [snip]
> > @@ -426,6 +437,13 @@ int tb_domain_add(struct tb *tb)
> > /* This starts event processing */
> > mutex_unlock(&tb->lock);
> >
> > + pm_runtime_no_callbacks(&tb->dev);
> > + pm_runtime_set_active(&tb->dev);
> > + pm_runtime_enable(&tb->dev);
> > + pm_runtime_set_autosuspend_delay(&tb->dev, TB_AUTOSUSPEND_DELAY);
> > + pm_runtime_mark_last_busy(&tb->dev);
> > + pm_runtime_use_autosuspend(&tb->dev);
> > +
> > return 0;
> >
> > err_domain_del:
>
> You're setting pm_runtime_no_callbacks() on the domain. A side effect of
> setting this flag is that whenever the domain's device is runtime resumed,
> its parent (the NHI) is *not* runtime resumed, see this comment in
> rpm_resume():
>
> /*
> * See if we can skip waking up the parent. This is safe only if
> * power.no_callbacks is set, because otherwise we don't know whether
> * the resume will actually succeed.
> */
>
> Above, you're runtime resuming the domain in boot_acl_show(). So if the
> NHI is runtime suspended while that sysfs attribute is accessed, it won't
> be runtime resumed. Is that actually what you want?

No, it should be runtime resumed when the domain is. Looking more closely
at the code in question:

	/*
	 * See if we can skip waking up the parent. This is safe only if
	 * power.no_callbacks is set, because otherwise we don't know whether
	 * the resume will actually succeed.
	 */
	if (dev->power.no_callbacks && !parent && dev->parent) {
		spin_lock_nested(&dev->parent->power.lock, SINGLE_DEPTH_NESTING);
		if (dev->parent->power.disable_depth > 0
		    || dev->parent->power.ignore_children
		    || dev->parent->power.runtime_status == RPM_ACTIVE) {
			atomic_inc(&dev->parent->power.child_count);
			spin_unlock(&dev->parent->power.lock);
			retval = 1;
			goto no_callback;	/* Assume success. */
		}
		spin_unlock(&dev->parent->power.lock);
	}

So waking the parent is skipped only if at least one of the following
conditions is true:

- Parent has runtime PM disabled
- Parent has ignore_children set
- Parent is already resumed

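As far as I can tell, the situation you describe, where the parent is not
runtime resumed when the domain is, cannot occur. If the NHI is runtime
suspended with runtime PM enabled and ignore_children clear, none of the
conditions hold and the shortcut is not taken.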
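
To make the conditions above concrete, here is a minimal userspace sketch of
the check. It is only a model: struct parent_pm_model, the MODEL_RPM_* values
and can_skip_parent_resume() are hypothetical names made up for illustration,
not kernel types or functions, and locking and child_count handling are left
out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the parent's runtime PM state (not kernel types). */
enum rpm_status_model { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct parent_pm_model {
	unsigned int disable_depth;	/* > 0 means runtime PM is disabled */
	bool ignore_children;
	enum rpm_status_model runtime_status;
};

/*
 * Returns true when a no_callbacks child could be resumed without waking
 * the parent, mirroring the three conditions in the quoted rpm_resume()
 * snippet.
 */
static bool can_skip_parent_resume(const struct parent_pm_model *parent)
{
	return parent->disable_depth > 0 ||
	       parent->ignore_children ||
	       parent->runtime_status == MODEL_RPM_ACTIVE;
}
```

With a parent that has runtime PM enabled, ignore_children clear and status
suspended, the function returns false, which is the case Lukas asked about.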

> > @@ -514,6 +532,28 @@ void tb_domain_complete(struct tb *tb)
> > tb->cm_ops->complete(tb);
> > }
> >
> > +int tb_domain_runtime_suspend(struct tb *tb)
> > +{
> > + if (tb->cm_ops->runtime_suspend) {
> > + int ret = tb->cm_ops->runtime_suspend(tb);
> > + if (ret)
> > + return ret;
> > + }
> > + tb_ctl_stop(tb->ctl);
> > + return 0;
> > +}
> > +
> > +int tb_domain_runtime_resume(struct tb *tb)
> > +{
> > + tb_ctl_start(tb->ctl);
> > + if (tb->cm_ops->runtime_resume) {
> > + int ret = tb->cm_ops->runtime_resume(tb);
> > + if (ret)
> > + return ret;
> > + }
> > + return 0;
> > +}
> > +
> > /**
> > * tb_domain_approve_switch() - Approve switch
> > * @tb: Domain the switch belongs to
> > --- a/drivers/thunderbolt/nhi.c
> > +++ b/drivers/thunderbolt/nhi.c
> > @@ -900,7 +900,32 @@ static void nhi_complete(struct device *dev)
> > struct pci_dev *pdev = to_pci_dev(dev);
> > struct tb *tb = pci_get_drvdata(pdev);
> >
> > - tb_domain_complete(tb);
> > + /*
> > + * If we were runtime suspended when system suspend started,
> > + * schedule runtime resume now. It should bring the domain back
> > + * to functional state.
> > + */
> > + if (pm_runtime_suspended(&pdev->dev))
> > + pm_runtime_resume(&pdev->dev);
> > + else
> > + tb_domain_complete(tb);
> > +}
> > +
> > +static int nhi_runtime_suspend(struct device *dev)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct tb *tb = pci_get_drvdata(pdev);
> > +
> > + return tb_domain_runtime_suspend(tb);
> > +}
> > +
> > +static int nhi_runtime_resume(struct device *dev)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct tb *tb = pci_get_drvdata(pdev);
> > +
> > + nhi_enable_int_throttling(tb->nhi);
> > + return tb_domain_runtime_resume(tb);
> > }
>
> You're invoking tb_domain_runtime_suspend() from nhi_runtime_suspend(),
> same for ->runtime_resume.
>
> Wouldn't it make more sense to make tb_domain_runtime_suspend() the
> ->runtime_suspend callback of the domain instead of mixing it together
> with NHI runtime suspend?

You mean let the PM core handle this for the domain? Maybe, but currently we
do the same for the other callbacks as well, so this just follows that.

> BTW, what's the purpose of nhi_enable_int_throttling()?

It changes how fast interrupts get delivered and when throttling starts.
It is mostly needed for P2P functionality (but should not do any harm to
control channel traffic). See also 8c6bba10fb92 ("thunderbolt: Configure
interrupt throttling for all interrupts").

> > --- a/drivers/thunderbolt/switch.c
> > +++ b/drivers/thunderbolt/switch.c
> > +/*
> > + * Currently only need to provide the callbacks. Everything else is handled
> > + * in the connection manager.
> > + */
> > +static int __maybe_unused tb_switch_runtime_suspend(struct device *dev)
> > +{
> > + return 0;
> > +}
> > +
> > +static int __maybe_unused tb_switch_runtime_resume(struct device *dev)
> > +{
> > + return 0;
> > +}
> > +
> > +static const struct dev_pm_ops tb_switch_pm_ops = {
> > + SET_RUNTIME_PM_OPS(tb_switch_runtime_suspend, tb_switch_runtime_resume,
> > + NULL)
> > +};
> > +
> > struct device_type tb_switch_type = {
> > .name = "thunderbolt_device",
> > .release = tb_switch_release,
> > + .pm = &tb_switch_pm_ops,
> > };
>
> Looking at the call sites of RPM_GET_CALLBACK(), I'm under the impression
> that if no callbacks are defined, the PM core will simply assume success.
> Then you don't need to define any PM callbacks for tb_switch. Am I missing
> something?

If you don't define them, RPM_GET_CALLBACK() returns NULL and the subsequent
call to rpm_callback(NULL, dev) returns -ENOSYS, which is a failure.
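
As a rough illustration of that behaviour, the following userspace sketch
models only the NULL check described above. struct device_model,
rpm_cb_model, rpm_callback_model() and noop_runtime_resume() are hypothetical
names for this example; the real rpm_callback() does more than this.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; the model does not use its fields. */
struct device_model { int unused; };

typedef int (*rpm_cb_model)(struct device_model *);

/* Models the behaviour described above: a missing callback is an error. */
static int rpm_callback_model(rpm_cb_model cb, struct device_model *dev)
{
	if (!cb)
		return -ENOSYS;
	return cb(dev);
}

/* Empty callback in the style of tb_switch_runtime_resume() from the patch. */
static int noop_runtime_resume(struct device_model *dev)
{
	(void)dev;
	return 0;
}
```

Passing NULL yields -ENOSYS, while passing the empty callback yields 0, which
is why the patch provides callbacks that simply return 0.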