Re: [PATCH v7 04/10] drivers: qcom: rpmh: add RPMH helper functions

From: Lina Iyer
Date: Fri May 11 2018 - 11:06:19 EST


Hi Doug,

On Thu, May 10 2018 at 16:37 -0600, Doug Anderson wrote:
Hi,

On Tue, May 8, 2018 at 9:05 AM, <ilina@xxxxxxxxxxxxxx> wrote:
On 2018-05-03 14:26, Doug Anderson wrote:
Hi Doug,


Hi,

On Wed, May 2, 2018 at 12:37 PM, Lina Iyer <ilina@xxxxxxxxxxxxxx> wrote:

+static struct rpmh_ctrlr rpmh_rsc[RPMH_MAX_CTRLR];
+static DEFINE_SPINLOCK(rpmh_rsc_lock);
+
+static struct rpmh_ctrlr *get_rpmh_ctrlr(const struct device *dev)
+{
+ int i;
+ struct rsc_drv *p, *drv = dev_get_drvdata(dev->parent);
+ struct rpmh_ctrlr *ctrlr = ERR_PTR(-EINVAL);
+ unsigned long flags;
+
+ if (!drv)
+ return ctrlr;
+
+ for (i = 0; i < RPMH_MAX_CTRLR; i++) {
+ if (rpmh_rsc[i].drv == drv) {
+ ctrlr = &rpmh_rsc[i];
+ return ctrlr;
+ }
+ }
+
+ spin_lock_irqsave(&rpmh_rsc_lock, flags);
+ list_for_each_entry(p, &rsc_drv_list, list) {
+ if (drv == p) {
+ for (i = 0; i < RPMH_MAX_CTRLR; i++) {
+ if (!rpmh_rsc[i].drv)
+ break;
+ }
+ if (i == RPMH_MAX_CTRLR) {
+ ctrlr = ERR_PTR(-ENOMEM);
+ break;
+ }
+ rpmh_rsc[i].drv = drv;
+ ctrlr = &rpmh_rsc[i];
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&rpmh_rsc_lock, flags);


I may have missed something, but to me it appears that this whole
"rsc_drv_list" is pretty pointless. I wrote up a patch atop your
series to remove it at

<https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1042883/>
and it simplifies the code a whole bunch. From that patch, my
justification was:

The global rsc_drv_list was (as far as I can tell) racy and not useful
for anything.

I say it is racy because in general you need some sort of mutual
exclusion for lists. If someone is adding to a list while someone
else is iterating over it then you get badness.

I say it is not useful because the only user of it was
get_rpmh_ctrlr() and the only thing it did was to verify that the
"struct rsc_drv *" that it alrady had was in the list. How could it
not be?


Note that in v7 of your series you added a spinlock around your access
of "rsc_drv_list", but this doesn't actually remove the race.
Specifically I'm pretty sure that the list primitives don't support
calling list_add() while someone might be iterating over the list and
your spinlock isn't grabbed in rpmh_rsc_probe().

Note that I also say in my patch:

NOTE: After this patch get_rpmh_ctrlr() still seems a bit fishy. I'm
not sure why every caller would need its own private global cache of
stuff. ...but I left that part alone.


I am not sure I understand this.

As I've said I haven't reviewed RPMh in any amount of detail and so
perhaps I don't understand something.

OK, I dug a little more and coded up something for you. Basically
you're doing a whole bunch of iteration / extra work here to try to
come up with a way to associate an extra bit of data with each "struct
rsc_drv". Rather than that, just add an extra field into "struct
rsc_drv". Problem solved.

See http://crosreview.com/1054646 for what I mean.

I tried to avoid such pointer references and keep it object oriented
with this approach. I agree that we run through a list of 2 (at the max)
RSC to get the drv* from the rpmh_ctrlr. It is not going to be
expensive.

Another things this helps with is that, if the driver is not a child of
the RSC nodes in DT, then the drvdata of the parent would not a RSC node
and accessing that would result in a crash. This offers a cleaner exit
path for the error case.


I'll try to dig into this more so I could just be confused, but in
general it seems really odd to have a spinlock and something called a
"cache" at this level. If we need some sort of mutual exclusion or
caching it seems like it should be stored in memory directly
associated with the RPMh device, not some external global.

The idea behind the locking is not to avoid the race between rpmh.c and
rpmh-rsc.c. From the DT, the devices that are dependent on the RSCs are
probed following the probe of the controller. And init is not that we are
worried about.
The condition here is to prevent the rpmh_rsc[] from being modified
concurrently by drivers.

OK, I see the point of the locking now, but not the list still.
Sounds like Matthias agrees with me that the list isn't useful. Seems
like you should squash my patch at http://crosreview.com/1042883 into
yours.

I saw your approach. I am okay with it for your tree, my approach comes
out of experiences in qcom platforms and how things tend to shape up in
the future. I would want you to consider my reasoning as well, before we
go forward.

Thanks,
Lina