Re: [PATCH net-next 0/2] net: dsa: realtek: fix PHY register read corruption

From: Alvin Šipraga
Date: Thu Feb 17 2022 - 02:53:27 EST


Luiz Angelo Daros de Luca <luizluca@xxxxxxxxx> writes:

>> > I still feel like we are trying to go around a regmap limitation
>> > instead of fixing it there. If we control regmap lock (we can define a
>> > custom lock/unlock) and create new regmap_{read,write}_nolock
>> > variants, we'll just need to lock the regmap, do whatever you need,
>> > and unlock it.
>>
>> Can you show me what those regmap_{read,write}_nolock variants would
>> look like in your example? And what about the other regmap_ APIs we use,
>> like regmap_read_poll_timeout, regmap_update_bits, etc. - do you propose
>> to reimplement all of these?
>
> The option of having two regmaps is a nice way to have "_nolock"
> variants for free. It is much cleaner than any solutions I imagined!
> Ayway, I don't believe the regmap API expects to have an evil
> non-locked clone. It looks like it is being abused.
>
> What regmap API misses is a way to create a "transaction". Mdio, for
> example, expects the user to lock the bus before doing a series of
> accesses while regmap api assumes a single atomic access is enough.
> However, Realtek indirect register access shows that it is not enough.
> We could reimplement a mutex for every case where two calls might
> share the same register (or indirectly affect others like we saw with
> Realtek) but I believe a shared solution would be better, even if it
> costs a couple more wrap functions.
>
> It would be even nicer if we have a regmap "manual lock" mode that
> will expose the lock/unlock functions but it will never call them by
> itself. It would work if it could check if the caller is actually the
> same thread/context that locked it. However I doubt there is a clean
> solution in a kernel code that can check if the lock was acquired by
> the same context that is calling the read.

I went through all of this while preparing the patch, so your arguments
are familiar to me ;-)

What I sent was the cleanest solution I could eventually think of. I
don't think it is foul play, but I agree it is a bit funny to have this
kind of "shadow regmap". However, the interface is quite safe, and as I
implied in the commit message, quite foolproof as well.

Basically, rather than reimplementing every regmap API that I want to
use while manually taking the lock, I just use another regmap with
locking disabled. It boils down to exactly the same thing.

>
>
>> > BTW, I believe that, for realtek-mdio, a regmap custom lock mechanism
>> > could simply use mdio lock while realtek-smi already has priv->lock.
>>
>> Hmm OK. Actually I'm a bit confused about the mdio_lock: can you explain
>> what it's guarding against, for someone unfamiliar with MDIO? Currently
>> realtek-mdio's regmap has an additional lock around it (disable_locking
>> is 0), so with these patches applied the number of locks remains the
>> same.
>
> Today we already have to redundants locks (mdio and regmap). Your
> patch is just replacing the regmap lock.

Is that so? Andrew seems to imply that you shouldn't be using the
mdio_lock like this, but only for per-register access, and then
implement your own higher level lock:

> For PHYs this is sufficient. For switches, sometimes you need
> additional protection. The granularity of an access might not be a
> single register read or a write. It could be you need to read or write
> a few registers in an atomic way. If that is the case, you need a lock
> at a higher level.

It seems to me like you should have used mdiobus_{read,write} or even
mdiobus_{read,write}_nested? Although the state of the art in other DSA
drivers seems like a mixed bag, so I don't know.

Since I do not have an MDIO switch in front of me to test with, and
since the existing MDIO code looks a little suspect, again I would
prefer to stick in my lane and just fix the problem without
refactoring.

>
> regmap_read is something like this:
>
> regmap_read
> lock regmap
> realtek_mdio_read()
> lock mdio
> ...
> unlock mdio
> unlock regmap
>
> If you are implementing a custom lock, simply use mdio lock directly.
>
> And the map_nolock you created does not mean "access without locks"
> but "you must lock it yourself before using anything here". If that
> lock is actually mdio_lock, it would be ok to remove the lock inside
> realtek_mdio_{read,write}. You just need a reference to those
> lock/unlock functions in realtek_priv.
>
>> priv->lock is a spinlock which is inappropriate here. I'm not really
>> sure what the point of it is, besides to handle unlocked calls to the
>> _noack function. It might be removable altogether but I would prefer not
>> to touch it for this series.
>
> If spinlock is inappropriate, it can be easily converted to a mutex.
> Everything else from realtek-mdio might apply.

Well, this is a bugfix series, not a refactoring. I am not adding more
locks than were here before. If I start touching old code (this spinlock
predates my engagement with this driver), I will have to answer to that
in the commit message too. If we want to do this, let's do it after the
bugfix has been reviewed and merged. It will be easier to justify as
well.

Kind regards,
Alvin

>
>> Kind regards,
>> Alvin