Re: [RFC PATCH v2 2/2] regulator: core: Avoid lockdep reports when resolving supplies

From: Stephen Boyd
Date: Fri Apr 07 2023 - 17:46:28 EST


Quoting Douglas Anderson (2023-03-29 14:33:54)
> diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
> index 9a13240f3084..08726bc0da9d 100644
> --- a/drivers/regulator/core.c
> +++ b/drivers/regulator/core.c
> @@ -207,6 +207,78 @@ static void regulator_unlock(struct regulator_dev *rdev)
> mutex_unlock(&regulator_nesting_mutex);
> }
>
> +/**
> + * regulator_lock_two - lock two regulators
> + * @rdev1: first regulator
> + * @rdev2: second regulator
> + * @ww_ctx: w/w mutex acquire context
> + *
> + * Locks both rdevs using the regulator_ww_class.
> + */
> +static void regulator_lock_two(struct regulator_dev *rdev1,
> + struct regulator_dev *rdev2,
> + struct ww_acquire_ctx *ww_ctx)
> +{
> + struct regulator_dev *tmp;
> + int ret;
> +
> + ww_acquire_init(ww_ctx, &regulator_ww_class);
> +
> + /* Try to just grab both of them */
> + ret = regulator_lock_nested(rdev1, ww_ctx);
> + WARN_ON(ret);
> + ret = regulator_lock_nested(rdev2, ww_ctx);
> + if (ret != -EDEADLOCK) {
> + WARN_ON(ret);
> + goto exit;
> + }

I think this would be clearer if we had two local variable pointers

struct regulator_dev *held, *contended;

held = rdev1;
contended = rdev2;

> +
> + while (true) {
> + /*
> + * Start of loop: rdev1 was locked and rdev2 was contended.
> + * Need to unlock rdev1, slowly lock rdev2, then try rdev1
> + * again.
> + */
> + regulator_unlock(rdev1);

regulator_unlock(held);

> +
> + ww_mutex_lock_slow(&rdev2->mutex, ww_ctx);
> + rdev2->ref_cnt++;
> + rdev2->mutex_owner = current;
> + ret = regulator_lock_nested(rdev1, ww_ctx);

ww_mutex_lock_slow(&contended->mutex, ww_ctx);
contended->ref_cnt++;
contended->mutex_owner = current;
swap(held, contended);
ret = regulator_lock_nested(contended, ww_ctx);
if (ret != -EDEADLOCK) {

> + WARN_ON(ret);
> + break;
> + }
> + }
> +
> +exit:
> + ww_acquire_done(ww_ctx);
> +}
> +
> @@ -1627,8 +1699,8 @@ static int set_machine_constraints(struct regulator_dev *rdev)
>
> /**
> * set_supply - set regulator supply regulator
> - * @rdev: regulator name

It certainly wasn't the name :)

> - * @supply_rdev: supply regulator name
> + * @rdev: regulator (locked)
> + * @supply_rdev: supply regulator (locked))
> *
> * Called by platform initialisation code to set the supply regulator for this
> * regulator. This ensures that a regulators supply will also be enabled by the
[...]
> @@ -2190,7 +2263,9 @@ struct regulator *_regulator_get(struct device *dev, const char *id,
> return regulator;
> }
>
> + regulator_lock(rdev);
> regulator = create_regulator(rdev, dev, id);
> + regulator_unlock(rdev);

I'm sad that we're now locking the entire time create_regulator() is
called. Can that be avoided? I see that create_regulator() publishes the
consumer on the consumer_list, but otherwise I don't think it needs to
hold the regulator lock. It goes on to call debugfs code after
allocating memory. After this patch, we're going to be holding the lock
for that regulator across debugfs APIs. I suspect that may lead to more
problems later on because the time we hold the lock is extremely wide
now.

Of course, we were already holding the child regulator's lock for the
supply, because that's what this patch is fixing in
regulator_resolve_supply(). I'm just nervous that we're holding the lock
for a much wider time now. Maybe we can have create_regulator() return
the regulator and add a new function like add_regulator_consumer() that
does the list modification? Then we can make create_regulator() do
everything without holding a lock and have a very short time where the
new function locks two regulator locks and does the linkage.