Re: [PATCH] pmdomain: mediatek: fix race conditions with genpd

From: Ulf Hansson
Date: Tue Jan 23 2024 - 07:23:00 EST


On Mon, 25 Dec 2023 at 14:36, Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx> wrote:
>
> If the power domains are registered first with genpd and *after that*
> the driver attempts to power them on in the probe sequence, then it is
> possible that a race condition occurs if genpd tries to power them on
> in the same time.
> The same is valid for powering them off before unregistering them
> from genpd.

Right. When the PM domain has been registered with genpd, attempts to
power-on/off the PM domain need to be synchronized with genpd.

> Attempt to fix race conditions by first removing the domains from genpd
> and *after that* powering down domains.
> Also first power up the domains and *after that* register them
> to genpd.

This seems like a reasonable approach to me.

>
> Fixes: 59b644b01cf4 ("soc: mediatek: Add MediaTek SCPSYS power domains")
> Signed-off-by: Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx>

Applied for fixes and by adding a stable tag, thanks! Although,
please, see some more comments below.

> ---
>
> This comes as another way to fix the problem as described in this thread:
> https://lore.kernel.org/linux-arm-kernel/20231129113120.4907-1-eugen.hristev@xxxxxxxxxxxxx/
>
> I have not been able to reproduce the problem with either fix anymore
> (so far).
>
> I have a few doubts about this one though, if I really covered the
> way it's supposed to work, and registering the pmdomains in the recursive
> function in the reversed order has any side effect or if it does not
> work correctly.
> Tested on mt8186 where it appears to be fine.

I had a quick look at the code in the driver and a few things caught my eyes.

*) The error path in scpsys_probe() doesn't seem to handle removal of
the link between parent/child-domains (subdomains).
**) An option that might simplify the code and error path too, could
be to convert into using of_genpd_add|remove_subdomain() in favor or
pm_genpd_add|remove_subdomain().

Kind regards
Uffe


>
> Eugen
>
> drivers/pmdomain/mediatek/mtk-pm-domains.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c
> index e26dc17d07ad..e274e3315fe7 100644
> --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c
> +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c
> @@ -561,6 +561,11 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren
> goto err_put_node;
> }
>
> + /* recursive call to add all subdomains */
> + ret = scpsys_add_subdomain(scpsys, child);
> + if (ret)
> + goto err_put_node;
> +
> ret = pm_genpd_add_subdomain(parent_pd, child_pd);
> if (ret) {
> dev_err(scpsys->dev, "failed to add %s subdomain to parent %s\n",
> @@ -570,11 +575,6 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren
> dev_dbg(scpsys->dev, "%s add subdomain: %s\n", parent_pd->name,
> child_pd->name);
> }
> -
> - /* recursive call to add all subdomains */
> - ret = scpsys_add_subdomain(scpsys, child);
> - if (ret)
> - goto err_put_node;
> }
>
> return 0;
> @@ -588,9 +588,6 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd)
> {
> int ret;
>
> - if (scpsys_domain_is_on(pd))
> - scpsys_power_off(&pd->genpd);
> -
> /*
> * We're in the error cleanup already, so we only complain,
> * but won't emit another error on top of the original one.
> @@ -600,6 +597,8 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd)
> dev_err(pd->scpsys->dev,
> "failed to remove domain '%s' : %d - state may be inconsistent\n",
> pd->genpd.name, ret);
> + if (scpsys_domain_is_on(pd))
> + scpsys_power_off(&pd->genpd);
>
> clk_bulk_put(pd->num_clks, pd->clks);
> clk_bulk_put(pd->num_subsys_clks, pd->subsys_clks);
> --
> 2.34.1
>
>