[PATCH] pmdomain: mediatek: fix race conditions with genpd

From: Eugen Hristev
Date: Mon Dec 25 2023 - 08:36:40 EST


If the power domains are registered first with genpd and *after that*
the driver attempts to power them on in the probe sequence, then it is
possible that a race condition occurs if genpd tries to power them on
in the same time.
The same is valid for powering them off before unregistering them
from genpd.
Attempt to fix race conditions by first removing the domains from genpd
and *after that* powering down domains.
Also first power up the domains and *after that* register them
to genpd.

Fixes: 59b644b01cf4 ("soc: mediatek: Add MediaTek SCPSYS power domains")
Signed-off-by: Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx>
---

This comes as another way to fix the problem as described in this thread:
https://lore.kernel.org/linux-arm-kernel/20231129113120.4907-1-eugen.hristev@xxxxxxxxxxxxx/

I have not been able to reproduce the problem with either fix anymore
(so far).

I have a few doubts about this one though, if I really covered the
way it's supposed to work, and registering the pmdomains in the recursive
function in the reversed order has any side effect or if it does not
work correctly.
Tested on mt8186 where it appears to be fine.

Eugen

drivers/pmdomain/mediatek/mtk-pm-domains.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c
index e26dc17d07ad..e274e3315fe7 100644
--- a/drivers/pmdomain/mediatek/mtk-pm-domains.c
+++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c
@@ -561,6 +561,11 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren
goto err_put_node;
}

+ /* recursive call to add all subdomains */
+ ret = scpsys_add_subdomain(scpsys, child);
+ if (ret)
+ goto err_put_node;
+
ret = pm_genpd_add_subdomain(parent_pd, child_pd);
if (ret) {
dev_err(scpsys->dev, "failed to add %s subdomain to parent %s\n",
@@ -570,11 +575,6 @@ static int scpsys_add_subdomain(struct scpsys *scpsys, struct device_node *paren
dev_dbg(scpsys->dev, "%s add subdomain: %s\n", parent_pd->name,
child_pd->name);
}
-
- /* recursive call to add all subdomains */
- ret = scpsys_add_subdomain(scpsys, child);
- if (ret)
- goto err_put_node;
}

return 0;
@@ -588,9 +588,6 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd)
{
int ret;

- if (scpsys_domain_is_on(pd))
- scpsys_power_off(&pd->genpd);
-
/*
* We're in the error cleanup already, so we only complain,
* but won't emit another error on top of the original one.
@@ -600,6 +597,8 @@ static void scpsys_remove_one_domain(struct scpsys_domain *pd)
dev_err(pd->scpsys->dev,
"failed to remove domain '%s' : %d - state may be inconsistent\n",
pd->genpd.name, ret);
+ if (scpsys_domain_is_on(pd))
+ scpsys_power_off(&pd->genpd);

clk_bulk_put(pd->num_clks, pd->clks);
clk_bulk_put(pd->num_subsys_clks, pd->subsys_clks);
--
2.34.1