Re: [regression 4.20-rc1, bisected] cpufreq fails to register on db820c

From: Srinivas Kandagatla
Date: Tue Nov 06 2018 - 09:38:03 EST


Hi Niklas,

thanks for reporting this..

I found the issue, DT support seems to be totally broken in NVMEM.
Will send this fix once you can confirm it works for you!

Can you try this patch!
---------------------->cut<--------------------------

Author: Srinivas Kandagatla <srinivas.kandagatla@xxxxxxxxxx>
Date: Tue Nov 6 14:32:30 2018 +0000

nvmem: core: fix of_nvmem_cell_get()

NVMEM DT support seems to be totally broken after
e888d445ac33 ("nvmem: resolve cells from DT at registration time")
Fix this!

Fixes: e888d445ac33 ("nvmem: resolve cells from DT at registration time")
Reported-by: Niklas Cassel <niklas.cassel@xxxxxxxxxx>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@xxxxxxxxxx>

diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 9b18ce90f907..7ae8072f56ae 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -44,6 +44,7 @@ struct nvmem_cell {
int bytes;
int bit_offset;
int nbits;
+ struct device_node *np;
struct nvmem_device *nvmem;
struct list_head node;
};
@@ -530,6 +531,7 @@ static int nvmem_add_cells_from_of(struct nvmem_device *nvmem)
return -ENOMEM;

cell->nvmem = nvmem;
+ cell->np = child;
cell->offset = be32_to_cpup(addr++);
cell->bytes = be32_to_cpup(addr);
cell->name = kasprintf(GFP_KERNEL, "%pOFn", child);
@@ -960,14 +962,13 @@ nvmem_cell_get_from_lookup(struct device *dev, const char *con_id)

#if IS_ENABLED(CONFIG_OF)
static struct nvmem_cell *
-nvmem_find_cell_by_index(struct nvmem_device *nvmem, int index)
+nvmem_find_cell_by_node(struct nvmem_device *nvmem, struct device_node *np)
{
struct nvmem_cell *cell = NULL;
- int i = 0;

mutex_lock(&nvmem_mutex);
list_for_each_entry(cell, &nvmem->cells, node) {
- if (index == i++)
+ if (np == cell->np)
break;
}
mutex_unlock(&nvmem_mutex);
@@ -1001,7 +1002,6 @@ struct nvmem_cell *of_nvmem_cell_get(struct device_node *np, const char *id)
cell_np = of_parse_phandle(np, "nvmem-cells", index);
if (!cell_np)
return ERR_PTR(-EINVAL);
-
nvmem_np = of_get_next_parent(cell_np);
if (!nvmem_np)
return ERR_PTR(-EINVAL);
@@ -1011,7 +1011,7 @@ struct nvmem_cell *of_nvmem_cell_get(struct device_node *np, const char *id)
if (IS_ERR(nvmem))
return ERR_CAST(nvmem);

- cell = nvmem_find_cell_by_index(nvmem, index);
+ cell = nvmem_find_cell_by_node(nvmem, cell_np);
if (!cell) {
__nvmem_device_put(nvmem);
return ERR_PTR(-ENOENT);

---------------------->cut<--------------------------
On 06/11/18 01:29, Niklas Cassel wrote:
Hello Bartosz,

I've run into a regression on v4.20-rc1, where cpufreq on dragonboard 820c
fails to register.

I've bisected the issue to:
e888d445ac33 ("nvmem: resolve cells from DT at registration time")

If I revert that patch, cpufreq starts working on dragonboard 820c again.


Before your patch:
[ 5.052985] qcom_cpufreq_kryo_probe:110 speedbin_nvmem: ffff8000d4d51e00
[ 5.053133] qcom_cpufreq_kryo_probe:120 speedbin: ffff8000d4d51d00
[ 5.058895] qcom_cpufreq_kryo_probe:121 *speedbin: 0


After your patch:
[ 4.791664] qcom_cpufreq_kryo_probe:110 speedbin_nvmem: ffff8000d472ed00
[ 4.791815] qcom_cpufreq_kryo_probe:120 speedbin: ffff8000d451b300
[ 4.797575] qcom_cpufreq_kryo_probe:121 *speedbin: 12


The prints are debug prints that I've added locally:

--- a/drivers/cpufreq/qcom-cpufreq-kryo.c
+++ b/drivers/cpufreq/qcom-cpufreq-kryo.c
@@ -107,6 +107,7 @@ static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
}
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
+ pr_err("%s:%d speedbin_nvmem: %lx\n", __func__, __LINE__, speedbin_nvmem);
of_node_put(np);
if (IS_ERR(speedbin_nvmem)) {
if (PTR_ERR(speedbin_nvmem) != -EPROBE_DEFER)
@@ -116,6 +117,8 @@ static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
}
speedbin = nvmem_cell_read(speedbin_nvmem, &len);
+ pr_err("%s:%d speedbin: %lx\n", __func__, __LINE__, speedbin);
+ pr_err("%s:%d *speedbin: %u\n", __func__, __LINE__, (unsigned int)(*speedbin));
nvmem_cell_put(speedbin_nvmem);
if (IS_ERR(speedbin))
return PTR_ERR(speedbin);


So it appears that the read from nvmem_cell_read()
gets another value after your commit.

This value is then given to dev_pm_opp_set_supported_hw(),
however, since this new value (12) from nvmem_cell_read()
doesn't have any matching opp-supported-hw in device tree,
cpufreq fails to register.

This error only happens when we read this new incorrect value
(12 instead of 0).
Any idea why nvmem_cell_read() returns a new value for the
same efuse?


Kind regards,
Niklas