Re: [PATCH V4 1/4] arm_pmu: acpi: Refactor arm_spe_acpi_register_device()

From: Anshuman Khandual
Date: Fri Aug 11 2023 - 01:03:40 EST




On 8/9/23 18:24, Suzuki K Poulose wrote:
> On 08/08/2023 14:16, Will Deacon wrote:
>> On Tue, Aug 08, 2023 at 09:48:16AM +0100, Suzuki K Poulose wrote:
>>> On 08/08/2023 09:22, Anshuman Khandual wrote:
>>>> Sanity checking all the GICC tables for same interrupt number, and ensuring
>>>> a homogeneous ACPI based machine, could be used for other platform devices
>>>> as well. Hence this refactors arm_spe_acpi_register_device() into a common
>>>> helper arm_acpi_register_pmu_device().
>>>>
>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>>> Cc: Will Deacon <will@xxxxxxxxxx>
>>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>>>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>>> Co-developed-by: Will Deacon <will@xxxxxxxxxx>
>>>> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>>>> ---
>>>>    drivers/perf/arm_pmu_acpi.c | 105 ++++++++++++++++++++++--------------
>>>>    1 file changed, 65 insertions(+), 40 deletions(-)
>>>>
>>>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>>>> index 90815ad762eb..72454bef2a70 100644
>>>> --- a/drivers/perf/arm_pmu_acpi.c
>>>> +++ b/drivers/perf/arm_pmu_acpi.c
>>>> @@ -69,6 +69,63 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
>>>>            acpi_unregister_gsi(gsi);
>>>>    }
>>>> +static int __maybe_unused
>>>> +arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
>>>> +                 u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
>>>> +{
>>>> +    int cpu, this_hetid, hetid, irq, ret;
>>>> +    u16 this_gsi, gsi = 0;
>>>> +
>>>> +    /*
>>>> +     * Ensure that platform device must have IORESOURCE_IRQ
>>>> +     * resource to hold gsi interrupt.
>>>> +     */
>>>> +    if (pdev->num_resources != 1)
>>>> +        return -ENXIO;
>>>> +
>>>> +    if (pdev->resource[0].flags != IORESOURCE_IRQ)
>>>> +        return -ENXIO;
>>>> +
>>>> +    /*
>>>> +     * Sanity check all the GICC tables for the same interrupt
>>>> +     * number. For now, only support homogeneous ACPI machines.
>>>> +     */
>>>> +    for_each_possible_cpu(cpu) {
>>>> +        struct acpi_madt_generic_interrupt *gicc;
>>>> +
>>>> +        gicc = acpi_cpu_get_madt_gicc(cpu);
>>>> +        if (gicc->header.length < len)
>>>> +            return gsi ? -ENXIO : 0;
>>>> +
>>>> +        this_gsi = parse_gsi(gicc);
>>>> +        if (!this_gsi)
>>>> +            return gsi ? -ENXIO : 0;
>>>> +
>>>> +        this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
>>>> +        if (!gsi) {
>>>> +            hetid = this_hetid;
>>>> +            gsi = this_gsi;
>>>> +        } else if (hetid != this_hetid || gsi != this_gsi) {
>>>> +            pr_warn("ACPI: %s: must be homogeneous\n", pdev->name);
>>>> +            return -ENXIO;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
>>>> +    if (irq < 0) {
>>>> +        pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);
>>>> +        return -ENXIO;
>>>> +    }
>>>> +
>>>> +    pdev->resource[0].start = irq;
>>>> +    ret = platform_device_register(pdev);
>>>> +    if (ret < 0) {
>>>> +        pr_warn("ACPI: %s: Unable to register device\n", pdev->name);
>>>> +        acpi_unregister_gsi(gsi);
>>>> +    }
>>>> +    return ret;
>>>
>>> A postivie return value here could confuse the caller. Also, with my comment
>>> below, we don't really need to return something from here.
>>
>> How does this return a positive value?
>
> Right now, there aren't. My point is this function returns a "return value" of another function. And the caller of this function doesn't
> really follow the "check" it needs.  e.g.:
>
> ret = foo();
> if (ret < 0)
>     error;
> return ret;
>
>
>
> And the caller only checks for
>
> if (ret)
>     error;
>
> This seems fragile.
>
>>
>>>> +    int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
>>>> +                           arm_spe_parse_gsi);
>>>> +    if (ret)
>>>>            pr_warn("ACPI: SPE: Unable to register device\n");
>>>
>>> With this change, a system without SPE interrupt description always
>>> generates the above message. Is this intended ?
>>
>> If there are no irqs, why doesn't this return 0?
>
> Apologies, I missed that.
>
>> arm_acpi_register_pmu_device() should only fail if either:
>>
>>    - The static resources passed in are broken
>>    - The tables are not homogeneous
>>    - We fail to register the interrupt
>>
>> so something is amiss.
>
> Agreed. We don't need duplicate messages about an error ?
> i.e., one in arm_acpi_register_pmu_device() and another
> one in the caller ? (Of course adding any missing error msgs).
>
>
>>
>>> Could we not drop the above message as all the other possible error
>>> scenarios are reported. We could simply make the above helper void, see my
>>> comment above.
>>
>> I disagree. If the ACPI tables are borked, we should print a message saying
>> so.
>
> Ok, fair point.

I am confused, what's the conclusion, should we just leave this unchanged ?