Re: [PATCH] driver/perf: Add PMU driver for the ARM DMC-620 memory controller.

From: Mark Rutland
Date: Fri Mar 20 2020 - 07:25:48 EST


On Thu, Mar 19, 2020 at 12:03:43PM -0700, Tuan Phan wrote:
> Hi Mark,
> Please find my comments below.

Hi Tuan,

As Will said, *please* set up a more usual mail clinet configuration if
you can. The reply style (with lines starting with '=>') is unusual and
very painful to spot.

> > On Mar 19, 2020, at 8:16 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > Hi Tuan,
> >
> > On Tue, Mar 17, 2020 at 05:29:38PM -0700, Tuan Phan wrote:
> >> DMC-620 PMU supports total 10 counters which each is
> >> independently programmable to different events and can
> >> be started and stopped individually.
> >
> > Looking at the TRM for DMC-620, the PMU registers are not in a separate
> > frame from the other DMC control registers, and start at offset 0xA00
> > (AKA 2560). I would generally expect that access to the DMC control
> > registers was restricted to the secure world; is that not the case on
> > your platform?
> >
> > I ask because if those are not restricted, the Normal world could
> > potentially undermine the Secure world through this (e.g. playing with
> > training settings, messing with the physical memory map, injecting RAS
> > errors). Have you considered this?
> => Only PMU registers can be accessed within normal world. I only pass
> PMU registers (offset 0xA00) to kernel so shouldnât be problem.

Sure, you only *describe* that in the ACPI tables, but I can't see how
that's access control is enforced in the hardware, because these
registers fall in the same 4K page as other control registers, and
AFAICT the IP doesn't distinguish S/NS accesses.

If the Secure world on your part uses DRAM (including the secure
portions of IPs like SMMUs), you *must* ensure that the Normal world
cannot access those other control registers, or it can corrupt Secure
world state and escalate privilege.

Is that not a concern here?

> >> DMC-620 PMU devices are named as arm_dmc620_<uid> where
> >> <uid> is the UID of DMC-620 PMU ACPI node. Currently, it only
> >> supports ACPI. Other platforms feel free to test and add
> >> support for device tree.
> >>
> >> Usage example:
> >> #perf stat -e arm_dmc620_0/clk_cycle_count/ -C 0
> >> Get perf event for clk_cycle_count counter.
> >>
> >> #perf stat -e arm_dmc620_0/clkdiv2_allocate,mask=0x1f,match=0x2f,
> >> incr=2,invert=1/ -C 0
> >> The above example shows how to specify mask, match, incr,
> >> invert parameters for clkdiv2_allocate event.
> >
> > [...]
> >
> >> +#define DMC620_CNT_MAX_PERIOD 0xffffffff
> >> +#define DMC620_PMU_CLKDIV2_MAX_COUNTERS 8
> >> +#define DMC620_PMU_CLK_MAX_COUNTERS 2
> >> +#define DMC620_PMU_MAX_COUNTERS \
> >> + (DMC620_PMU_CLKDIV2_MAX_COUNTERS + DMC620_PMU_CLK_MAX_COUNTERS)
> >> +
> >> +#define DMC620_PMU_OVERFLOW_STATUS_CLKDIV2_OFFSET 8
> >
> > This appears to be relative to 0xA00. What exactly does your ACPI
> > description provide? The whole set of DMC registers, or just the PMU
> > registers?
> => Just PMU registers from 0xA00 to 0xBFF.

I don't believe that is correct; see below w.r.t. the ACPI binding.

> >> +static int arm_dmc620_pmu_dev_init(struct arm_dmc620_pmu *dmc620_pmu)
> >> +{
> >> + struct platform_device *pdev = dmc620_pmu->pdev;
> >> + int ret;
> >> +
> >> + ret = devm_request_irq(&pdev->dev, dmc620_pmu->irq,
> >> + arm_dmc620_pmu_handle_irq,
> >> + IRQF_SHARED,
> >> + dev_name(&pdev->dev), dmc620_pmu);
> >
> > This should have IRQF_NOBALANCING | IRQF_NO_THREAD. I don't think we
> > should have IRQF_SHARED.
> => I agree on having IRQF_NOBALANCING and IRQF_NO_THREAD. But
> IRQF_SHARED is needed. In our platform all DMC620s share same IRQs and
> any cpus can access the pmu registers.

Linux needs to ensure that the same instance is concistently accessed
from the same CPU, and needs to migrate the IRQ to handle that. Given we
do that on a per-instance basis, we cannot share the IRQ with another
instance.

Please feed back to you HW designers that muxing IRQs like this causes
significant problems for software.

> >
> > [...]
> >
> >> +static const struct acpi_device_id arm_dmc620_acpi_match[] = {
> >> + { "ARMHD620", 0},
> >> + {},
> >> +};
> >
> > Just to check, was this ID allocated by Arm, or have you allocated it?
> => ID was allocated by ARM. Please refer to
> https://static.docs.arm.com/den0093/a/DEN0093_ACPI_Arm_Components_1.0.pdf <https://static.docs.arm.com/den0093/a/DEN0093_ACPI_Arm_Components_1.0.pdf>

Thanks for the link! For this _HID, the document says the _CRS contains
a QWordMemory Base address, which the full description is:

| Base Address of the DMC620 in the system address map

... which would presumably mean the *entire* set of MMIO registers, not
just the PMU portion. The example firmly hints that it is the entire set
of MMIO registers:

| In this example, the DMC620 register space is mapped to a 64K range
| that begins at offset 0x80CAFE0000 in the system address space

Thanks,
Mark.