Re: [PATCH v9 3/4] drivers/perf: add DesignWare PCIe PMU driver

From: Shuai Xue
Date: Mon Oct 23 2023 - 05:39:45 EST




On 2023/10/23 10:05, Baolin Wang wrote:
>
>
> On 10/22/2023 3:47 PM, Shuai Xue wrote:
>> Hi, Baolin,
>>
>> I droped your Revivewed-by tag due to that I made significant changes to this
>> patch previously, please explicty give me Revivewed-by tag again if you are
>> happy with the changes.
>
> Yes, I am happy with this version (just some nits as below), and thanks for the review from other guys. Please feel free to add:
>
> Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>


Thank you.

Best Regards,
Shuai

>
>> On 2023/10/20 21:42, Shuai Xue wrote:
>>> This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
>>> for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express
>>> Core controller IP which provides statistics feature. The PMU is a PCIe
>>> configuration space register block provided by each PCIe Root Port in a
>>> Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
>>> injection, and Statistics).
>>>
>>> To facilitate collection of statistics the controller provides the
>>> following two features for each Root Port:
>>>
>>> - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
>>>    time spent in each low-power LTSSM state) and
>>> - one 32-bit counter for Event Counting (error and non-error events for
>>>    a specified lane)
>>>
>>> Note: There is no interrupt for counter overflow.
>>>
>>> This driver adds PMU devices for each PCIe Root Port. And the PMU device is
>>> named based the BDF of Root Port. For example,
>>>
>>>      30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
>>>
>>> the PMU device name for this Root Port is dwc_rootport_3018.
>>>
>>> Example usage of counting PCIe RX TLP data payload (Units of bytes)::
>>>
>>>      $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>>
>>> average RX bandwidth can be calculated like this:
>>>
>>>      PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
>>>
>>> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
>>> ---
>
> [snip]
>
>>> +static u64 dwc_pcie_pmu_read_time_based_counter(struct perf_event *event)
>>> +{
>>> +    struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu);
>>> +    struct pci_dev *pdev = pcie_pmu->pdev;
>>> +    int event_id = DWC_PCIE_EVENT_ID(event);
>>> +    u16 ras_des_offset = pcie_pmu->ras_des_offset;
>>> +    u32 lo, hi, ss;
>>> +
>>> +    /*
>>> +     * The 64-bit value of the data counter is spread across two
>>> +     * registers that are not synchronized. In order to read them
>>> +     * atomically, ensure that the high 32 bits match before and after
>>> +     * reading the low 32 bits.
>>> +     */
>>> +    pci_read_config_dword(pdev, ras_des_offset +
>>> +        DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH, &hi);
>>> +    do {
>>> +        /* snapshot the high 32 bits */
>>> +        ss = hi;
>>> +
>>> +        pci_read_config_dword(
>>> +            pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_LOW,
>>> +            &lo);
>>> +        pci_read_config_dword(
>>> +            pdev, ras_des_offset + DWC_PCIE_TIME_BASED_ANAL_DATA_REG_HIGH,
>>> +            &hi);
>>> +    } while (hi != ss);
>>> +
>>> +    /*
>>> +     * The Group#1 event measures the amount of data processed in 16-byte
>>> +     * units. Simplify the end-user interface by multiplying the counter
>>> +     * at the point of read.
>>> +     */
>>> +    if (event_id >= 0x20 && event_id <= 0x23)
>>> +        return (((u64)hi << 32) | lo) << 4;
>>> +    else
>
> You can drop the 'else'.

Agreed, will fix it in next version.


>>> +
>>> +    event->cpu = pcie_pmu->on_cpu;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void dwc_pcie_pmu_set_period(struct hw_perf_event *hwc)
>>> +{
>>> +    local64_set(&hwc->prev_count, 0);
>>> +}
>
> Only dwc_pcie_pmu_event_start() will call this small function, why just remove this function and move local64_set() into dwc_pcie_pmu_event_start()?

Good suggestion, will fix it.