Re: [PATCH v9 3/4] drivers/perf: add DesignWare PCIe PMU driver

From: Shuai Xue
Date: Mon Oct 23 2023 - 08:26:14 EST




On 2023/10/23 17:13, Yicong Yang wrote:
> Hi Shuai,
>
> On 2023/10/20 21:42, Shuai Xue wrote:
>> This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
>> for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express
>> Core controller IP which provides statistics feature. The PMU is a PCIe
>> configuration space register block provided by each PCIe Root Port in a
>> Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
>> injection, and Statistics).
>>
>> To facilitate collection of statistics the controller provides the
>> following two features for each Root Port:
>>
>> - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
>> time spent in each low-power LTSSM state) and
>> - one 32-bit counter for Event Counting (error and non-error events for
>> a specified lane)
>>
>> Note: There is no interrupt for counter overflow.
>>
>> This driver adds PMU devices for each PCIe Root Port. And the PMU device is
>> named based the BDF of Root Port. For example,
>>
>> 30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
>>
>> the PMU device name for this Root Port is dwc_rootport_3018.
>>
>> Example usage of counting PCIe RX TLP data payload (Units of bytes)::
>>
>> $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>
>> average RX bandwidth can be calculated like this:
>>
>> PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
>>
>> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
>
> Just one nit below. Otherwise looks good to me,
>
> Reviewed-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>


Hi, Yicong,

Time flies indeed. I am pleasantly surprised to realize that it has been
over a year since you sayed "Glad to see another PCIe PMU device!" in the
initial version. I want to express my deepest gratitude for the significant
time and effort you have invested in providing invaluable comments. Your
dedication has played a pivotal role in greatly enhancing the quality of
the code.

Thank you so much.

Best Regards,
Shuai

>> +
>> +static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>> +{
>> + struct pci_dev *pdev = NULL;
>> + struct dwc_pcie_pmu *pcie_pmu;
>> + bool notify = false;
>> + char *name;
>> + u32 bdf;
>> + int ret;
>> +
>> + /* Match the rootport with VSEC_RAS_DES_ID, and register a PMU for it */
>> + for_each_pci_dev(pdev) {
>> + u16 vsec;
>> + u32 val;
>> +
>> + if (!(pci_is_pcie(pdev) &&
>> + pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT))
>> + continue;
>> +
>> + vsec = pci_find_vsec_capability(pdev, PCI_VENDOR_ID_ALIBABA,
>> + DWC_PCIE_VSEC_RAS_DES_ID);
>> + if (!vsec)
>> + continue;
>> +
>> + pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
>> + if (PCI_VNDR_HEADER_REV(val) != 0x04)
>> + continue;
>> + pci_dbg(pdev,
>> + "Detected PCIe Vendor-Specific Extended Capability RAS DES\n");
>> +
>> + bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
>> + name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x",
>> + bdf);
>> + if (!name) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + /* All checks passed, go go go */
>> + pcie_pmu = devm_kzalloc(&plat_dev->dev, sizeof(*pcie_pmu), GFP_KERNEL);
>> + if (!pcie_pmu) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + pcie_pmu->pdev = pdev;
>> + pcie_pmu->ras_des_offset = vsec;
>> + pcie_pmu->nr_lanes = pcie_get_width_cap(pdev);
>> + pcie_pmu->on_cpu = -1;
>> + pcie_pmu->pmu = (struct pmu){
>> + .module = THIS_MODULE,
>> + .attr_groups = dwc_pcie_attr_groups,
>> + .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>> + .task_ctx_nr = perf_invalid_context,
>> + .event_init = dwc_pcie_pmu_event_init,
>> + .add = dwc_pcie_pmu_event_add,
>> + .del = dwc_pcie_pmu_event_del,
>> + .start = dwc_pcie_pmu_event_start,
>> + .stop = dwc_pcie_pmu_event_stop,
>> + .read = dwc_pcie_pmu_event_update,
>> + };
>> +
>> + /* Add this instance to the list used by the offline callback */
>> + ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
>> + &pcie_pmu->cpuhp_node);
>> + if (ret) {
>> + pci_err(pdev,
>> + "Error %d registering hotplug @%x\n", ret, bdf);
>> + goto out;
>> + }
>> +
>> + /* Unwind when platform driver removes */
>> + ret = devm_add_action_or_reset(
>> + &plat_dev->dev, dwc_pcie_pmu_remove_cpuhp_instance,
>> + &pcie_pmu->cpuhp_node);
>> + if (ret)
>> + goto out;
>> +
>> + ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
>> + if (ret) {
>> + pci_err(pdev,
>> + "Error %d registering PMU @%x\n", ret, bdf);
>> + goto out;
>> + }
>> +
>> + /* Cache PMU to handle pci device hotplug */
>> + list_add(&pcie_pmu->pmu_node, &dwc_pcie_pmu_head);
>> + pcie_pmu->registered = true;
>> + notify = true;
>> +
>> + ret = devm_add_action_or_reset(
>> + &plat_dev->dev, dwc_pcie_pmu_unregister_pmu, pcie_pmu);
>> + if (ret)
>> + goto out;
>> + }
>> +
>> + if (notify && !bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb))
>> + return devm_add_action_or_reset(
>> + &plat_dev->dev, dwc_pcie_pmu_unregister_nb, NULL);
>
> Maybe you can register the notifier firstly in the probe(). It'll be unregistered
> once failed to add the PMU. If no PMU registered it also should be ok since the
> PMU list will be empty and notifier callback will do nothing.
>
> This may address one potential race on driver removal. Since the notifier will be
> unregistered firstly

You are right, the added action will be released in reverse order.

> but the PMU's still registered and may have chance to
> access pointer to the root port. However it's so extreme so may never happen.
>

Good point, I will move to bus_register_notifier() to first order in probe().