Re: [PATCH v1 1/3] PCI: qcom: Enable cache coherency for SA8775P RC

From: Mrinmay Sarkar
Date: Mon Nov 06 2023 - 02:20:16 EST



On 11/3/2023 1:28 PM, Manivannan Sadhasivam wrote:
On Thu, Nov 02, 2023 at 11:25:36PM +0100, Konrad Dybcio wrote:

On 02/11/2023 17:36, Manivannan Sadhasivam wrote:
On Thu, Nov 02, 2023 at 05:34:24PM +0200, Dmitry Baryshkov wrote:
On Tue, 31 Oct 2023 at 17:46, Mrinmay Sarkar <quic_msarkar@xxxxxxxxxxx> wrote:
This change will enable cache snooping logic to support
cache coherency for the SA8775P RC platform.

Signed-off-by: Mrinmay Sarkar <quic_msarkar@xxxxxxxxxxx>
---
drivers/pci/controller/dwc/pcie-qcom.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 6902e97..6f240fc 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -51,6 +51,7 @@
#define PARF_SID_OFFSET 0x234
#define PARF_BDF_TRANSLATE_CFG 0x24c
#define PARF_SLV_ADDR_SPACE_SIZE 0x358
+#define PCIE_PARF_NO_SNOOP_OVERIDE 0x3d4
#define PARF_DEVICE_TYPE 0x1000
#define PARF_BDF_TO_SID_TABLE_N 0x2000

@@ -117,6 +118,9 @@
/* PARF_LTSSM register fields */
#define LTSSM_EN BIT(8)

+/* PARF_NO_SNOOP_OVERIDE register value */
+#define NO_SNOOP_OVERIDE_EN 0xa
+
/* PARF_DEVICE_TYPE register fields */
#define DEVICE_TYPE_RC 0x4

@@ -961,6 +965,13 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)

static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
{
+ struct dw_pcie *pci = pcie->pci;
+ struct device *dev = pci->dev;
+
+ /* Enable cache snooping for SA8775P */
+ if (of_device_is_compatible(dev->of_node, "qcom,pcie-sa8775p"))
Obviously: please populate a flag in the data structures instead of
doing of_device_is_compatible(). Same applies to the patch 2.

Not necessary at this point. For some unknown reason, the HW team ended up
disabling cache snooping on this specific platform. Whereas on other platforms,
it is enabled by default. So I have low expectations that we would need this
setting on other platforms in the future.

My concern with using a flag is that it warrants a new "qcom_pcie_cfg"
instance just for this quirk and it looks overkill to me.

So if we end up seeing this behavior on other platforms as well (unlikely) then
we can switch to the flag approach.
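
For reference, the flag approach discussed above could look roughly like the
sketch below. It is a standalone userspace model, not driver code: the names
pcie_cfg_sketch, override_no_snoop, fake_parf, lookup_cfg and post_init_sketch
are hypothetical, the register block is a plain array instead of MMIO, and it
assumes the hunk quoted above writes NO_SNOOP_OVERIDE_EN to
PCIE_PARF_NO_SNOOP_OVERIDE. Only the offset, the value and the compatible
strings come from the patch and existing bindings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset and value as defined in the patch above. */
#define PCIE_PARF_NO_SNOOP_OVERIDE	0x3d4
#define NO_SNOOP_OVERIDE_EN		0xa

/* Hypothetical stand-in for the PARF register block (plain memory, not MMIO). */
#define FAKE_PARF_SIZE			0x400
struct fake_parf {
	uint32_t regs[FAKE_PARF_SIZE / 4];
};

/* Hypothetical per-SoC config carrying the quirk as a flag. */
struct pcie_cfg_sketch {
	const char *compatible;
	bool override_no_snoop;
};

/* One entry per compatible; only SA8775P sets the flag. */
static const struct pcie_cfg_sketch cfg_table[] = {
	{ "qcom,pcie-sa8775p", true },
	{ "qcom,pcie-sm8250", false },
};

/* Resolve the config once, the way match data would be resolved at probe. */
static const struct pcie_cfg_sketch *lookup_cfg(const char *compatible)
{
	for (size_t i = 0; i < sizeof(cfg_table) / sizeof(cfg_table[0]); i++)
		if (strcmp(cfg_table[i].compatible, compatible) == 0)
			return &cfg_table[i];
	return NULL;
}

/* post_init tests the flag; no compatible string comparison is needed here. */
static void post_init_sketch(struct fake_parf *parf,
			     const struct pcie_cfg_sketch *cfg)
{
	assert(parf != NULL && cfg != NULL);

	if (cfg->override_no_snoop)
		parf->regs[PCIE_PARF_NO_SNOOP_OVERIDE / 4] = NO_SNOOP_OVERIDE_EN;
}
```

The trade-off is the one described above: the check in post_init becomes a
plain flag test, at the cost of one extra config entry that exists only for
this quirk.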
This register reads zeroes on 8250, can we confirm it works as
expected there?
I don't know if this register is even implemented in 8250. Mrinmay, can you
check?
Yes, this register is present on the 8250 platform as well,
and I can see that its default value is 0x0.

--Mrinmay
I guess some benchmarks with and without
'dma-coherent'?

The performance benefit can be measured by saturating the link. But it is
obvious that snooping the cache will give better performance (plus it avoids
cache flush in kernel).

- Mani