Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

From: Logan Gunthorpe
Date: Mon Mar 05 2018 - 15:13:48 EST




On 05/03/18 12:57 PM, Sagi Grimberg wrote:
Keith, while we're on this, regardless of cmb, is SQE memcopy and DB update ordering always guaranteed?

If you look at mlx4 (rdma device driver) that works exactly the same as
nvme you will find:
--
        qp->sq.head += nreq;

        /*
         * Make sure that descriptors are written before
         * doorbell record.
         */
        wmb();

        writel(qp->doorbell_qpn,
               to_mdev(ibqp->device)->uar_map + MLX4_SEND_DOORBELL);

        /*
         * Make sure doorbells don't leak out of SQ spinlock
         * and reach the HCA out of order.
         */
        mmiowb();
--

To me, it looks like the wmb() is redundant, as writel() should already guarantee the ordering. (Indeed, per Sinan's comment, writel() on arm64 starts with a wmb(), which means that on that platform there are two wmb() calls in a row.)
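
To illustrate the double barrier, here is a rough userspace sketch. It is not kernel code: model_wmb(), model_writel(), struct model_queue and barrier_count are made-up stand-ins, and the C11 release fence only approximates what a real wmb() does. The one assumption carried over from the discussion is that writel() itself begins with a write barrier, as described for arm64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the kernel's definitions. */
static unsigned int barrier_count;      /* how many barriers were issued */

static inline void model_wmb(void)
{
        /* Approximates wmb(): earlier stores are ordered before later ones. */
        atomic_thread_fence(memory_order_release);
        barrier_count++;
}

/* Models a writel() that starts with a write barrier (the arm64 case above). */
static inline void model_writel(uint32_t val, volatile uint32_t *addr)
{
        assert(addr != NULL);
        model_wmb();
        *addr = val;
}

struct model_queue {
        uint32_t desc[4];               /* stands in for the SQ entries */
        uint32_t head;
        volatile uint32_t doorbell;     /* stands in for the MMIO doorbell */
};

/* The pattern quoted from mlx4: explicit wmb(), then writel(). */
static void post_with_explicit_wmb(struct model_queue *q, uint32_t cmd)
{
        q->desc[q->head % 4] = cmd;
        q->head++;
        model_wmb();                            /* first barrier */
        model_writel(q->head, &q->doorbell);    /* second barrier inside */
}

/* The same sequence relying only on the barrier inside writel(). */
static void post_relying_on_writel(struct model_queue *q, uint32_t cmd)
{
        q->desc[q->head % 4] = cmd;
        q->head++;
        model_writel(q->head, &q->doorbell);    /* single barrier */
}
```

In this model, one call to post_with_explicit_wmb() bumps barrier_count by two and one call to post_relying_on_writel() bumps it by one, while the descriptor store still precedes the doorbell store in both. Whether dropping the explicit wmb() is safe on every architecture depends on what writel() guarantees there, which the sketch does not attempt to answer.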

The mmiowb() call, on the other hand, looks correct per my understanding of its purpose with respect to the spinlock.

Logan