Re: [PATCH v5 4/7] dmaengine: Add provider documentation on cookie assignment

From: Walker, Benjamin
Date: Fri Oct 21 2022 - 13:34:08 EST


On 10/19/2022 9:12 PM, Vinod Koul wrote:
On 19-10-22, 10:21, Walker, Benjamin wrote:
On 10/19/2022 9:34 AM, Vinod Koul wrote:
On 29-08-22, 13:35, Ben Walker wrote:
Clarify the rules on assigning cookies to DMA transactions.

Signed-off-by: Ben Walker <benjamin.walker@xxxxxxxxx>
---
.../driver-api/dmaengine/provider.rst | 45 +++++++++++++++----
1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/Documentation/driver-api/dmaengine/provider.rst b/Documentation/driver-api/dmaengine/provider.rst
index 1d0da2777921d..a5539f816d125 100644
--- a/Documentation/driver-api/dmaengine/provider.rst
+++ b/Documentation/driver-api/dmaengine/provider.rst
@@ -417,7 +417,9 @@ supported.
- tx_submit: A pointer to a function you have to implement,
that is supposed to push the current transaction descriptor to a
- pending queue, waiting for issue_pending to be called.
+ pending queue, waiting for issue_pending to be called. Each
+ descriptor is given a cookie to identify it. See the section
+ "Cookie Management" below.
- In this structure the function pointer callback_result can be
initialized in order for the submitter to be notified that a
@@ -522,6 +524,40 @@ supported.
- May sleep.
+Cookie Management
+------------------
+
+When a transaction is queued for submission via tx_submit(), the provider
+must assign that transaction a cookie (dma_cookie_t) to uniquely identify it.
+The provider is allowed to perform this assignment however it wants, but for

We assume that we have a monotonically increasing cookie, and
if cookie 10 is marked complete, cookie 8 is assumed complete too...

That's exactly what this patch series is changing. The earlier patches make
changes to no longer report to the client the "last" or "used" cookie (to
compare against) in the client APIs, and it turns out that nothing in the
kernel actually cares about this behavior. So it's simply a documentation
change to indicate that the client no longer has any visibility into the
cookie behavior.

Not really, there are some engines which will notify that descriptor X
completed which also implies that all descriptors before X have
completed as well...

If we change the default behaviour, we risk breaking those.

I actually don't believe it's true that any clients rely on this behavior today. Certainly, that's the defined behavior prior to this patch series and a client could have relied on that. But I did a big audit and I don't believe any of them actually do. Prior to submitting this patch series I was thinking I needed to create new APIs that code could opt into and convert over to gradually, but it seems we're fortunate enough to get away with just changing the documentation.

As a quick justification, it's worth doing the work to audit and confirm all of this because this is such an important change for the future usefulness of the dmaengine framework. Modern DMA devices are best used by polling for completions, and they certainly can complete out of order. As more of the kernel moves to performing asynchronous operations (mostly via io_uring), this is becoming very important. The rest of this email is me repeating my big audit and taking notes along the way. I apologize if it's long, but it's important to document the findings.

If we look at the client-facing API, we can identify all of the points at which a cookie is returned to the user or consumed by the API as input.


As input:
dma_submit_error
dmaengine_tx_status
dma_async_is_tx_complete
dmaengine_is_tx_complete
dma_sync_wait

As a returned value:
dmaengine_submit
dmaengine_tx_status (via the returned state parameter)
dma_async_is_tx_complete (via last/used parameters)

It's also in the following data structures (which are visible to clients):
dma_chan
dma_async_tx_descriptor (returned from the tx_submit function pointer)
dma_tx_state (only returned by dmaengine_tx_status)

So auditing all of those uses:
- dma_submit_error doesn't assume it's monotonically increasing

- dmaengine_tx_status itself doesn't assume (in the generic dmaengine code) that it's monotonically increasing. Providers implementing this call may assume that, but they're in control of making it so. This call can also return cookies via the optional state parameter. However, every caller either passes NULL for state to ignore it, or allocates state on the stack and never stores it outside of the local function. Within those functions, only state.residue is ever used; the cookies are never touched.

- dma_sync_wait is called in 5 places. In 3 places it's called immediately after a dmaengine_submit and the cookie is only ever on the stack and never compared to anything. The other two spots are during shutdown in ntb_transport_free_queue(). All it's doing here is waiting for the last *submitted* cookie to finish, then aborting anything that is still outstanding. This driver already works with devices that complete out of order (idxd), so it has a comment saying that waiting for the last submitted may not wait for all, and that's why it does the abort. No issue there.

- dmaengine_is_tx_complete isn't used anywhere. We just added it in this series. It's intended to replace dma_async_is_tx_complete.

- dma_async_is_tx_complete is called in 4 places:
-- stm32-hash.c calls dmaengine_submit to get a cookie, then calls dma_async_is_tx_complete with that value. The cookie only exists on the stack and its value is never compared with anything. The last/used return values are not used.

-- rio_mport_cdev.c calls dmaengine_submit and stores the cookie into a request object. That's then passed into dma_async_is_tx_complete later and the last/used parameters are not captured. This cookie is only compared to other cookies using equality, so this one is safe.

-- omap_vout_vrfb.c is the same story as stm32-hash.c. The cookie is only used within a single function and it's never compared to another cookie.

-- pxa_camera.c does appear to rely on cookie values monotonically increasing. But we get off easy here, because this driver only works with DMA channels from one specific provider. It can't use just any provider. This particular provider still elects to make its cookies monotonically increasing, so nothing breaks. In general, I have some real concerns about the layering in this driver, since the DMA engine it's using does not appear to be generic and instead only works with this particular camera device. I don't feel like it should be using the dmaengine framework at all.

- dmaengine_submit returns a cookie to the user, and the remaining uses of the cookie are embedded into structs. To audit these I created a patch that changes the type of dma_cookie_t to a struct like so:

typedef struct {
	s32 val;
} dma_cookie_t;

I then fixed up the utility functions in the dmaengine framework, commented out all of the printk stuff that was complaining about casting a struct to %d, and let the compiler find all of the places where math or comparisons were performed on it. Filtering out comparisons against 0 to detect errors, assignments to negative values, equality comparisons to other cookies, and any uses by DMA providers, which all still work after this patch series, we're left with... nothing.
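
As a rough illustration of why the struct trick works, here is a userspace sketch (not the actual audit patch). int32_t stands in for the kernel's s32, and the helper names cookie_is_error and cookie_equal are made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the cookie type, wrapped in a struct for the audit. */
typedef struct {
	int32_t val;
} dma_cookie_t;

/* Hypothetical helper: the "negative means error" check clients already do. */
static inline bool cookie_is_error(dma_cookie_t c)
{
	return c.val < 0;
}

/* Hypothetical helper: equality is the only comparison an opaque handle needs. */
static inline bool cookie_equal(dma_cookie_t a, dma_cookie_t b)
{
	return a.val == b.val;
}

/*
 * With the struct wrapper in place, code like the following is rejected
 * by the compiler, which is what flags every place that treats a cookie
 * as a number:
 *
 *	if (a < b) ...		(ordering comparison on a struct)
 *	a = b + 1;		(arithmetic on a struct)
 */
```

Anything that still needs the raw value has to go through .val explicitly, so those uses are easy to grep for and review one by one.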

So the summary is:

- pxa_camera is the only client that cares about the cookie behavior, but it's tied in to exactly one provider that happens to do the cookies the way it wants. This patch series doesn't force any provider to change what it does currently.

It really is the case that none of the other clients care about the cookie behavior, and we really can just make a documentation change to modify cookies to become opaque handles.


Immediately below here the documentation then says that there are some
convenience functions that providers can use that do produce monotonically
increasing cookies. These are now optional for providers to use, if they
find them useful, rather than the required way to manage the cookies.
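
For readers unfamiliar with that scheme, a provider that opts into monotonically increasing cookies could assign them roughly like this. This is a simplified userspace model, not the kernel helpers themselves; model_provider_chan, model_cookie_assign and MODEL_MIN_COOKIE are names invented for the sketch:

```c
#include <stdint.h>

typedef int32_t dma_cookie_t;

#define MODEL_MIN_COOKIE 1	/* assumption: smallest valid cookie in this model */

/* Hypothetical per-channel state: the last cookie handed out. */
struct model_provider_chan {
	dma_cookie_t cookie;
};

/* Hand out the next cookie, wrapping back to the minimum instead of overflowing. */
static dma_cookie_t model_cookie_assign(struct model_provider_chan *chan)
{
	dma_cookie_t cookie;

	if (chan->cookie >= MODEL_MIN_COOKIE && chan->cookie < INT32_MAX)
		cookie = chan->cookie + 1;
	else
		cookie = MODEL_MIN_COOKIE;

	chan->cookie = cookie;
	return cookie;
}
```

Negative values stay reserved for errors, which is why the counter wraps to the minimum rather than going negative.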


Completion is always in order unless we specify DMA_COMPLETION_NO_ORDER

The final patch in this series eliminates DMA_COMPLETION_NO_ORDER entirely.
It was only used by the IDXD driver, and the reason I'm doing these patches
is so that we can poll the IDXD driver for completions even though it can
complete out of order.
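
To make the out-of-order concern concrete, here is a small userspace model (not kernel code; every name other than dma_cookie_t is made up for illustration). It contrasts a "last completed cookie" watermark check with per-cookie completion tracking:

```c
#include <stdbool.h>
#include <stdint.h>

typedef int32_t dma_cookie_t;

#define MODEL_MAX_COOKIES 16	/* assumption: tiny fixed range for this model */

/* Hypothetical channel model: one watermark plus a per-cookie done flag. */
struct model_chan {
	dma_cookie_t last_completed;
	bool done[MODEL_MAX_COOKIES];
};

/* Record a completion; cookie must be in [0, MODEL_MAX_COOKIES), unchecked here. */
static void model_complete(struct model_chan *c, dma_cookie_t cookie)
{
	c->done[cookie] = true;
	if (cookie > c->last_completed)
		c->last_completed = cookie;
}

/* In-order assumption: anything at or below the watermark is finished. */
static bool model_done_by_watermark(const struct model_chan *c, dma_cookie_t cookie)
{
	return cookie <= c->last_completed;
}

/* Opaque-handle view: only the state of this exact cookie matters. */
static bool model_done_by_cookie(const struct model_chan *c, dma_cookie_t cookie)
{
	return c->done[cookie];
}
```

If the modelled device completes cookie 10 while cookie 8 is still in flight, model_done_by_watermark() reports 8 as complete, whereas model_done_by_cookie() still reports it as pending. That mismatch is the reason a client polling an out-of-order device can't rely on cookie ordering.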