Re: [RFC] Make use of non-dynamic dmabuf in RDMA

From: Gal Pressman
Date: Thu Sep 02 2021 - 02:57:19 EST


On 01/09/2021 14:24, Christian König wrote:
>
>
> Am 01.09.21 um 13:20 schrieb Gal Pressman:
>> On 24/08/2021 20:32, Jason Gunthorpe wrote:
>>> On Tue, Aug 24, 2021 at 10:27:23AM -0700, John Hubbard wrote:
>>>> On 8/24/21 2:32 AM, Christian König wrote:
>>>>> Am 24.08.21 um 11:06 schrieb Gal Pressman:
>>>>>> On 23/08/2021 13:43, Christian König wrote:
>>>>>>> Am 21.08.21 um 11:16 schrieb Gal Pressman:
>>>>>>>> On 20/08/2021 17:32, Jason Gunthorpe wrote:
>>>>>>>>> On Fri, Aug 20, 2021 at 03:58:33PM +0300, Gal Pressman wrote:
>>>> ...
>>>>>>>> IIUC, we're talking about three different exporter "types":
>>>>>>>> - Dynamic with move_notify (requires ODP)
>>>>>>>> - Dynamic with revoke_notify
>>>>>>>> - Static
>>>>>>>>
>>>>>>>> Which changes do we need to make the third one work?
>>>>>>> Basically none at all in the framework.
>>>>>>>
>>>>>>> You just need to properly use the dma_buf_pin() function when you start
>>>>>>> using a buffer (e.g. before you create an attachment) and the
>>>>>>> dma_buf_unpin() function after you are done with the DMA-buf.
>>>>>> I replied to your previous mail, but I'll ask again.
>>>>>> Doesn't the pin operation migrate the memory to host memory?
>>>>> Sorry missed your previous reply.
>>>>>
>>>>> And yes at least for the amdgpu driver we migrate the memory to host
>>>>> memory as soon as it is pinned and I would expect that other GPU drivers
>>>>> do something similar.
>>>> Well...for many topologies, migrating to host memory will result in a
>>>> dramatically slower p2p setup. For that reason, some GPU drivers may
>>>> want to allow pinning of video memory in some situations.
>>>>
>>>> Ideally, you've got modern ODP devices and you don't even need to pin.
>>>> But if not, and you still hope to do high performance p2p between a GPU
>>>> and a non-ODP Infiniband device, then you would need to leave the pinned
>>>> memory in vidmem.
>>>>
>>>> So I think we don't want to rule out that behavior, right? Or is the
>>>> thinking more like, "you're lucky that this old non-ODP setup works at
>>>> all, and we'll make it work by routing through host/cpu memory, but it
>>>> will be slow"?
>>> I think it depends on the user, if the user creates memory which is
>>> permanently located on the GPU then it should be pinnable in this way
>>> without force migration. But if the memory is inherently migratable
>>> then it just cannot be pinned in the GPU at all as we can't
>>> indefinitely block migration from happening, e.g. if the CPU touches it
>>> later or something.
>> So are we OK with exporters implementing dma_buf_pin() without migrating the
>> memory?
>
> I think so, yes.
>
>> If so, do we still want a move_notify callback for non-dynamic importers? A noop?
>
> Well we could make the move_notify callback optional, e.g. so that you get the
> new locking approach but still pin the buffers manually with dma_buf_pin().

Thanks Christian!
So the end result will look similar to the original patch I posted, where
peer2peer can be enabled without providing move_notify, correct?
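
To make sure we mean the same thing, here is a rough userspace sketch of the
behavior as I understand it from this thread. It is a toy model only: every
name below (toy_buf, toy_importer_ops, toy_pin, toy_unpin, toy_evict_to_host,
toy_demo) is made up for illustration and none of it is the kernel dma-buf
API. It assumes an importer that leaves move_notify unset and pins manually,
and an exporter that leaves permanently GPU-resident memory in place on pin
while moving inherently migratable memory to host first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. None of these names are kernel dma-buf APIs. */
struct toy_buf {
	int  pin_count;   /* > 0: the exporter must not move the buffer */
	bool in_vidmem;   /* current placement of the backing storage */
	bool migratable;  /* true if the exporter may need to move it later */
};

struct toy_importer_ops {
	bool allow_peer2peer;  /* carried for illustration, not consulted here */
	/* Optional: NULL for an importer that pins manually instead. */
	void (*move_notify)(struct toy_buf *buf);
};

/*
 * Pin policy discussed in the thread: memory created as GPU-resident stays
 * in vidmem, inherently migratable memory is moved to host before pinning.
 */
static void toy_pin(struct toy_buf *buf)
{
	if (buf->migratable)
		buf->in_vidmem = false;
	buf->pin_count++;
}

static void toy_unpin(struct toy_buf *buf)
{
	if (buf->pin_count > 0)
		buf->pin_count--;
}

/* Exporter-side move to host memory. Returns true if the buffer moved. */
static bool toy_evict_to_host(struct toy_buf *buf,
			      const struct toy_importer_ops *ops)
{
	if (buf->pin_count > 0)
		return false;            /* pinned: placement is frozen */
	if (ops->move_notify)
		ops->move_notify(buf);   /* dynamic importer drops its mapping */
	buf->in_vidmem = false;
	return true;
}

/* Walks through the non-dynamic importer case; returns 0 on success. */
static int toy_demo(void)
{
	struct toy_buf buf = {
		.pin_count = 0, .in_vidmem = true, .migratable = false,
	};
	const struct toy_importer_ops ops = {
		.allow_peer2peer = true, .move_notify = NULL,
	};

	toy_pin(&buf);
	if (!buf.in_vidmem)
		return 1;                /* pin must not force migration */
	if (toy_evict_to_host(&buf, &ops))
		return 2;                /* a pinned buffer must not move */
	toy_unpin(&buf);
	if (!toy_evict_to_host(&buf, &ops))
		return 3;                /* unpinned: the exporter may move it */
	return buf.in_vidmem ? 4 : 0;
}
```

In this model the importer never needs a move_notify callback, because the
buffer cannot move while it holds the pin.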