Re: [RFC PATCH net v2 1/2] net/smc: Resolve the race between link group access and termination

From: Karsten Graul
Date: Mon Jan 03 2022 - 05:37:13 EST


On 31/12/2021 10:44, Wen Gu wrote:
> On 2021/12/29 8:56 pm, Karsten Graul wrote:
>> On 28/12/2021 16:13, Wen Gu wrote:
>>> We encountered some crashes caused by the race between the access
>>> and the termination of link groups.
> What do you think about it?
>

Hi Wen,

thank you, and I also wish you and your family a happy New Year!

Thanks for your detailed explanation; you convinced me of your idea to use
reference counting! I think it's a good solution for the various problems you describe.

I still think that, even if you saw no problems when conn->lgr remains non-NULL after the lgr
has already been terminated, the places where conn->lgr is checked deserve more attention.
For example, in smc_cdc_get_slot_and_msg_send() there is a check for !conn->lgr with the intention
to avoid working with a terminated link group.
Should all checks for !conn->lgr now be replaced by a check for conn->freed? Does this make sense?
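
To illustrate the question, here is a minimal user-space sketch in C. It is not
the actual SMC code: the struct and function names (lgr_sketch, conn_sketch,
conn_usable_old(), conn_usable_new(), conn_unregister()) are hypothetical
simplifications. It assumes that, with reference counting, conn->lgr stays set
for as long as the connection holds its reference, and that conn->freed is set
when the connection is released from its link group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a link group. */
struct lgr_sketch {
	int refcnt;	/* kernel code would use refcount_t instead */
};

/* Hypothetical, simplified stand-in for a connection. */
struct conn_sketch {
	struct lgr_sketch *lgr;	/* assumed to stay non-NULL after release */
	bool freed;		/* assumed to be set when the conn is released */
};

/* Old-style check: only meaningful if conn->lgr is cleared on release. */
static bool conn_usable_old(const struct conn_sketch *conn)
{
	return conn->lgr != NULL;
}

/* Check asked about above: look at conn->freed instead of the pointer. */
static bool conn_usable_new(const struct conn_sketch *conn)
{
	return !conn->freed;
}

/* Release the connection: mark it freed and drop its lgr reference. */
static void conn_unregister(struct conn_sketch *conn)
{
	if (conn->freed)
		return;		/* already released, nothing to do */
	conn->freed = true;
	assert(conn->lgr->refcnt > 0);
	conn->lgr->refcnt--;
	/* conn->lgr is deliberately left set in this sketch */
}
```

Under these assumptions, the pointer check keeps reporting the connection as
usable after conn_unregister(), while the conn->freed check does not, which is
why the existing !conn->lgr checks may need a second look.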