Re: [RFC PATCH net v2 1/2] net/smc: Resolve the race between link group access and termination

From: Karsten Graul
Date: Fri Jan 07 2022 - 04:54:35 EST


On 06/01/2022 14:02, Wen Gu wrote:
> Thanks for your reply.
>
> On 2022/1/5 8:03 pm, Karsten Graul wrote:
>> On 05/01/2022 09:27, Wen Gu wrote:
>>> On 2022/1/3 6:36 pm, Karsten Graul wrote:
>>>> On 31/12/2021 10:44, Wen Gu wrote:
>>>>> On 2021/12/29 8:56 pm, Karsten Graul wrote:
>>>>>> On 28/12/2021 16:13, Wen Gu wrote:
>>>>>>> We encountered some crashes caused by the race between the access
>>>>>>> and the termination of link groups.
> So I am trying this way:
>
> 1) Introduce a new helper smc_conn_lgr_state() to check the three stages mentioned above.
>
>   enum smc_conn_lgr_state {
>          SMC_CONN_LGR_ORPHAN,    /* conn was never registered in a link group */
>          SMC_CONN_LGR_VALID,     /* conn is registered in a link group now */
>          SMC_CONN_LGR_INVALID,   /* conn was registered in a link group, but now
>                                     is unregistered from it and conn->lgr should
>                                     not be used any more */
>   };
>
> 2) replace the current conn->lgr check with the new helper.
>
> These new changes are under testing now.
>
> What do you think about it? :)

Sounds good, but is it really necessary to distinguish 3 cases, i.e. who actually uses all 3 of them?
Doesn't it come down to a simpler smc_conn_lgr_valid(), which is easier to use in
the various places in the code (i.e.: if (smc_conn_lgr_valid()) ....)?
Valid would mean conn->lgr != NULL and conn->alert_token_local != 0. The more special cases
would check what they need on their own.
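
To make the suggestion concrete, here is a rough userspace sketch of such a helper. The struct definitions are simplified stand-ins containing only the two fields mentioned above, not the real kernel definitions, and the const pointer signature is an assumption:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types (assumption: only the
 * fields relevant to the validity check are modelled here).
 */
struct smc_link_group {
	int placeholder;
};

struct smc_connection {
	struct smc_link_group *lgr;	/* NULL: conn has no link group */
	uint32_t alert_token_local;	/* 0: conn is not registered */
};

/* A conn's link group is considered valid only if both conditions
 * hold: conn->lgr != NULL and conn->alert_token_local != 0.
 */
static inline bool smc_conn_lgr_valid(const struct smc_connection *conn)
{
	return conn->lgr && conn->alert_token_local;
}
```

Callers would then just do if (smc_conn_lgr_valid(conn)) before touching conn->lgr, and the few places that need to tell the "never registered" case from the "unregistered" case could test the two fields directly.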