Re: [RFC PATCH net v2 1/2] net/smc: Resolve the race between link group access and termination

From: Wen Gu
Date: Fri Jan 07 2022 - 07:04:44 EST


Thanks for your reply.

On 2022/1/7 5:54 pm, Karsten Graul wrote:
On 06/01/2022 14:02, Wen Gu wrote:
Thanks for your reply.

On 2022/1/5 8:03 pm, Karsten Graul wrote:
On 05/01/2022 09:27, Wen Gu wrote:
On 2022/1/3 6:36 pm, Karsten Graul wrote:
On 31/12/2021 10:44, Wen Gu wrote:
On 2021/12/29 8:56 pm, Karsten Graul wrote:
On 28/12/2021 16:13, Wen Gu wrote:
We encountered some crashes caused by the race between the access
and the termination of link groups.
So I am trying this way:

1) Introduce a new helper smc_conn_lgr_state() to check the three stages mentioned above.

  enum smc_conn_lgr_state {
         SMC_CONN_LGR_ORPHAN,    /* conn was never registered in a link group */
         SMC_CONN_LGR_VALID,     /* conn is registered in a link group now */
         SMC_CONN_LGR_INVALID,   /* conn was registered in a link group, but now
                                    is unregistered from it and conn->lgr should
                                    not be used any more */
  };

Sounds good, but is it really necessary to separate 3 cases, i.e. who is really using all 3 of them?
Doesn't it come down to a simpler smc_conn_lgr_valid() which is easier to implement in
the various places in the code? (i.e.: if (smc_conn_lgr_valid()) ....)
Valid would mean conn->lgr != NULL and conn->alert_token_local != 0. The more special cases
would check what they want on their own.

Yes, most of the time we only need to check whether conn->lgr is in SMC_CONN_LGR_VALID.
Only in smc_conn_free() do we need to identify whether conn->lgr is in SMC_CONN_LGR_ORPHAN
(return directly) or SMC_CONN_LGR_INVALID (put the link group refcnt and then return).

So I agree with only checking whether conn->lgr is valid, using a simpler smc_conn_lgr_valid(),
and distinguishing the SMC_CONN_LGR_ORPHAN and SMC_CONN_LGR_INVALID cases by an additional check of
conn->lgr.
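
A rough userspace sketch of the idea, not the actual patch: the structs are
cut-down stand-ins for the kernel ones, lgr_put() and smc_conn_free_sketch()
are hypothetical names, and the refcnt is a plain int instead of the real
reference counting. It only shows how smc_conn_lgr_valid() plus a check of
conn->lgr could cover the three cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct smc_link_group {
	int refcnt;		/* plain counter, for illustration only */
};

struct smc_connection {
	struct smc_link_group *lgr;
	uint32_t alert_token_local;
};

/* Valid: conn is registered in a link group right now. */
static inline bool smc_conn_lgr_valid(const struct smc_connection *conn)
{
	return conn->lgr && conn->alert_token_local;
}

/* Hypothetical helper: drop one reference on the link group. */
static void lgr_put(struct smc_link_group *lgr)
{
	assert(lgr->refcnt > 0);
	lgr->refcnt--;
}

/* Sketch of how smc_conn_free() could tell the three cases apart. */
static void smc_conn_free_sketch(struct smc_connection *conn)
{
	struct smc_link_group *lgr = conn->lgr;

	if (!lgr)
		return;		/* ORPHAN: never registered, return directly */

	if (smc_conn_lgr_valid(conn)) {
		/* VALID: unregister first (the real teardown is omitted) */
		conn->alert_token_local = 0;
	}

	/* VALID and INVALID both held a reference: put it and return */
	lgr_put(lgr);
}
```

With this, all other call sites would only need if (smc_conn_lgr_valid(conn)),
and only smc_conn_free() looks at conn->lgr separately.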

Thanks,
Wen Gu