Re: [PATCH] can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8

From: Vincent MAILHOL
Date: Mon Feb 14 2022 - 00:38:22 EST


On Sun. 13 Feb 2022 at 00:57, Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote:
> On 12.02.2022 20:27:13, Vincent Mailhol wrote:
> > The driver uses an atomic_t variable: es58x_device:opened_channel_cnt
> > to keep track of the number of opened channels in order to only
> > allocate memory for the URBs when this count changes from zero to one.
> >
> > While the intent was to prevent race conditions, the choice of an
> > atomic_t turns out to be a bad idea for several reasons:
> >
> > - implementation is incorrect and fails to decrement
> > opened_channel_cnt when the URB allocation fails as reported in
> > [1].
> >
> > - even if opened_channel_cnt were to be correctly decremented,
> > atomic_t is insufficient to cover edge cases: there can be a race
> > condition in which 1/ a first process fails to allocate URBs
> > memory 2/ a second process enters es58x_open() before the first
> > process does its cleanup and decrements opened_channel_cnt. In
> > which case, the second process would successfully return despite
> > the URBs memory not being allocated.
> >
> > - actually, any kind of locking mechanism is useless here because
> > it is redundant with the network stack big kernel lock
> > (a.k.a. rtnl_lock) which is held by all the callers of
> > net_device_ops:ndo_open() and net_device_ops:ndo_stop(), cf. the
> > ASSERT_RTNL() calls in __dev_open() [2] and __dev_close_many()
> > [3].
> >
> > The atomic_t is thus replaced by a simple u8 type and the logic to
> > increment and decrement es58x_device:opened_channel_cnt is simplified
> > accordingly, fixing the bug reported in [1]. We do not check again for
> > ASSERT_RTNL() as this is already done by the callers.
> >
> > [1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u
> > [2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463
> > [3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541
> >
> > Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X
> > CAN USB interfaces")
> > Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > Signed-off-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>
>
> Applied to can/testing.
>
> If you (or someone else) want to increase your patch count, feel free to
> convert the other USB CAN drivers from atomic_t to u8, too.

Actually, not so many drivers are impacted:

| $ grep -R atomic_t drivers/net/can/
| drivers/net/can/c_can/c_can.h: atomic_t sie_pending;
| drivers/net/can/usb/esd_usb2.c: atomic_t active_tx_jobs;
| drivers/net/can/usb/ems_usb.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/gs_usb.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/gs_usb.c: atomic_t active_channels;
| drivers/net/can/usb/mcba_usb.c: atomic_t free_ctx_cnt;
| drivers/net/can/usb/usb_8dev.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/peak_usb/pcan_usb_core.h: atomic_t active_tx_urbs;
| drivers/net/can/usb/etas_es58x/es58x_core.h: atomic_t tx_urbs_idle_cnt;
| drivers/net/can/usb/etas_es58x/es58x_core.c: atomic_t *idle_cnt = &es58x_dev->tx_urbs_idle_cnt;

The only relevant one seems to be gs_usb with its atomic_t
active_channels. I looked at the code and the change to u8
shouldn't be too hard. Aside from that, I am also concerned by
the absence of an exit path in gs_can_open() to free the
allocated URB memory when an error occurs.
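
To illustrate what I mean by an exit path, here is a minimal
userspace sketch of the usual goto-style unwind. It is not the
gs_usb code: struct demo_urb_dev, DEMO_MAX_BUFS,
demo_alloc_bufs() and the fail_at parameter are all hypothetical
names, and malloc()/free() only stand in for the URB allocation
and release.

```c
#include <stdlib.h>

#define DEMO_MAX_BUFS 4	/* hypothetical count, not a gs_usb constant */

struct demo_urb_dev {
	void *bufs[DEMO_MAX_BUFS];	/* stand-in for the URB buffers */
};

/*
 * Allocate all buffers or none. fail_at simulates an allocation
 * failure at the given index; pass -1 to never fail artificially.
 */
static int demo_alloc_bufs(struct demo_urb_dev *dev, size_t size, int fail_at)
{
	int i;

	for (i = 0; i < DEMO_MAX_BUFS; i++) {
		dev->bufs[i] = (i == fail_at) ? NULL : malloc(size);
		if (!dev->bufs[i])
			goto err_free;
	}

	return 0;

err_free:
	/* Release only what was allocated before the failing index. */
	while (--i >= 0) {
		free(dev->bufs[i]);
		dev->bufs[i] = NULL;
	}

	return -1;
}
```

The point is only that a failure at index N leaves nothing
allocated behind, so the caller can simply propagate the error.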

I will send a patch to change active_channels from atomic_t to
u8. However, I will not rework the error path to free the
allocated URB memory.
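
For reference, the counter logic with a plain u8 can be sketched
as below. This is a userspace illustration and not the actual
patch: struct demo_chan_dev, demo_open(), demo_close() and the
simulate_alloc_failure parameter are hypothetical, and the sketch
assumes that every caller is already serialized by one outer lock
(the role played by rtnl_lock in the kernel).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_chan_dev {
	uint8_t opened_channel_cnt;	/* plain counter, no atomics */
	void *urb_mem;			/* stand-in for the URB memory */
};

/* Assumption: callers never run concurrently (outer lock held). */
static int demo_open(struct demo_chan_dev *dev, int simulate_alloc_failure)
{
	if (dev->opened_channel_cnt == 0) {
		/* First channel to open: allocate the shared memory. */
		dev->urb_mem = simulate_alloc_failure ? NULL : malloc(64);
		if (!dev->urb_mem)
			return -1;	/* counter untouched, nothing to undo */
	}

	/* Only count the channel once everything has succeeded. */
	dev->opened_channel_cnt++;

	return 0;
}

static void demo_close(struct demo_chan_dev *dev)
{
	assert(dev->opened_channel_cnt > 0);

	/* Last channel to close: release the shared memory. */
	if (--dev->opened_channel_cnt == 0) {
		free(dev->urb_mem);
		dev->urb_mem = NULL;
	}
}
```

Because the counter is only incremented after the allocation
succeeds, a failed open needs no rollback, and a second opener
cannot observe a non-zero count without the memory being there.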

Also, we need to double-check that none of the drivers uses a
spinlock or mutex in their open() or close() functions. I had a
first look and didn't find anything that stands out, but I will
need to spend a bit of extra time on that to confirm.


Yours sincerely,
Vincent Mailhol