Re: [PATCH net] ax25: fix use-after-free bugs caused by ax25_ds_del_timer

From: duoming
Date: Thu Mar 28 2024 - 22:05:02 EST


On Thu, 28 Mar 2024 18:12:50 +0000 Simon Horman wrote:
> > > > When the ax25 device is detaching, the ax25_dev_device_down()
> > > > calls ax25_ds_del_timer() to cleanup the slave_timer. When
> > > > the timer handler is running, the ax25_ds_del_timer() that
> > > > calls del_timer() in it will return directly. As a result,
> > > > the use-after-free bugs could happen, one of the scenarios
> > > > is shown below:
> > > >
> > > > (Thread 1) | (Thread 2)
> > > > | ax25_ds_timeout()
> > > > ax25_dev_device_down() |
> > > > ax25_ds_del_timer() |
> > > > del_timer() |
> > > > ax25_dev_put() //FREE |
> > > > | ax25_dev-> //USE
> > > >
> > > > In order to mitigate bugs, when the device is detaching, use
> > > > timer_shutdown_sync() to stop the timer.
> > >
> > > FWIIW, in my reading of things there is another failure mode whereby
> > > ax25_ds_timeout may rearm the timer after the call to del_timer() but
> > > before the call to ax25_dev_put().
> >
> > I think using timer_shutdown_sync() or del_timer_sync() to replace del_timer()
> > could prevent the rearm.
>
> I think only timer_shutdown() and timer_shutdown_sync() will prevent a
> rearm. But I also think (but am not entirely sure) this is only important
> in the ax25_dev_device_down() case (there are others, as you mention
> below).

When timer is rearmed in it's handler, the del_timer_sync() could prevent the
rearming. But when timer is rearmed in other threads, the del_timer_sync() could
not prevent it. The following code is apart of the del_timer_sync().

do {
ret = __try_to_del_timer_sync(timer, shutdown);

if (unlikely(ret < 0)) {
del_timer_wait_running(timer);
cpu_relax();
}
} while (ret < 0);

In the ax25_dev_device_down() case, I think it is better to use
timer_shutdown_sync().

> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
> > > > ---
> > > > net/ax25/ax25_ds_timer.c | 7 ++++++-
> > > > 1 file changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > > > index c4f8adbf814..5624c0d174c 100644
> > > > --- a/net/ax25/ax25_ds_timer.c
> > > > +++ b/net/ax25/ax25_ds_timer.c
> > > > @@ -43,7 +43,12 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> > > >
> > > > void ax25_ds_del_timer(ax25_dev *ax25_dev)
> > > > {
> > > > - if (ax25_dev)
> > > > + if (!ax25_dev)
> > > > + return;
> > > > +
> > > > + if (!ax25_dev->device_up)
> > > > + timer_shutdown_sync(&ax25_dev->dama.slave_timer);
> > > > + else
> > > > del_timer(&ax25_dev->dama.slave_timer);
> > > > }
> > >
> > > I think that a) it is always correct to call timer_shutdown_sync,
> > > and b) ax25_dev->device_up is always true. So a call to
> > > timer_shutdown_sync can simply replace the call to del_timer.
> >
> > I think timer_shutdown*() is used for the code path to clean up the
> > driver or detach the device. If timer is shut down by timer_shutdown*(),
> > it could not be re-armed again unless we reinitialize the timer. The
> > slave_timer should only be shut down when the ax25 device is detaching or
> > the driver is removing. And it should not be shut down in other scenarios,
> > such as called in ax25_ds_state2_machine() or ax25_ds_state3_machine().
> > So I think calling timer_shutdown_sync() is not always correct.
> >
> > What's more, the ax25_dev->device_up is not always true. It is set to
> > false in ax25_kill_by_device().
> >
> > In a word, the timer_shutdown_sync() could not replace the del_timer()
> > completely.
>
> Yes, sorry. I missed that ax25_ds_del_timer() is not
> only called from ax25_dev_device_down().
>
> > > Also, not strictly related, I think ax25_dev cannot be NULL,
> > > so that check could be dropped. But perhaps that is better left alone.
> >
> > The ax25_dev cannot not be NULL, because we only use ax25_dev_put() to
> > free the ax25_dev instead of setting is to NULL. So I think the check
> > could be dropped.
> >
> > Do you think the following plan is proper?
> >
> > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c
> > index c4f8adbf8144..f1cab4effa44 100644
> > --- a/net/ax25/ax25_ds_timer.c
> > +++ b/net/ax25/ax25_ds_timer.c
> > @@ -43,8 +43,7 @@ void ax25_ds_setup_timer(ax25_dev *ax25_dev)
> >
> > void ax25_ds_del_timer(ax25_dev *ax25_dev)
> > {
> > - if (ax25_dev)
> > - del_timer(&ax25_dev->dama.slave_timer);
> > + del_timer_sync(&ax25_dev->dama.slave_timer);
> > }
> >
> > There is no deadlock will happen.
>
> I'm actually getting to think that your original patch was correct.
> But perhaps a different approach would be to simply call
> timer_shutdown_sync() in ax25_dev_device_down(). And leave
> ax25_ds_del_timer() alone.

I think using timer_shutdown_sync() in ax25_dev_device_down() and
leaving ax25_ds_del_timer() alone is better than the original patch.

Thank you for your suggestions!

Best regards,
Duoming Zhou