Re: timer list corruption in devfreq

From: Tejun Heo
Date: Thu Nov 09 2023 - 14:08:44 EST


Hello,

On Wed, Nov 08, 2023 at 09:39:57PM +0530, Mukesh Ojha wrote:
> We are facing an issue on the 6.1 kernel while using the devfreq
> framework. It looks like devfreq_monitor_stop()/devfreq_monitor_start()
> are vulnerable if the governor is changed frequently from user space
> in a loop:
>
> echo simple_ondemand > /sys/class/devfreq/1d84000.ufshc/governor
> echo performance > /sys/class/devfreq/1d84000.ufshc/governor
>
> Here we are using a UFS device, but it could be any device.
>
> The issue is that the same timer instance is being queued from two
> places, once from devfreq_monitor() and once from devfreq_monitor_start(),
> because cancel_delayed_work_sync() in devfreq_monitor_stop() was not
> able to delete the delayed work timer completely, so the
> devfreq_monitor() work rearmed the same timer.
>
> There also looks to be an issue in the timer framework, which
> was initially discussed in [1] and later fixed in [2],
> but we are not sure whether the issue is in cancel_delayed_work_sync(),
> where del_timer() inside try_to_grab_pending() needs to be replaced
> with timer_delete[_sync](), or whether devfreq_monitor_stop() needs to
> use these APIs and then delete the work.

So, having a shutdown operation can be more convenient in some cases, and
that'd be a useful addition to workqueue for both immediate and delayed
work items. That said, it's usually not essential for fixing these issues -
e.g. can't you just synchronize devfreq_monitor_start() and stop()?
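
To illustrate the kind of synchronization meant here, below is a minimal
userspace sketch, not the devfreq code and not a proposed patch. It assumes
one lock plus a stop flag shared by start, stop and the work handler. All
names (struct monitor, monitor_start(), monitor_stop(), monitor_work(),
arm_timer(), stop_polling, armed, arm_count) are hypothetical, and a pthread
mutex stands in for whatever lock the driver would really use. The assert in
arm_timer() models the corruption: arming a timer that is already armed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the monitor state. */
struct monitor {
	pthread_mutex_t lock;	/* serializes start, stop and the handler */
	bool stop_polling;	/* set by stop so the handler does not rearm */
	bool armed;		/* models "delayed work is pending" */
	int arm_count;		/* total number of times the timer was armed */
};

#define MONITOR_INIT { PTHREAD_MUTEX_INITIALIZER, true, false, 0 }

/* Caller holds m->lock. Arming an armed timer models the list corruption. */
static void arm_timer(struct monitor *m)
{
	assert(!m->armed);
	m->armed = true;
	m->arm_count++;
}

/* Start polling; a second start while armed is a no-op. */
void monitor_start(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	if (!m->armed)
		arm_timer(m);
	m->stop_polling = false;
	pthread_mutex_unlock(&m->lock);
}

/* Stop polling: forbid rearming first, then drop the pending timer. */
void monitor_stop(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	m->stop_polling = true;
	m->armed = false;	/* stands in for cancelling the pending work */
	pthread_mutex_unlock(&m->lock);
}

/* Timer expiry: the handler rearms only if polling is still enabled. */
void monitor_work(struct monitor *m)
{
	pthread_mutex_lock(&m->lock);
	m->armed = false;
	if (!m->stop_polling)
		arm_timer(m);
	pthread_mutex_unlock(&m->lock);
}
```

A handler that runs late, after monitor_stop(), sees stop_polling set and
does not rearm, so a following monitor_start() arms the timer exactly once.
In kernel code the stop path would presumably have to drop the lock before
calling cancel_delayed_work_sync() if the work handler takes the same lock,
since that call waits for a running handler to finish.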

Thanks.

--
tejun