Re: [Intel-gfx] signal: break out of wait loops on kthread_stop()

From: Eric W. Biederman
Date: Wed Nov 30 2022 - 11:01:18 EST


"Jason A. Donenfeld" <Jason@xxxxxxxxx> writes:

> Hi Eric,
>
> On Mon, Nov 28, 2022 at 7:22 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> writes:
>>
>> > On 19/10/2022 21:19, Jason A. Donenfeld wrote:
>> >> On Wed, Oct 19, 2022 at 09:09:28PM +0100, Tvrtko Ursulin wrote:
>> >>> Hm why is kthread_stop() after kthread_run() abuse? I don't see it in
>> >>> kerneldoc that it must not be used for stopping threads.
>> >> Because you don't want it to stop. You want to wait until it's done. If
>> >> you call stop right after run, it will even stop it before it even
>> >> begins to run. That's why you wind up sprinkling your msleeps
>> >> everywhere, indicating that clearly this is not meant to work that way.
>> > Not after kthread_run which wakes it up already. If the kerneldoc for
>> > kthread_stop() is correct at least... In which case I really do think
>> > that the yields are pointless/red herring. Perhaps they predate kthread_run and
>> > then they were even wrong.
>> >
>> >>> Yep the yields and sleeps are horrible and will go. But they are also
>> >>> not relevant for the topic at hand.
>> >> Except they very much are. The reason you need these is because you're
>> >> using kthread_stop() for something it's not meant to do.
>> >
>> > It is supposed to assert kthread_should_stop() which thread can look at as when
>> > to exit. Except that now it can fail to get to that controlled exit
>> > point. Granted that argument is moot since it implies incomplete error handling
>> > in the thread anyway.
>> >
>> > Btw there are actually two use cases in our code base. One is thread controls
>> > the exit, second is caller controls the exit. Anyway...
>> >
>> >>> Never mind, I was not looking for anything more than a suggestion on how
>> >>> to maybe work around it in piece as someone is dealing with the affected
>> >>> call sites.
>> >> Sultan's kthread_work idea is probably the right direction. This would
>> >> seem to have what you need.
>> >
>> > ... yes, it can be converted. Even though for one of the two use cases we need
>> > explicit signalling. There now isn't anything which would assert
>> > kthread_should_stop() without also asserting the signal, right?. Neither
>> > I found that the thread work API can do it.
>> >
>> > Fingers crossed we were the only "abusers" of the API. There's a quite a number
>> > of kthread_stop callers and it would be a large job to audit them all.
>>
>>
>> I have been out and am coming to this late. Did this get resolved?
>>
>>
>> I really don't expect this affected much of anything else as the code
>> sat in linux-next for an entire development cycle before being merged.
>>
>> But I would like to make certain problems with this change were resolved.
>
> I just checked drm-next, and it looks like the i915 people resolved
> their issue, and also got rid of those pesky yield()s in the process:
> https://cgit.freedesktop.org/drm/drm/commit/?id=6407cf533217e09dfd895e64984c3f1ee3802373
>

Thank you for verifying this has been resolved.

Eric