Re: [Intel-gfx] signal: break out of wait loops on kthread_stop()

From: Tvrtko Ursulin
Date: Thu Oct 20 2022 - 09:46:13 EST



On 19/10/2022 21:19, Jason A. Donenfeld wrote:
On Wed, Oct 19, 2022 at 09:09:28PM +0100, Tvrtko Ursulin wrote:
Hm why is kthread_stop() after kthread_run() abuse? I don't see it in
kerneldoc that it must not be used for stopping threads.

Because you don't want it to stop. You want to wait until it's done. If
you call stop right after run, it will even stop it before it even
begins to run. That's why you wind up sprinkling your msleeps
everywhere, indicating that clearly this is not meant to work that way.
Not after kthread_run which wakes it up already. If the kerneldoc for kthread_stop() is correct at least... In which case I really do think that the yields are pointless/red herring. Perhaps they predate kthread_run and then they were even wrong.

Yep the yields and sleeps are horrible and will go. But they are also
not relevant for the topic at hand.

Except they very much are. The reason you need these is because you're
using kthread_stop() for something it's not meant to do.

It is supposed to assert kthread_should_stop() which thread can look at as when to exit. Except that now it can fail to get to that controlled exit point. Granted that argument is moot since it implies incomplete error handling in the thread anyway.

Btw there are actually two use cases in our code base. One is thread controls the exit, second is caller controls the exit. Anyway...

Never mind, I was not looking for anything more than a suggestion on how
to maybe work around it in piece as someone is dealing with the affected
call sites.

Sultan's kthread_work idea is probably the right direction. This would
seem to have what you need.

... yes, it can be converted. Even though for one of the two use cases we need explicit signalling. There now isn't anything which would assert kthread_should_stop() without also asserting the signal, right?. Neither I found that the thread work API can do it.

Fingers crossed we were the only "abusers" of the API. There's a quite a number of kthread_stop callers and it would be a large job to audit them all.

Regards,

Tvrtko