Re: [PATCH RFC] kthread: Unify kernel_thread() and user_mode_thread()

From: Eric W. Biederman
Date: Mon May 15 2023 - 10:43:17 EST


Huacai Chen <chenhuacai@xxxxxxxxxx> writes:

> Hi, Eric,
>
> On Wed, May 10, 2023 at 11:45 PM Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> Huacai Chen <chenhuacai@xxxxxxxxxxx> writes:
>>
>> > Commit 343f4c49f2438d8 ("kthread: Don't allocate kthread_struct for init
>> > and umh") introduces a new function user_mode_thread() for init and umh.
>> > But the name is a bit confusing because init and umh are indeed kernel
>> > threads at creation time, the real difference is "they will become user
>> > processes".
>>
>> No they are not "kernel threads" at creation time. At creation time
>> init and umh are threads running in the kernel.
>>
>> It is a very important distinction and you are losing it.
>>
>> Because they don't have a kthread_struct such tasks in the kernel
>> are not allowed to depend on anything that is ``kthread''.
> Hmm, traditionally, we call a "task" without userland address space
> (i.e., the task_struct has no mm, it shares kernel's address space) as
> a kernel thread, so init and umh are kernel threads until they call
> kernel_execve().

No.

The important distinction is not the userland address space.

The important distinction is how such tasks interact with the rest of
the system.

It is true the mm does not initially have userspace content but
that does not change the fact that it is a valid userspace mm.

For scheduling, for signal delivery, and for everything else
these tasks are userspace tasks.

The very important detail is that it is not at kernel_execve time that
the distinction is made, but that it is made earlier when the thread
is created.

This is a subtle change from the way things used to work once upon a
time. But the way things used to work was buggy and racy. Deciding at
thread creation time what the thread will be used for, what limitations
apply, and so on is much less error prone.

We had this concept of kthread_create that used to create a special
class of tasks. What was special, and what extra could be done with
those tasks, was defined by the presence of "struct kthread" (my
apologies, I misspoke when I said kthread_struct earlier).

Then, because that specialness was needed on other tasks, struct kthread
started to be added to tasks at run-time. That run-time addition of
struct kthread introduced races that complicated the code, and had
bugs.
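
To make the creation-time point concrete, here is a minimal userspace
sketch of the idea. It is not kernel code: task_model, kthread_model,
task_create() and the other names are hypothetical stand-ins made up
for illustration. The only thing it is meant to show is that the kind
of task is fixed when the task is created, and that nothing attaches
a "struct kthread" equivalent afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for "struct kthread"; not the kernel definition. */
struct kthread_model {
	int should_stop;
};

/* Hypothetical stand-in for a task. */
struct task_model {
	struct kthread_model *kthread;	/* set at creation only, never later */
	bool may_exec_userspace;	/* fixed at creation as well */
};

enum task_kind {
	TASK_KIND_KTHREAD,	/* stays in the kernel for its whole life */
	TASK_KIND_USER_MODE,	/* will eventually run userspace code */
};

/* The kind is chosen here, once. There is no later "convert" operation. */
static struct task_model *task_create(enum task_kind kind)
{
	struct task_model *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;

	if (kind == TASK_KIND_KTHREAD) {
		t->kthread = calloc(1, sizeof(*t->kthread));
		if (!t->kthread) {
			free(t);
			return NULL;
		}
	} else {
		t->may_exec_userspace = true;
	}
	return t;
}

/* Anything kthread-specific may rely on this answer never changing. */
static bool task_is_kthread(const struct task_model *t)
{
	return t->kthread != NULL;
}

/* Models exec: returns 0 on success, -1 if the task was created as a kthread. */
static int task_exec(const struct task_model *t)
{
	return t->may_exec_userspace ? 0 : -1;
}

static void task_destroy(struct task_model *t)
{
	if (!t)
		return;
	free(t->kthread);
	free(t);
}
```

In this model a task created with TASK_KIND_USER_MODE never has the
kthread pointer set, so no code can race against that pointer
appearing later, and a task created with TASK_KIND_KTHREAD is refused
by task_exec().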

> Of course in your patch a kernel thread should have a
> "kthread" struct (I can't grep "kthread_struct" so I suppose you are
> saying "kthread"), but I think the traditional definition is more
> natural for most people?

"Natural" and "traditional" make for a silly argument. The fact is those
are tasks that ultimately run userspace code. That ability needs to
be decided upon at creation time to make them race free.

Therefore the old code and definition are wrong.

Eric