Re: [PATCH] x86_64: uninline TASK_SIZE

From: Andy Lutomirski
Date: Mon Apr 22 2019 - 20:54:53 EST


On Mon, Apr 22, 2019 at 3:09 PM Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
>
> On Mon, Apr 22, 2019 at 07:30:40AM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Apr 22, 2019, at 3:34 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > >
> > >
> > > * Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
> > >
> > >>>>> +++ b/arch/x86/kernel/task_size_64.c
> > >>>>> @@ -0,0 +1,9 @@
> > >>>>> +#include <linux/export.h>
> > >>>>> +#include <linux/sched.h>
> > >>>>> +#include <linux/thread_info.h>
> > >>>>> +
> > >>>>> +unsigned long _task_size(void)
> > >>>>> +{
> > > >>>>> + return test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX;
> > >>>>> +}
> > >>>>> +EXPORT_SYMBOL(_task_size);
> > >>>>
> > > >>>> Good idea - but instead of adding yet another compilation unit, why not
> > > >>>> stick _task_size() into arch/x86/kernel/process_64.c, which is the
> > > >>>> canonical place for process management related arch functions?
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Ingo
> > >>>
> > >>> Better yet... since TIF_ADDR32 isn't something that changes randomly,
> > >>> perhaps this should be a separate variable?
> > >>
> > >> Maybe. I only thought about putting every 32-bit related flag under
> > >> CONFIG_COMPAT to further eradicate bloat (and force everyone else to
> > >> keep an eye on it, ha-ha).
> > >
> > > Basically TIF_ADDR32 is only set for a task if set_personality_ia32() is
> > > called, which function is called in the following circumstances:
> > >
> > > - arch/x86/ia32/ia32_aout.c:load_aout_binary()
> > >
> > > This is in exec(), when a new binary is loaded for the current task,
> > > via search_binary_handler() and exec_binprm(). Ordering is
> > > synchronous, AFAICS there can be no race between TASK_SIZE users and
> > > the set_personality_ia32() call which is always for the current task.
> > >
> > > - in COMPAT_SET_PERSONALITY(), which through macro detours ends up being
> > > in SET_PERSONALITY2(), which is used in fs/compat_binfmt_elf.c's
> > > load_elf_binary(), used in a similar fashion in exec() as the AOUT
> > > case above. One particular macro detour of note is that
> > > fs/compat_binfmt_elf.c #includes fs/binfmt_elf.c and re-defines the
> > > personality setting method to map to set_personality_ia32().
> > >
> > > When set_personality_ia32() is called then TIF_ADDR32 is set
> > > unconditionally, without any Kconfig variations.
> > >
> > > TIF_ADDR32 is cleared:
> > >
> > > - In set_personality_64bit(), when a 64-bit binary is loaded via
> > > fs/binfmt_elf.c.
> > >
> > > - It also defaults to clear in the init task, which is inherited by the
> > > initial kernel threads and any user-space task they might end up
> > > executing.
> > >
> > > So the conclusion is that IMO we can safely put TASK_SIZE into a new
> > > thread_info()->task_size field, and:
> > >
> > > - change ->task_size to the 32-bit address space in
> > > set_personality_ia32()
> > >
> > > > - change ->task_size to the 64-bit address space in the init task and in
> > > > set_personality_64bit().
> > >
> > > This should cover it I think, unless I missed something.
> > >
> >
> > Are there really enough TASK_SIZE users to justify any of this?
>
> Saving 2KB on a defconfig is quite a lot.

Saving 2kB of text by adding 8 bytes to thread_info seems rather
dubious to me. You only need 256 tasks before you lose. My
not-particularly-loaded laptop has 865 tasks right now.
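
To spell out the arithmetic, here is a rough sketch. The 2048 and 8
are the estimates from this thread, not measurements, and the helper
names are made up for illustration:

```c
/* Assumed figures from this thread: roughly 2 kB of text saved, and
 * 8 bytes (one unsigned long on x86_64) added to every thread_info. */
#define TEXT_SAVED_BYTES	2048UL
#define PER_TASK_COST_BYTES	8UL

/* Task count at which the per-task cost equals the text saved. */
static unsigned long break_even_tasks(void)
{
	return TEXT_SAVED_BYTES / PER_TASK_COST_BYTES;
}

/* Net bytes saved (positive) or lost (negative) with ntasks tasks alive. */
static long net_savings(unsigned long ntasks)
{
	return (long)TEXT_SAVED_BYTES - (long)(ntasks * PER_TASK_COST_BYTES);
}
```

With those numbers, break_even_tasks() is 256 and net_savings(865) is
negative, i.e. a net loss of memory at my laptop's current task count.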

As a general principle, the mere existence of TIF_ADDR32 is a bug.
The value of that flag is *wrong* under the 32-bit variant of CRIU.
How about instead making some more progress toward getting rid of
dubious TASK_SIZE users? I'm working on a little series to get rid of
most of them. Meanwhile: it sure looks like a large fraction of the
users are confused as to whether TASK_SIZE is the highest user address
or the lowest non-user address.
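
To illustrate why the distinction matters, here is a minimal sketch of
the two kinds of range check. The function names and the small limit
values are hypothetical, not taken from the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit is the lowest non-user address (exclusive bound):
 * the range [addr, addr + len) must end at or before limit.
 * Written to avoid overflow in addr + len. */
static bool range_ok_exclusive(uintptr_t addr, uintptr_t len, uintptr_t limit)
{
	return len <= limit && addr <= limit - len;
}

/* Limit is the highest user address (inclusive bound):
 * the last byte, addr + len - 1, must not exceed max_addr. */
static bool range_ok_inclusive(uintptr_t addr, uintptr_t len, uintptr_t max_addr)
{
	if (len == 0)
		return addr <= max_addr;	/* empty range: only check addr */
	return len - 1 <= max_addr && addr <= max_addr - (len - 1);
}
```

Suppose user space is 0x0 through 0xfff. Passing 0xfff to the exclusive
check rejects the valid range 0xff0..0xfff, and passing 0x1000 to the
inclusive check accepts a 16-byte range at 0xff1 whose last byte is
0x1000. A caller that guesses the wrong meaning is off by one in
one direction or the other.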