Re: [PATCH RFC] binfmt_elf: fully allocate bss pages

From: Eric W. Biederman
Date: Mon Sep 25 2023 - 05:54:05 EST


Sebastian Ott <sebott@xxxxxxxxxx> writes:

> On Sun, 24 Sep 2023, Eric W. Biederman wrote:
>> Sebastian Ott <sebott@xxxxxxxxxx> writes:
>>
>>> Hej,
>>>
>>> since we figured that the proposed patch is not going to work I've spent a
>>> couple more hours looking at this (some static binaries on arm64 segfault
>>> during load [0]). The segfault happens because of a failed clear_user()
>>> call in load_elf_binary(). The address we try to write zeros to is mapped with
>>> correct permissions.
>>>
>>> After some experiments I've noticed that writing to anonymous mappings works
>>> fine and all the error cases happened on file-backed VMAs. Debugging showed that
>>> in elf_map() we call vm_mmap() with a file offset of 15 pages - for a binary
>>> that's less than 1KiB in size.
>>>
>>> Looking at the ELF headers again that 15 pages offset originates from the offset
>>> of the 2nd segment - so, I guess the loader did as instructed and that binary is
>>> just too nasty?
>>>
>>> Program Headers:
>>>   Type           Offset             VirtAddr           PhysAddr
>>>                  FileSiz            MemSiz              Flags  Align
>>>   LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
>>>                  0x0000000000000178 0x0000000000000178  R E    0x10000
>>>   LOAD           0x000000000000ffe8 0x000000000041ffe8 0x000000000041ffe8
>>>                  0x0000000000000000 0x0000000000000008  RW     0x10000
>>>   NOTE           0x0000000000000120 0x0000000000400120 0x0000000000400120
>>>                  0x0000000000000024 0x0000000000000024  R      0x4
>>>   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
>>>                  0x0000000000000000 0x0000000000000000  RW     0x10
>>>
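
The 15-page file offset can be reproduced from the second LOAD header above. The following is a minimal sketch of that arithmetic, not the kernel code itself: it assumes 4 KiB pages (the only page size for which 0xffe8 rounds down to 15 pages), and the helper names `page_start` and `page_offset` are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, implied by the "15 pages" figure


def page_start(value, page_size=PAGE_SIZE):
    """Round value down to the start of its page."""
    return value & ~(page_size - 1)


def page_offset(value, page_size=PAGE_SIZE):
    """Return the offset of value within its page."""
    return value & (page_size - 1)


# Values copied from the second LOAD program header.
p_offset = 0xFFE8
p_vaddr = 0x41FFE8

# A file mapping must start on a page boundary, so both the file offset
# and the virtual address are pulled back by the in-page offset.
map_file_offset = p_offset - page_offset(p_vaddr)
map_addr = page_start(p_vaddr)

print(hex(map_file_offset), map_file_offset // PAGE_SIZE)  # 0xf000 15
print(hex(map_addr))                                       # 0x41f000

# The binary is reported as smaller than 1 KiB, so the mapping starts
# well past the end of the file.
print(map_file_offset >= 1024)                             # True
```

With FileSiz equal to 0 the segment contributes no file bytes at all, yet the mapping still refers to a file offset that lies beyond the end of the file.
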
>>> As an additional test I've added a bunch of zeros at the end of that binary
>>> so that the offset is within that file and it did load just fine.
>>>
>>> On the other hand there is this section header:
>>>   [ 4] .bss              NOBITS           000000000041ffe8  0000ffe8
>>>        0000000000000008  0000000000000000  WA       0     0     1
>>>
>>> "sh_offset
>>> This member's value gives the byte offset from the beginning of the file to
>>> the first byte in the section. One section type, SHT_NOBITS described
>>> below, occupies no space in the file, and its sh_offset member locates
>>> the conceptual placement in the file.
>>> "
>>>
>>> So, still not sure what to do here..
>>>
>>> Sebastian
>>>
>>> [0] https://lore.kernel.org/lkml/5d49767a-fbdc-fbe7-5fb2-d99ece3168cb@xxxxxxxxxx/
>>
>> I think that .bss section that is being generated is atrocious.
>>
>> At the same time I looked at what the Linux ELF loader is trying to do,
>> and the ELF loader's handling of program segments with memsz > filesz
>> has serious remnants of a.out programs allocating memory with the brk
>> syscall.
>>
>> Lots of the structure looks like it started with the assumption that
>> there would only be a single program header with memsz > filesz, and
>> that was the .bss, the way things were in the a.out days. Handling of
>> other cases has been debugged in later.
>>
>> So I have modified elf_map to always return successfully when there is
>> a zero filesz in the program header for an elf segment.
>>
>> Then I have factored out a function clear_tail that ensures the zero
>> padding for an entire elf segment is present.
>>
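
To make the zero-padding requirement concrete, here is a rough sketch of which address ranges of a segment with memsz > filesz have to read as zeros. It is only an illustration of the layout arithmetic and not the clear_tail function from the patch: the function name `zero_fill_ranges`, the 4 KiB page size, and the split into a "partial page" range and an "anonymous pages" range are all assumptions made for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_align_up(value, page_size=PAGE_SIZE):
    """Round value up to the next page boundary."""
    return (value + page_size - 1) & ~(page_size - 1)


def zero_fill_ranges(p_vaddr, p_filesz, p_memsz):
    """Return (partial, anon) address ranges that must be zero.

    partial: bytes after the file data, up to the end of that page or
             the end of the segment, whichever comes first.
    anon:    whole pages after that, up to the page-aligned segment end.
    Either element is None when the range is empty.
    """
    if p_memsz <= p_filesz:
        return None, None

    file_end = p_vaddr + p_filesz
    mem_end = p_vaddr + p_memsz

    partial_end = min(page_align_up(file_end), mem_end)
    partial = (file_end, partial_end) if partial_end > file_end else None

    anon_start = page_align_up(file_end)
    anon_end = page_align_up(mem_end)
    anon = (anon_start, anon_end) if anon_end > anon_start else None
    return partial, anon


# Second LOAD segment from the report: filesz 0, memsz 8.
partial, anon = zero_fill_ranges(0x41FFE8, 0x0, 0x8)
print([hex(a) for a in partial], anon)
```

For the reported binary the whole 8-byte .bss falls into the partial range, inside a page whose file mapping starts past the end of the file, and no additional anonymous pages are needed.
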
>> Please test this and see if it causes your test case to work.
>
> Sadly, that causes issues for other programs:

Bah. Too much cleanup at once.

I will respin.

> [ 44.164596] Run /init as init process
> [ 44.168763] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 44.176409] CPU: 32 PID: 1 Comm: init Not tainted 6.6.0-rc2+ #89
> [ 44.182404] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020
> [ 44.189786] Call trace:
> [ 44.192220] dump_backtrace+0xa4/0x130
> [ 44.195961] show_stack+0x20/0x38
> [ 44.199264] dump_stack_lvl+0x48/0x60
> [ 44.202917] dump_stack+0x18/0x28
> [ 44.206219] panic+0x2e0/0x350
> [ 44.209264] do_exit+0x370/0x390
> [ 44.212481] do_group_exit+0x3c/0xa0
> [ 44.216044] get_signal+0x800/0x808
> [ 44.219521] do_signal+0xfc/0x200
> [ 44.222824] do_notify_resume+0xc8/0x418
> [ 44.226734] el0_da+0x114/0x120
> [ 44.229866] el0t_64_sync_handler+0xb8/0x130
> [ 44.234124] el0t_64_sync+0x194/0x198
> [ 44.237776] SMP: stopping secondary CPUs
> [ 44.241740] Kernel Offset: disabled
> [ 44.245215] CPU features: 0x03000000,14028142,10004203
> [ 44.250342] Memory Limit: none
> [ 44.253383] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
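
For readers decoding the panic line: assuming the exitcode uses the usual wait-status layout (low 7 bits hold the terminating signal, bit 7 the core-dump flag), 0x0000000b means init was killed by signal 11. A small sketch of that decoding, with the signal name looked up on the machine running the script (Linux numbering assumed):

```python
import signal

exitcode = 0x0000000B  # value from the panic message

signum = exitcode & 0x7F             # terminating signal number
core_dumped = bool(exitcode & 0x80)  # core-dump flag

# Name lookup depends on the platform's signal numbering.
name = signal.Signals(signum).name

print(signum, core_dumped, name)
```

On Linux signal 11 is SIGSEGV, which matches the el0_da (EL0 data abort) entry in the call trace.
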

Eric