Re: [PATCH v2] MIPS: DEC: Restore bootmem reservation for firmware working memory area

From: Serge Semin
Date: Wed Oct 14 2020 - 22:14:46 EST


Maciej,
Regardless of a comment below feel free to add:
Reviewed-by: Serge Semin <fancer.lancer@xxxxxxxxx>

On Wed, Oct 14, 2020 at 10:34:56PM +0100, Maciej W. Rozycki wrote:
> Fix a crash on DEC platforms starting with:
>
> VFS: Mounted root (nfs filesystem) on device 0:11.
> Freeing unused PROM memory: 124k freed
> BUG: Bad page state in process swapper pfn:00001
> page:(ptrval) refcount:0 mapcount:-128 mapping:00000000 index:0x1 pfn:0x1
> flags: 0x0()
> raw: 00000000 00000100 00000122 00000000 00000001 00000000 ffffff7f 00000000
> page dumped because: nonzero mapcount
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.0-00858-g865c50e1d279 #1
> Stack : 8065dc48 0000000b 8065d2b8 9bc27dcc 80645bfc 9bc259a4 806a1b97 80703124
> 80710000 8064a900 00000001 80099574 806b116c 1000ec00 9bc27d88 806a6f30
> 00000000 00000000 80645bfc 00000000 31232039 80706ba4 2e392e35 8039f348
> 2d383538 00000070 0000000a 35363867 00000000 806c2830 80710000 806b0000
> 80710000 8064a900 00000001 81000000 00000000 00000000 8035af2c 80700000
> ...
> Call Trace:
> [<8004bc5c>] show_stack+0x34/0x104
> [<8015675c>] bad_page+0xfc/0x128
> [<80157714>] free_pcppages_bulk+0x1f4/0x5dc
> [<801591cc>] free_unref_page+0xc0/0x130
> [<8015cb04>] free_reserved_area+0x144/0x1d8
> [<805abd78>] kernel_init+0x20/0x100
> [<80046070>] ret_from_kernel_thread+0x14/0x1c
> Disabling lock debugging due to kernel taint
>
> caused by an attempt to free bootmem space that as from commit
> b93ddc4f9156 ("mips: Reserve memory for the kernel image resources") has
> not been anymore reserved due to the removal of generic MIPS arch code
> that used to reserve all the memory from the beginning of RAM up to the
> kernel load address.
>
> This memory does need to be reserved on DEC platforms however as it is
> used by REX firmware as working area, as per the TURBOchannel firmware
> specification[1]:
>
> Table 2-2 REX Memory Regions
> -------------------------------------------------------------------------
> Starting Ending
> Region Address Address Use
> -------------------------------------------------------------------------
> 0 0xa0000000 0xa000ffff Restart block, exception vectors,
> REX stack and bss
> 1 0xa0010000 0xa0017fff Keyboard or tty drivers
>
> 2 0xa0018000 0xa001f3ff 1) CRT driver
>
> 3 0xa0020000 0xa002ffff boot, cnfg, init and t objects
>
> 4 0xa0020000 0xa002ffff 64KB scratch space
> -------------------------------------------------------------------------
> 1) Note that the last 3 Kbytes of region 2 are reserved for backward
> compatibility with previous system software.
> -------------------------------------------------------------------------
>

...

> @@ -146,6 +150,9 @@ void __init plat_mem_setup(void)
>
> ioport_resource.start = ~0UL;
> ioport_resource.end = 0UL;

> +
> + /* Stay away from the firmware working memory area for now. */
> + memblock_reserve(PHYS_OFFSET, __pa_symbol(&_text) - PHYS_OFFSET);

I am just wondering...
Here we reserve a region within [PHYS_OFFSET, __pa_symbol(&_text)]. But then in
the prom_free_prom_memory() method we'll get to free a region as either
[PAGE_SIZE, __pa_symbol(&_text)] or [PAGE_SIZE, __pa_symbol(&_text) - 0x00020000].

First of all the start address of the being freed region is PAGE_SIZE, which doesn't
take the PHYS_OFFSET into account. I assume it won't cause problems because
PHYS_OFFSET is set to 0 for DEC. If so then we either shouldn't use PHYS_OFFSET
here or should take PHYS_OFFSET into account there on freeing or should just forget
about that since other than confusion it won't cause any problem.)
(I should have posted this question to v1 of this patch instead of pointing out
on the confusing size argument of the memblock_reserve() method. Sorry about
that.)

Secondly why is PAGE_SIZE selected as the start address? It belongs to the
Region 1 in accordance with "Table 2-2 REX Memory Regions" cited above. Thus we
get to keep reserved just a part of the Region 1. Most likely it's the restart
block and the exception vectors. Even assuming that the DEC developers knew what
they were doing, I am wondering can we be sure that a single page is enough for
all that data?..

-Sergey

> }
>
> /*