Re: [PATCH v3 4/4] RISC-V: Allow booting kernel from any 4KB aligned address

From: Anup Patel
Date: Thu Mar 28 2019 - 06:24:37 EST


On Thu, Mar 28, 2019 at 3:22 PM Anup Patel <anup@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 28, 2019 at 1:25 PM Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Mar 27, 2019 at 12:54:41AM -0700, Christoph Hellwig wrote:
> > > On Mon, Mar 25, 2019 at 09:46:59PM +0530, Anup Patel wrote:
> > > > > Why do you even care about kernel mappings for non-existent RAM?
> > > >
> > > > We care because there will always be some buggy kernel driver/code going
> > > > out of bounds and accessing non-existent RAM. If we map all possible
> > > > kernel virtual addresses by default, then the behaviour of buggy accesses
> > > > will be unpredictable.
> > > >
> > > > Further, I think we should also make .text and .rodata sections of kernel
> > > > as read-only. This will protect kernel code and rodata.
> > >
> > > All of that is useful at the final_setup_vm() time - but none of it
> > > matters during early setup_vm where life is complicated.
> > >
> > > Mike suggested on the previous iteration that you only do smaller
> > > mappings when setting up the final mapping to avoid the ops churn,
> > > and I fully agree with him.
> > >
> > > So I would suggest we avoid the complicated, fiddly early boot changes
> > > that just add complexity, and you instead redirect your efforts to,
> > > say, implementing proper ro and non-executable sections using 4k mappings
> > > in the final VM setup only. That should actually lead to less code
> > > and complexity, and provide more benefits.
> >
> > It might be worth keeping trampoline_pg_dir if we are to split setup_vm().
> > Then setup_vm() will only initialize the trampoline_pg_dir and
> > final_setup_vm() will setup the swapper_pg_dir and switch to it.
> > Otherwise final_setup_vm() would need to update live mappings which might
> > be fragile.
> >
>
> We finally know the purpose of the trampoline_pg_dir page table.
>
> The trampoline_pg_dir is supposed to contain only one 2M/4M mapping
> to handle the case where PAGE_OFFSET < load_address.
>
> For 64bit systems, PAGE_OFFSET is a very high value, typically
> 0xFFFFxxxxxxxxxxxx, compared to the RAM start of 0x80000000. It is very
> unlikely that we will have enormous RAM ending somewhere around
> 0xFFFFxxxxxxxxxxxx.
>
> For 32bit systems, it is quite possible that the bootloader loads the
> kernel at load_address > 0x80000000. Let's say PAGE_OFFSET = 0xC0000000
> and load_address = 0xC0100000. Now the instruction which enables the
> MMU will be at 0xC0100xxx, and after enabling it the CPU will try to fetch
> the next instruction from 0xC0100xxx + 4. But virtual address 0xC0100000
> maps to physical address 0xC0200000 because load_address > PAGE_OFFSET,
> hence we will see the wrong instruction after enabling the MMU.
>
> (Note: Above explanation was provided by Anthony)
>
> I guess we will have to keep both trampoline_pg_dir and swapper_pg_dir
> init in setup_vm() because trampoline_pg_dir will contain just
> one 2M/4M mapping.
>
> Regards,
> Anup
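
To make the arithmetic in the 32bit example quoted above concrete, here
is a minimal sketch in plain C. It is not kernel code: the ex_* names
are made up for illustration, and it assumes a simple linear mapping in
which virtual address PAGE_OFFSET corresponds to physical address
load_address.

```c
#include <stdint.h>

/* Values taken from the 32bit example quoted above. */
#define EX_PAGE_OFFSET  0xC0000000u
#define EX_LOAD_ADDRESS 0xC0100000u

/*
 * Assumed linear kernel mapping: virtual address PAGE_OFFSET corresponds
 * to physical address load_address.
 */
static uint32_t ex_kernel_va_to_pa(uint32_t va)
{
	return va - EX_PAGE_OFFSET + EX_LOAD_ADDRESS;
}

/*
 * pc_pa is the physical address of the instruction that enables the MMU.
 * Once the MMU is on, pc_pa + 4 is treated as a virtual address, so the
 * next fetch actually comes from the physical address returned here.
 */
static uint32_t ex_next_fetch_pa(uint32_t pc_pa)
{
	return ex_kernel_va_to_pa(pc_pa + 4);
}
```

With a made-up pc_pa of 0xC0100040, ex_next_fetch_pa() gives 0xC0200044
rather than 0xC0100044, which is the wrong-instruction case the single
trampoline_pg_dir mapping is meant to handle.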

I think we should go ahead with your suggestion but with additional
constraints as follows:

1. The setup_vm() will map only vmlinux_start to vmlinux_end and
FDT for early mapping in trampoline_pg_dir
2. The setup_vm() will hit BUG_ON() if load_address is between
vmlinux_start and vmlinux_end.
3. The setup_vm_final() will create swapper_pg_dir from scratch
to avoid updating live mappings

Point 2 above essentially means that on 32bit/64bit systems
the bootloader cannot load the kernel in the physical address range
PAGE_OFFSET to PAGE_OFFSET + (vmlinux_size), even if
it is a valid RAM physical address range.
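
The check behind point 2 could look roughly like the sketch below. This
is plain C for illustration, not a proposed patch: the function name and
parameters are hypothetical, and treating the end of the range as
exclusive is my assumption. In setup_vm() the result would presumably
feed a BUG_ON().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the physical load address falls inside
 * [page_offset, page_offset + vmlinux_size), i.e. inside the range that
 * the kernel image will occupy in virtual address space.
 *
 * The subtraction form avoids overflow when page_offset + vmlinux_size
 * would wrap around.
 */
static bool ex_load_address_is_forbidden(uint64_t load_pa,
					 uint64_t page_offset,
					 uint64_t vmlinux_size)
{
	return load_pa >= page_offset &&
	       load_pa - page_offset < vmlinux_size;
}
```

For example, with PAGE_OFFSET = 0xC0000000 and an assumed 8MB image,
load_address = 0xC0100000 would be rejected while 0x80000000 would be
accepted.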

Suggestions?

Regards,
Anup