Re: Bug report: kernel panicked while booting

From: Icenowy Zheng
Date: Mon Jun 05 2023 - 20:48:48 EST


On Mon, 2023-06-05 at 13:55 -0700, Atish Patra wrote:
> On Mon, Jun 5, 2023 at 8:13 AM Sunil V L <sunilvl@xxxxxxxxxxxxxxxx>
> wrote:
> >
> > On Mon, Jun 05, 2023 at 04:25:06PM +0200, Alexandre Ghiti wrote:
> > > Hi Song,
> > >
> > > On Mon, Jun 5, 2023 at 12:52 PM Song Shuai
> > > <songshuaishuai@xxxxxxxxxxx> wrote:
> > > >
> > > > Description of problem:
> > > >
> > > > When booting Linux with the RiscVVirtQemu edk2 firmware, a Store/AMO
> > > > page fault is taken and triggers a kernel panic.
> > > > The entire log has been posted at this link:
> > > > https://termbin.com/nga4.
> > > >
> > > > You can reproduce it with the following steps:
> > > >
> > > > 1. prepare the environment with
> > > >    - Qemu-virt:  v8.0.0 (with OpenSbi v1.2)
> > > >    - edk2 : at commit (2bc8545883 "UefiCpuPkg/CpuPageTableLib:
> > > > Reduce the number of random tests")
> > > >    - Linux : v6.4-rc1 and later version
> > > >
> > > > 2. start the Qemu virt board
> > > >
> > > > ```sh
> > > > $ cat ~/8_riscv/start_latest.sh
> > > > #!/bin/bash
> > > > /home/song/8_riscv/3_acpi/qemu/ooo/usr/local/bin/qemu-system-riscv64 \
> > > >         -s -nographic \
> > > >         -drive file=/home/song/8_riscv/3_acpi/Build_virt/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT.fd,if=pflash,format=raw,unit=1 \
> > > >         -machine virt,acpi=off -smp 2 -m 2G \
> > > >         -kernel /home/song/9_linux/linux/00_rv_def/arch/riscv/boot/Image \
> > > >         -initrd /home/song/8_riscv/3_acpi/buildroot/output/images/rootfs.ext2 \
> > > >         -append "root=/dev/ram ro console=ttyS0 earlycon=uart8250,mmio,0x10000000 efi=debug loglevel=8 memblock=debug" ## also panic by memtest
> > > > ```
> > > > 3. Then you will encounter the kernel panic logged in the above
> > > > link
> > > >
> > > > Other Information:
> > > >
> > > > 1. -------
> > > >
> > > > This report is not identical to my prior report -- "kernel
> > > > paniced when system hibernates" [1], but both of them
> > > > are closely related to the commit (3335068f8721 "riscv: Use
> > > > PUD/P4D/PGD pages for the linear mapping").
> > > >
> > > > With this commit, hibernation is trapped with "access fault"
> > > > while accessing the PMP-protected regions
> > > > (mmode_resv0@80000000)
> > > > from OpenSbi (BTW, hibernation is marked as nonportable by
> > > > Conor[2]).
> > > >
> > > > In this report, efi_init hands off the memory map from Boot
> > > > Services to memblock, which reserves mmode_resv0@80000000,
> > > > so there is no "access fault" but a "page fault".
> > > >
> > > > And reverting commit 3335068f8721 indeed fixed this panic.
> > > >
> > > > 2. -------
> > > >
> > > > As the gdb-pt-dump [3] tool shows, the PTE which covered the
> > > > fault virtual address had the appropriate permission to store.
> > > > Is there another way to trigger the "Store/AMO page fault"? Or
> > > > did the creation of the linear mapping in commit 3335068f8721 do
> > > > something wrong?
> > > >
> > > > ```
> > > > (gdb) p/x $satp
> > > > $1 = 0xa000000000081708
> > > > (gdb) pt -satp 0xa000000000081708
> > > >              Address :     Length   Permissions
> > > >   0xff1bfffffea39000 :     0x1000 | W:1 X:0 R:1 S:1
> > > >   0xff1bfffffebf9000 :     0x1000 | W:1 X:0 R:1 S:1
> > > >   0xff1bfffffec00000 :   0x400000 | W:1 X:0 R:1 S:1
> > > >   0xff60000000000000 :   0x1c0000 | W:1 X:0 R:1 S:1
> > > >   0xff60000000200000 :   0xa00000 | W:0 X:0 R:1 S:1
> > > >   0xff60000000c00000 : 0x7f000000 | W:1 X:0 R:1 S:1  // badaddr: ff6000007fdb1000
> > > >   0xff6000007fdc0000 :    0x3d000 | W:1 X:0 R:1 S:1
> > > >   0xff6000007ffbf000 :     0x1000 | W:1 X:0 R:1 S:1
> > > >   0xffffffff80000000 :   0xc00000 | W:0 X:1 R:1 S:1
> > > >   0xffffffff80c00000 :   0xa00000 | W:1 X:0 R:1 S:1
> > > >
> > > > ```
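
As a sanity check on the dump above, here is a minimal C sketch that tests whether an address falls inside one printed row, taking the start and length columns at face value. The struct and function names (sketch_range, sketch_range_contains) are made up for illustration and are not part of gdb-pt-dump or the kernel.

```c
#include <stdint.h>

/* One row of the dump: start virtual address and length in bytes. */
struct sketch_range {
    uint64_t start;
    uint64_t len;
};

/* Return nonzero when addr lies in the half-open range [start, start + len). */
static int sketch_range_contains(struct sketch_range r, uint64_t addr)
{
    /* The subtraction is only evaluated when addr >= start, so it cannot wrap. */
    return addr >= r.start && addr - r.start < r.len;
}
```

With the row annotated as covering badaddr (start 0xff60000000c00000, length 0x7f000000), the range ends at 0xff6000007fc00000, so ff6000007fdb1000 is not inside it as printed. It is also below the start of the next row (0xff6000007fdc0000), which may be worth double-checking against the live page tables.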
> > > >
> > > > 3. ------
> > > >
> > > > You can also reproduce a similar panic by appending "memtest" to
> > > > the kernel cmdline.
> > > > I have posted the memtest boot log at this link:
> > > > https://termbin.com/1twl.
> > > >
> > > > Please correct me if I'm wrong.
> > > >
> > > > [1]:
> > > > https://lore.kernel.org/linux-riscv/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@xxxxxxxxxxxxxx/
> > > > [2]:
> > > > https://lore.kernel.org/linux-riscv/20230526-astride-detonator-9ae120051159@wendy/
> > > > [3]: https://github.com/martinradev/gdb-pt-dump
> > >
> > > Thanks for the thorough report, really appreciated.
> > >
> > > So there are multiple issues here:
> > >
> > > - the first one is that the memory region for opensbi is marked
> > > as not
> > > cacheable in the efi memory map, and then this region is not
> > > mapped in
> > > the linear mapping:
> > > [    0.000000] efi:   0x000080000000-0x00008003ffff [Reserved    |   |  |  |  |  |  |  |  |  |   |  |  |  |UC]
> > >
>
> @Alex: The OpenSBI region is marked reserved because EDK2 chooses to
> do that explicitly as explained by Sunil.
> I don't think UC has anything to do with it. All the EFI memory
> regions are UC.
>
> > > - the second one (that I feel a bit ashamed of...) is that I did
> > > not
> > > check the alignment of the virtual address when choosing the map
> > > size
> > > in best_map_size() and then we end up trying to map a physical
> > > region
> > > aligned on 2MB that is actually not aligned on 2MB virtually
> > > because
> > > the opensbi region is not mapped at all.
> > >
> > > - the possible third one is that we should not map the linear
> > > mapping
> > > using 4K pages, this would be slow in my opinion, and I think we
> > > should waste a bit of memory to align va and pa on a 2MB
> > > boundary.
> > >
> > > So I'll fix the second issue, and possibly the third one, and if
> > > no
> > > one looks into why the opensbi region is mapped in UC, I'll take
> > > a
> > > look at edk2.
> > >
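
The alignment problem described in the second point can be illustrated with a small standalone C sketch. It is not the kernel's best_map_size(): the names sketch_best_map_size and sketch_both_aligned and the SKETCH_* sizes are made up for illustration, and only the 4K, 2MB and 1GB levels are modelled. The idea is that a large mapping is chosen only when the physical address, the virtual address and the remaining size all allow it.

```c
#include <stdint.h>

/* Illustrative page sizes; assumptions for this sketch, not kernel headers. */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << 12) /* 4 KiB */
#define SKETCH_PMD_SIZE  (UINT64_C(1) << 21) /* 2 MiB */
#define SKETCH_PUD_SIZE  (UINT64_C(1) << 30) /* 1 GiB */

/* Return nonzero when both pa and va are aligned to size (a power of two). */
static int sketch_both_aligned(uint64_t pa, uint64_t va, uint64_t size)
{
    return ((pa | va) & (size - 1)) == 0;
}

/*
 * Hypothetical variant of best_map_size(): pick the largest page size that
 * can map len bytes at physical address pa to virtual address va.
 * Checking pa alone is not enough; va must be aligned as well.
 */
static uint64_t sketch_best_map_size(uint64_t pa, uint64_t va, uint64_t len)
{
    if (len >= SKETCH_PUD_SIZE && sketch_both_aligned(pa, va, SKETCH_PUD_SIZE))
        return SKETCH_PUD_SIZE;
    if (len >= SKETCH_PMD_SIZE && sketch_both_aligned(pa, va, SKETCH_PMD_SIZE))
        return SKETCH_PMD_SIZE;
    return SKETCH_PAGE_SIZE;
}
```

For example, with pa 0x80200000 and a va whose low bits are 0x1c0000 (a hypothetical offset chosen for illustration), the sketch falls back to 4K pages even though pa itself is 2MB aligned.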
> > Hi Alex,
> >
> > EDK2 marks opensbi range as reserved memory in EFI map. According
> > to DT
> > spec, if the no-map is not set, we need to mark it as
> > EfiBootServicesData but EfiBootServicesData is actually considered
> > as
> > free memory in kernel, as per UEFI spec. To avoid kernel using this
> > memory, we deviated from the DT spec for opensbi ranges.
> >
>
> IMO, that should be the correct way unless we can change it to
> EfiRuntimeServicesData/Code.
> Looking at the U-Boot code, it sets the no-map region to
> EfiBootServicesData, which may cause
> issues on RISC-V as well if the linear mapping sets up the initial
> 2MB.

Semantically I think no-map means the kernel should not be utilizing
it, so it should be EfiRuntimeServicesData instead.

>
>
> > Let me know your thoughts on how we can handle this better in EDK2
> > considering it has to support ACPI also.
> >
> > Thanks,
> > Sunil
> >