Re: [PATCH] irqchip/gic-v3: Check SRE bit for GICv2 legacy support

From: Ard Biesheuvel
Date: Mon Nov 30 2020 - 13:32:41 EST


On Mon, 30 Nov 2020 at 17:22, Shameerali Kolothum Thodi
<shameerali.kolothum.thodi@xxxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Marc Zyngier [mailto:maz@xxxxxxxxxx]
> > Sent: 30 November 2020 14:57
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > eric.auger@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>
> > Subject: Re: [PATCH] irqchip/gic-v3: Check SRE bit for GICv2 legacy support
> >
> > Hi Shameer,
> >
> > On 2020-11-30 13:55, Shameerali Kolothum Thodi wrote:
> > > Hi Marc,
> > >
> > >> -----Original Message-----
> > >> From: Marc Zyngier [mailto:maz@xxxxxxxxxx]
> > >> Sent: 30 November 2020 12:28
> > >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > >> Cc: linux-kernel@xxxxxxxxxxxxxxx;
> > >> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > >> eric.auger@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>
> > >> Subject: Re: [PATCH] irqchip/gic-v3: Check SRE bit for GICv2 legacy
> > >> support
> > >>
> > >> Hi Shameer,
> > >>
> > >> On 2020-11-30 10:26, Shameer Kolothum wrote:
> > >> > At present, the support for GICv2 backward compatibility on GICv3/v4
> > >> > hardware is determined based on whether DT/ACPI provides a memory
> > >> > mapped phys base address for GIC virtual CPU interface register(GICV).
> > >> > This creates a problem that a Qemu guest boot with default GIC(GICv2)
> > >>
> > >> That'd be true of *any* guest using GICv2, not just when using QEMU as
> > >> the VMM, right?
> > >
> > > Yes, I would think so.
> > >
> > >> > hangs when firmware falsely reports this address on systems that don't
> > >> > have support for legacy mode.
> > >>
> > >> And I guess it isn't just the guest that hangs, but the whole system
> > >> can
> > >> go south (it would be totally legitimate for the HW to deliver a
> > >> SError).
> > >
> > > So far I haven’t seen that happening. I was able to kill the Guest and
> > > recover.
> > > But the annoying thing is Guest boot hangs at random places without any
> > > error reported and people end up spending lot of time only to be told
> > > later
> > > that gic-version=3 is missing from their scripts.
> >
> > That's pretty lucky. The guest has been reading/writing to random
> > places,
> > and depending on where this maps in the physical space, anything can
> > happen. Out of (morbid) curiosity, what is at the address pointed to by
> > GICC in MADT?
>
> This is what it reports,
>
> [02Ch 0044 1] Subtable Type : 0B [Generic Interrupt Controller]
> [02Dh 0045 1] Length : 50
> ...
> [04Ch 0076 8] Base Address : 000000009B000000
> [054h 0084 8] Virtual GIC Base Address : 000000009B020000
> [05Ch 0092 8] Hypervisor GIC Base Address : 000000009B010000
> [064h 0100 4] Virtual GIC Interrupt : 00000019
> [068h 0104 8] Redistributor Base Address : 00000000AE100000
> [070h 0112 8] ARM MPIDR : 0000000000080000
> [078h 0120 1] Efficiency Class : 15
> [079h 0121 3] Reserved : 001500
>
> > >
> > >> > As per GICv3/v4 spec, in an implementation that does not support legacy
> > >> > operation, affinity routing and system register access are permanently
> > >> > enabled. This means that the associated control bits are RAO/WI. Hence
> > >> > use the ICC_SRE_EL1.SRE bit to decide whether hardware supports
> > GICv2
> > >> > mode in addition to the above firmware based check.
> > >> >
> > >> > Signed-off-by: Shameer Kolothum
> > <shameerali.kolothum.thodi@xxxxxxxxxx>
> > >> > ---
> > >> > On Hisilicon D06, UEFI sets the GIC MADT GICC gicv_base_address but
> > the
> > >> > GIC implementation on these boards doesn't have the GICv2 legacy
> > >> > support.
> > >> > This results in, Guest boot hang when Qemu uses the default GIC option.
> > >>
> > >> What a bore. Is this glorious firmware really out in the wild?
> > >
> > > :(. I am afraid it is.
> >
> > Meh. We'll have to paper over it then. How urgent is that?
>
> It is not that urgent urgent but 5.10 support would be nice :)
>
> >
> > [...]
> >
> > >> How about this instead? Completely untested, of course.
> > >
> > > Thanks for that. I just tested and it works.
> >
> > OK. I'll rework it a bit and post it as a complete patch. Is there an
> > erratum number on your side?
>
> Sure. I am not sure on erratum, but will check internally and get back to you
> if there is one.
>

Any clue why production D06 firmware deviates from the D06 port that
exists in Tianocore's edk2-platforms repository? Because that version
does not have this bug, and I wonder why that code was upstreamed at
all if a substantially different version gets shipped with production
hardware.