Re: [BUG] rtc-efi: Error in efi.get_time() spams dmesg with error message

From: Alexandre Belloni
Date: Wed Nov 09 2022 - 15:31:50 EST


On 09/11/2022 14:51:19+0000, Alexandru Elisei wrote:
> Hi,
>
> On Tue, Nov 08, 2022 at 10:41:18PM +0100, Alexandre Belloni wrote:
> > On 08/11/2022 10:55:15+0000, Alexandru Elisei wrote:
> > > Hi,
> > >
> > > Commit d3549a938b73 ("efi/arm64: libstub: avoid SetVirtualAddressMap() when
> > > possible") exposed a firmware error on an Ampere Altra machine that was
> > > causing the machine to panic. Then commit 23715a26c8d8 ("arm64: efi:
> > > Recover from synchronous exceptions occurring in firmware") made the EFI
> > > exception non-fatal, and disabled runtime services when the exception
> > > happens. The interaction between those two patches are being discussed in a
> > > separate thread [1], but that should be orthogonal to this.
> > >
> > > Now efi.get_time() fails and each time an error message is printed to
> > > dmesg, which happens several times a second and clutters dmesg
> > > unnecessarily, to the point it becomes unusable.
> > >
> > > I was wondering if it would be possible to turn dev_err() into a
> > > dev_WARN_ONCE() or do something to avoid this issue. Tried to replace
> > > dev_err() with dev_err_ratelimited(), and the error message was displayed
> > > less often (about once per second), but dmesg was still being cluttered.
> > >
> >
> > The question this raise is what is actually trying to read the RTC this
> > often?
> >
> > This should be read once at boot and maybe every time you wake up from
> > suspend but there is no real reason to read it multiple times per
> > seconds.
>
> Reverted the commit the exposed the firmware bug, which means rtc-efi works as
> it should. Added these debug statements to check how many times efi_read_time()
> is called if there are no errors:
>
> --- a/drivers/rtc/rtc-efi.c
> +++ b/drivers/rtc/rtc-efi.c
> @@ -154,6 +154,7 @@ static int efi_set_alarm(struct device *dev, struct rtc_wkalrm *wkalrm)
> return status == EFI_SUCCESS ? 0 : -EINVAL;
> }
>
> +static unsigned long i = 0;
> static int efi_read_time(struct device *dev, struct rtc_time *tm)
> {
> efi_status_t status;
> @@ -162,6 +163,9 @@ static int efi_read_time(struct device *dev, struct rtc_time *tm)
>
> status = efi.get_time(&eft, &cap);
>
> + i++;
> + pr_info("%s: Call number %lu\n", __func__, i);
> +
> if (status != EFI_SUCCESS) {
> /* should never happen */
> dev_err(dev, "can't read time\n");
>
> The function gets called 3 times, twice during boot and once after. I would say
> that efi_read_time() gets called so many times because it fails.
>

It should really get called only once, at device registration when
CONFIG_RTC_HCTOSYS is set (which I despise):
https://elixir.bootlin.com/linux/latest/source/drivers/rtc/class.c#L431

Could you maybe use dump_stack() ?


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com