Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of system)

From: Nick Hastings
Date: Mon Jun 26 2023 - 18:34:11 EST


Hi Thorsten,

* Linux regression tracking (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx> [230626 21:09]:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Nick, what's the status/was there any progress? Did you do what Mario
> suggested and file a nouveau bug?

It was not apparent that the suggestion to open "a Nouveau drm bug" was
addressed to me.

> I ask, as I still have this on my list of regressions and it seems there
> was no progress in three+ weeks now.

I have not pursued this further: as far as I could tell, I had already
provided all the requested information, and I don't actually use nouveau,
so I blacklisted it.

Regards,

Nick.

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot backburner: slow progress, likely just affects one machine
> #regzbot poke
>
>
> On 02.06.23 02:57, Limonciello, Mario wrote:
> > [AMD Official Use Only - General]
> >
> >> -----Original Message-----
> >> From: Nick Hastings <nicholaschastings@xxxxxxxxx>
> >> Sent: Thursday, June 1, 2023 7:02 PM
> >> To: Karol Herbst <kherbst@xxxxxxxxxx>
> >> Cc: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; Lyude Paul
> >> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> regressions@xxxxxxxxxxxxxxx
> >> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of system)
> >>
> >> Hi,
> >>
> >> * Karol Herbst <kherbst@xxxxxxxxxx> [230602 03:10]:
> >>> On Thu, Jun 1, 2023 at 7:21 PM Limonciello, Mario
> >>> <Mario.Limonciello@xxxxxxx> wrote:
> >>>>> -----Original Message-----
> >>>>> From: Karol Herbst <kherbst@xxxxxxxxxx>
> >>>>> Sent: Thursday, June 1, 2023 12:19 PM
> >>>>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> >>>>> Cc: Nick Hastings <nicholaschastings@xxxxxxxxx>; Lyude Paul
> >>>>> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >>>>> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >>>>> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >>>>> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>>> regressions@xxxxxxxxxxxxxxx
> >>>>> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >>>>> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of
> >>>>> system)
> >>>>>
> >>>>> On Thu, Jun 1, 2023 at 6:54 PM Limonciello, Mario
> >>>>> <Mario.Limonciello@xxxxxxx> wrote:
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Karol Herbst <kherbst@xxxxxxxxxx>
> >>>>>>> Sent: Thursday, June 1, 2023 11:33 AM
> >>>>>>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> >>>>>>> Cc: Nick Hastings <nicholaschastings@xxxxxxxxx>; Lyude Paul
> >>>>>>> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >>>>>>> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >>>>>>> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >>>>>>> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>>>>> regressions@xxxxxxxxxxxxxxx
> >>>>>>> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >>>>>>> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of
> >>>>>>> system)
> >>>>>>>
> >>>>>>> On Thu, Jun 1, 2023 at 6:18 PM Limonciello, Mario
> >>>>>>> <Mario.Limonciello@xxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> Lyude, Lukas, Karol
> >>>>>>>>
> >>>>>>>> This thread is in relation to this commit:
> >>>>>>>>
> >>>>>>>> 24867516f06d ("ACPI: OSI: Remove Linux-Dell-Video _OSI string")
> >>>>>>>>
> >>>>>>>> Nick has found that runtime PM is *not* working for nouveau.
> >>>>>>>>
> >>>>>>>
> >>>>>>> keep in mind we have a list of PCIe controllers where we apply a
> >>>>>>> workaround:
> >>>>>>>
> >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/nouveau_drm.c?h=v6.4-rc4#n682
> >>>>>>>
> >>>>>>> And I suspect there might be one or two more IDs we'll have to add
> >>>>>>> there. Do we have any logs?
> >>>>>>
> >>>>>> There are some archived on the distro bug. Search this page for
> >>>>>> "journalctl.log.gz":
> >>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1036530
> >>>>>>
> >>>>>
> >>>>> Interesting. It seems to be the same controller used here. I wonder
> >>>>> whether the PCI topology is different or whether the workaround is
> >>>>> applied at all.
> >>>>
> >>>> I didn't see the message about the workaround being applied in that
> >>>> log, so I guess a PCI topology difference is a likely suspect.
> >>>>
> >>>
> >>> yeah, but I also couldn't see a log with the usual nouveau messages,
> >>> so it's kinda weird.
> >>>
> >>> Anyway, the output of `lspci -tvnn` would help
> >>
> >> % lspci -tvnn
> >> -[0000:00]-+-00.0 Intel Corporation Device [8086:3e20]
> >> +-01.0-[01]----00.0 NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] [10de:1f91]
> >
> > So the bridge it's connected to is the same one that the quirk *should have been* triggering on.
> >
> > May 29 15:02:42 xps kernel: pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
> >
> > Since the quirk isn't working and this is still a problem in 6.4-rc4, I suggest opening a
> > Nouveau drm bug to figure out why.
> >
> >> +-02.0 Intel Corporation CoffeeLake-H GT2 [UHD Graphics 630] [8086:3e9b]
> >> +-04.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903]
> >> +-08.0 Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
> >> +-12.0 Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379]
> >> +-14.0 Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d]
> >> +-14.2 Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f]
> >> +-15.0 Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 [8086:a368]
> >> +-15.1 Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 [8086:a369]
> >> +-16.0 Intel Corporation Cannon Lake PCH HECI Controller [8086:a360]
> >> +-17.0 Intel Corporation Cannon Lake Mobile PCH SATA AHCI Controller [8086:a353]
> >> +-1b.0-[02-3a]----00.0-[03-3a]--+-00.0-[04]----00.0 Intel Corporation JHL6340 Thunderbolt 3 NHI (C step) [Alpine Ridge 2C 2016] [8086:15d9]
> >> |                               +-01.0-[05-39]--
> >> |                               \-02.0-[3a]----00.0 Intel Corporation JHL6340 Thunderbolt 3 USB 3.1 Controller (C step) [Alpine Ridge 2C 2016] [8086:15db]
> >> +-1c.0-[3b]----00.0 Intel Corporation Wi-Fi 6 AX200 [8086:2723]
> >> +-1c.4-[3c]----00.0 Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a]
> >> +-1d.0-[3d]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
> >> +-1f.0 Intel Corporation Cannon Lake LPC Controller [8086:a30e]
> >> +-1f.3 Intel Corporation Cannon Lake PCH cAVS [8086:a348]
> >> +-1f.4 Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323]
> >> \-1f.5 Intel Corporation Cannon Lake PCH SPI Controller [8086:a324]
> >>
> >>
> >> Regards,
> >>
> >> Nick.
> >