Re: Blank screen on boot of Linux 6.5 and later on Lenovo ThinkPad L570

From: Huacai Chen
Date: Fri Nov 03 2023 - 02:37:03 EST


Hi, Evan,

On Fri, Nov 3, 2023 at 1:54 PM Evan Preston <x.arch@xxxxxxxxxxxx> wrote:
>
> Hi Huacai,
>
> On 2023-11-02 Thu 08:38pm, Huacai Chen wrote:
> > Hi, Jaak,
> >
> > On Wed, Nov 1, 2023 at 7:52 PM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:
> > >
> > > On 31.10.23 14:17, Huacai Chen wrote:
> > > > Hi, Jaak and Evan,
> > > >
> > > > On Sun, Oct 29, 2023 at 9:42 AM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > > >>
> > > >> On Sat, Oct 28, 2023 at 7:06 PM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:
> > > >>>
> > > >>> On 26.10.23 03:58, Huacai Chen wrote:
> > > >>>> Hi, Jaak,
> > > >>>>
> > > >>>> On Thu, Oct 26, 2023 at 2:49 AM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:
> > > >>>>>
> > > >>>>> On 25.10.23 16:23, Huacai Chen wrote:
> > > >>>>>> On Wed, Oct 25, 2023 at 6:08 PM Thorsten Leemhuis
> > > >>>>>> <regressions@xxxxxxxxxxxxx> wrote:
> > > >>>>>>>
> > > >>>>>>> Javier, Dave, Sima,
> > > >>>>>>>
> > > >>>>>>> On 23.10.23 00:54, Evan Preston wrote:
> > > >>>>>>>> On 2023-10-20 Fri 05:48pm, Huacai Chen wrote:
> > > >>>>>>>>> On Fri, Oct 20, 2023 at 5:35 PM Linux regression tracking (Thorsten
> > > >>>>>>>>> Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> > > >>>>>>>>>> On 09.10.23 10:54, Huacai Chen wrote:
> > > >>>>>>>>>>> On Mon, Oct 9, 2023 at 4:45 PM Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:
> > > >>>>>>>>>>>> On Mon, Oct 09, 2023 at 09:27:02AM +0800, Huacai Chen wrote:
> > > >>>>>>>>>>>>> On Tue, Sep 26, 2023 at 10:31 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
> > > >>>>>>>>>>>>>> On Tue, Sep 26, 2023 at 7:15 PM Linux regression tracking (Thorsten
> > > >>>>>>>>>>>>>> Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> > > >>>>>>>>>>>>>>> On 13.09.23 14:02, Jaak Ristioja wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Upgrading to Linux 6.5 on a Lenovo ThinkPad L570 (Integrated Intel HD
> > > >>>>>>>>>>>>>>>> Graphics 620 (rev 02), Intel(R) Core(TM) i7-7500U) results in a blank
> > > >>>>>>>>>>>>>>>> screen after boot until the display manager starts... if it does start
> > > >>>>>>>>>>>>>>>> at all. Using the nomodeset kernel parameter seems to be a workaround.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I've bisected this to commit 60aebc9559492cea6a9625f514a8041717e3a2e4
> > > >>>>>>>>>>>>>>>> ("drivers/firmware: Move sysfb_init() from device_initcall to
> > > >>>>>>>>>>>>>>>> subsys_initcall_sync").
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>> As confirmed by Jaak, disabling DRM_SIMPLEDRM makes things work fine
> > > >>>>>>>>>>>>> again. So I guess the reason:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Well, this to me still looks a lot (please correct me if I'm wrong) like
> > > > >>>>>>>>>> a regression that should be fixed, as DRM_SIMPLEDRM was enabled beforehand
> > > >>>>>>>>>> if I understood things correctly. Or is there a proper fix for this
> > > >>>>>>>>>> already in the works and I just missed this? Or is there some good
> > > >>>>>>>>>> reason why this won't/can't be fixed?
> > > >>>>>>>>>
> > > >>>>>>>>> DRM_SIMPLEDRM was enabled but it didn't work at all because there was
> > > >>>>>>>>> no corresponding platform device. Now DRM_SIMPLEDRM works but it has a
> > > >>>>>>>>> blank screen. Of course it is valuable to investigate further about
> > > >>>>>>>>> DRM_SIMPLEDRM on Jaak's machine, but that needs Jaak's effort because
> > > > >>>>>>>>> I don't have the same machine.
> > > >>>>>>>
> > > >>>>>>> Side note: Huacai, have you tried working with Jaak to get down to the
> > > >>>>>>> real problem? Evan, might you be able to help out here?
> > > > >>>>>> No, Jaak has not responded since he 'fixed' his problem by disabling SIMPLEDRM.
> > > >>>>>>
> > > >>>>>
> > > >>>>> I'm sorry, what was it exactly you want me to do? Please be mindful that
> > > >>>>> I'm not familiar with the internals of the Linux kernel and DRI, and it
> > > >>>>> might sometimes take weeks before I have time to work and respond on this.
> > > > >>>> It doesn't matter. I hope you can run some experiments to investigate
> > > > >>>> further. The first experiment is to enable SIMPLEFB (i.e.
> > > > >>>> CONFIG_FB_SIMPLE) instead of SIMPLEDRM (CONFIG_DRM_SIMPLEDRM) and see
> > > > >>>> whether the screen is still blank. If it is not blank, SIMPLEDRM
> > > > >>>> probably has a bug; if it is still blank, the firmware may be
> > > > >>>> passing wrong screen information.
> > > >>>
> > > >>> Testing with 6.5.9 I get a blank screen with CONFIG_DRM_SIMPLEDRM=y and
> > > >>> get no blank screen with CONFIG_FB_SIMPLE=y and CONFIG_DRM_SIMPLEDRM unset.
> > > >> CONFIG_FB_SIMPLE and CONFIG_DRM_SIMPLEDRM use the same device created
> > > >> by sysfb_init(). Since FB_SIMPLE works fine, I think the real problem
> > > >> is that DRM_SIMPLEDRM has a bug. The next step is to enable
> > > > >> CONFIG_DRM_SIMPLEDRM and trace its initialization. Specifically, add
> > > > >> some printk() calls to simpledrm_probe() and its sub-routines to see where
> > > > >> the driver fails. The output of these printk() calls can be seen with the
> > > > >> 'dmesg' command after boot.
> > > > I need your help. I tried with my laptop (ThinkPad E490, Intel Core
> > > > i3-8145U, UHD Graphics 620) but I can't reproduce your problem. So
> > > > please patch your 6.5.x kernel with this temporary patch [1], then
> > > > build a "bad kernel" with SIMPLEDRM enabled. And after booting your
> > > > machine with this "bad kernel", please give me the dmesg output. Thank
> > > > you very much.
> > > >
> > > > [1] http://ddns.miaomiaomiao.top:9000/download/kernel/patch-6.5.9
> > >
> > > I'm unable to download it. Can you please send it by e-mail?
> > I'm sorry, please download it from the attachment.
>
> When applying this patch the first hunk (drivers/firmware/sysfb.c) fails for
> me with 6.5.9. Attempting to load the 6.5.9 kernel without this patch
> produces no dmesg output on my machine.
Did you copy-paste the patch? If you download it directly, it should
apply successfully, I think.
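
A copy-pasted patch often fails to apply because the browser or mail
client turned tabs into spaces, and kernel C code is indented with tabs.
Below is a minimal Python sketch of a heuristic check for that kind of
damage. The function name suspicious_lines() and the indent_width
parameter are hypothetical, made up for this illustration; it is not part
of any kernel tooling and only flags hunk lines whose code begins with a
run of spaces.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,6 +10,8 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@")


def suspicious_lines(patch_text, indent_width=8):
    """Return (line_number, line) pairs for hunk lines that look tab-damaged.

    Heuristic only: a context, added, or removed line whose code starts
    with indent_width spaces is flagged, on the assumption that the
    patched file indents with tabs (as kernel C sources do).
    """
    hits = []
    in_hunk = False
    for number, line in enumerate(patch_text.splitlines(), start=1):
        if HUNK_HEADER.match(line):
            in_hunk = True
            continue
        # File headers end the current hunk. A removed line that itself
        # starts with "-- " would be misread here; acceptable for a sketch.
        if line.startswith(("diff ", "--- ", "+++ ", "index ")):
            in_hunk = False
            continue
        if not in_hunk or not line:
            continue
        marker, body = line[0], line[1:]
        if marker in " +-" and body.startswith(" " * indent_width):
            hits.append((number, line))
    return hits
```

Calling suspicious_lines(open(path).read()) on the saved patch file, where
path is wherever you stored it, returns an empty list when no line looks
damaged. Running "git apply --check" on the file in the kernel tree is a
more direct test of whether the patch applies.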

Huacai

>
> Evan
>
> >
> > Huacai
> >
> > >
> > > Jaak
> > >
> > > >
> > > >
> > > > Huacai
> > > >
> > > >>
> > > >> Huacai
> > > >>
> > > >>>
> > > >>> Jaak
> > > >>>
> > > >>>>
> > > >>>> Huacai
> > > >>>>
> > > >>>>>
> > > >>>>> Jaak
> > > >>>>>
> > > >>>>>>>
> > > >>>>>>> But I write this mail for a different reason:
> > > >>>>>>>
> > > >>>>>>>> I am having the same issue on a Lenovo Thinkpad P70 (Intel
> > > >>>>>>>> Corporation HD Graphics 530 (rev 06), Intel(R) Core(TM) i7-6700HQ).
> > > >>>>>>>> Upgrading from Linux 6.4.12 to 6.5 and later results in only a blank
> > > >>>>>>>> screen after boot and a rapidly flashing device-access-status
> > > >>>>>>>> indicator.
> > > >>>>>>>
> > > >>>>>>> This additional report makes me wonder if we should revert the culprit
> > > >>>>>>> (60aebc9559492c ("drivers/firmware: Move sysfb_init() from
> > > >>>>>>> device_initcall to subsys_initcall_sync") [v6.5-rc1]). But I guess that
> > > >>>>>>> might lead to regressions for some users? But the patch description says
> > > >>>>>>> that this is not a common configuration, so can we maybe get away with that?
> > > > >>>>>> From my point of view, this is not a regression: 60aebc9559492c
> > > > >>>>>> doesn't cause a problem, but exposes one. So we need to fix the
> > > > >>>>>> real problem (SIMPLEDRM has a blank screen under some conditions). This
> > > > >>>>>> needs Jaak's or Evan's help.
> > > >>>>>>
> > > >>>>>> Huacai
> > > >>>>>>>
> > > >>>>>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> > > >>>>>>> --
> > > >>>>>>> Everything you wanna know about Linux kernel regression tracking:
> > > >>>>>>> https://linux-regtracking.leemhuis.info/about/#tldr
> > > >>>>>>> If I did something stupid, please tell me, as explained on that page.
> > > >>>>>>>
> > > > >>>>>>>>>>>>> When SIMPLEDRM takes over the framebuffer, the screen is blank (I don't
> > > > >>>>>>>>>>>>> know why). And before 60aebc9559492cea6a9625f ("drivers/firmware: Move
> > > > >>>>>>>>>>>>> sysfb_init() from device_initcall to subsys_initcall_sync") there was
> > > > >>>>>>>>>>>>> no platform device created for SIMPLEDRM at an early stage, so there
> > > > >>>>>>>>>>>>> also seemed to be "no problem".
> > > > >>>>>>>>>>>> I don't understand the above. You mean that after that commit there is
> > > > >>>>>>>>>>>> also no platform device, right?
> > > > >>>>>>>>>>> No. The SIMPLEDRM driver needs a platform device to work, and that
> > > > >>>>>>>>>>> commit creates the platform device earlier. So, before that
> > > > >>>>>>>>>>> commit, SIMPLEDRM doesn't work, but the screen isn't blank; after that
> > > > >>>>>>>>>>> commit, SIMPLEDRM works, but the screen is blank.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Huacai
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Confused...
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> --
> > > >>>>>>>>>>>> An old man doll... just what I always wanted! - Clara
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>
> > > >>>
> > >
>
>