Re: Blank screen on boot of Linux 6.5 and later on Lenovo ThinkPad L570

From: Jaak Ristioja
Date: Sun Nov 05 2023 - 11:32:09 EST


On 05.11.23 14:40, Huacai Chen wrote:
Hi, Evan,

On Sat, Nov 4, 2023 at 10:50 AM Evan Preston <x.arch@xxxxxxxxxxxx> wrote:

Hi Huacai,

On 2023-11-03 Fri 02:36pm, Huacai Chen wrote:
Hi, Evan,

On Fri, Nov 3, 2023 at 1:54 PM Evan Preston <x.arch@xxxxxxxxxxxx> wrote:

Hi Huacai,

On 2023-11-02 Thu 08:38pm, Huacai Chen wrote:
Hi, Jaak,

On Wed, Nov 1, 2023 at 7:52 PM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:

On 31.10.23 14:17, Huacai Chen wrote:
Hi, Jaak and Evan,

On Sun, Oct 29, 2023 at 9:42 AM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:

On Sat, Oct 28, 2023 at 7:06 PM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:

On 26.10.23 03:58, Huacai Chen wrote:
Hi, Jaak,

On Thu, Oct 26, 2023 at 2:49 AM Jaak Ristioja <jaak@xxxxxxxxxxx> wrote:

On 25.10.23 16:23, Huacai Chen wrote:
On Wed, Oct 25, 2023 at 6:08 PM Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx> wrote:

Javier, Dave, Sima,

On 23.10.23 00:54, Evan Preston wrote:
On 2023-10-20 Fri 05:48pm, Huacai Chen wrote:
On Fri, Oct 20, 2023 at 5:35 PM Linux regression tracking (Thorsten
Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
On 09.10.23 10:54, Huacai Chen wrote:
On Mon, Oct 9, 2023 at 4:45 PM Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:
On Mon, Oct 09, 2023 at 09:27:02AM +0800, Huacai Chen wrote:
On Tue, Sep 26, 2023 at 10:31 PM Huacai Chen <chenhuacai@xxxxxxxxxx> wrote:
On Tue, Sep 26, 2023 at 7:15 PM Linux regression tracking (Thorsten
Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
On 13.09.23 14:02, Jaak Ristioja wrote:

Upgrading to Linux 6.5 on a Lenovo ThinkPad L570 (Integrated Intel HD
Graphics 620 (rev 02), Intel(R) Core(TM) i7-7500U) results in a blank
screen after boot until the display manager starts... if it does start
at all. Using the nomodeset kernel parameter seems to be a workaround.

I've bisected this to commit 60aebc9559492cea6a9625f514a8041717e3a2e4
("drivers/firmware: Move sysfb_init() from device_initcall to
subsys_initcall_sync").

As confirmed by Jaak, disabling DRM_SIMPLEDRM makes things work fine
again. So I guess the reason:

Well, this to me still looks a lot (please correct me if I'm wrong) like
a regression that should be fixed, as DRM_SIMPLEDRM was enabled beforehand
if I understood things correctly. Or is there a proper fix for this
already in the works and I just missed this? Or is there some good
reason why this won't/can't be fixed?

DRM_SIMPLEDRM was enabled but it didn't work at all because there was
no corresponding platform device. Now DRM_SIMPLEDRM works but it has a
blank screen. Of course it would be valuable to investigate
DRM_SIMPLEDRM further on Jaak's machine, but that needs Jaak's effort
because I don't have the same machine.

Side note: Huacai, have you tried working with Jaak to get down to the
real problem? Evan, might you be able to help out here?
No, Jaak has not responded since he 'fixed' his problem by disabling SIMPLEDRM.


I'm sorry, what was it exactly you want me to do? Please be mindful that
I'm not familiar with the internals of the Linux kernel and DRI, and it
might sometimes take weeks before I have time to work and respond on this.
It doesn't matter. I hope you can run some experiments to investigate
further. The first experiment is to enable SIMPLEFB (i.e.
CONFIG_FB_SIMPLE) instead of SIMPLEDRM (CONFIG_DRM_SIMPLEDRM) and see
whether the screen is still blank. If it is not blank, that probably
means SIMPLEDRM has a bug; if it is still blank, the firmware may be
passing wrong screen information.
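
To confirm which of the two experiments a given kernel build corresponds
to, the build's .config can be checked for both symbols. The following is
only a minimal sketch: the function names are made up for illustration,
and it assumes the usual .config syntax ("CONFIG_FOO=y" for enabled
symbols, "# CONFIG_FOO is not set" for disabled ones).

```python
import re


def read_kconfig_options(config_text, names):
    """Return {name: value} for the requested CONFIG_* symbols.

    The value is the literal right-hand side ('y', 'm', ...) or None when
    the symbol is commented out ("is not set") or absent from the text.
    """
    states = {name: None for name in names}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
    return states


def describe_experiment(states):
    """Label a config as the SIMPLEDRM build, the SIMPLEFB build, or neither."""
    if states.get("CONFIG_DRM_SIMPLEDRM") in ("y", "m"):
        return "simpledrm"
    if states.get("CONFIG_FB_SIMPLE") in ("y", "m"):
        return "simplefb"
    return "neither"


if __name__ == "__main__":
    # Hypothetical fragment standing in for a real .config file.
    sample = "CONFIG_FB_SIMPLE=y\n# CONFIG_DRM_SIMPLEDRM is not set\n"
    wanted = ["CONFIG_FB_SIMPLE", "CONFIG_DRM_SIMPLEDRM"]
    print(describe_experiment(read_kconfig_options(sample, wanted)))
```

In practice the text would be read from the .config in the kernel build
directory rather than from an inline string.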

Testing with 6.5.9 I get a blank screen with CONFIG_DRM_SIMPLEDRM=y and
get no blank screen with CONFIG_FB_SIMPLE=y and CONFIG_DRM_SIMPLEDRM unset.
CONFIG_FB_SIMPLE and CONFIG_DRM_SIMPLEDRM use the same device created
by sysfb_init(). Since FB_SIMPLE works fine, I think the real problem
is that DRM_SIMPLEDRM has a bug. The next step is to enable
CONFIG_DRM_SIMPLEDRM and trace its initialization. Specifically, add
some printk() calls to simpledrm_probe() and its subroutines to see
where the driver fails. The output of these printk() calls can be read
with the 'dmesg' command after boot.
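
Once such trace lines are in the log, they can be picked out of a saved
dmesg dump mechanically. The sketch below is only an illustration: the
marker texts it looks for ("sysfb N", "T: probe N", "T: create N") are
assumed from the dmesg excerpt quoted further down in this thread, not
from the patch itself, and the function name is made up.

```python
import re

# "[    4.144603] message" -> timestamp and message text.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Assumed marker formats, taken from the dmesg excerpt in this thread.
MARKER_RE = re.compile(r"^(?:T: (?:probe|create) \S+|sysfb \d+)$")


def trace_markers(dmesg_text):
    """Return [(timestamp, marker)] for every trace marker, in log order."""
    found = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and MARKER_RE.match(match.group(2)):
            found.append((float(match.group(1)), match.group(2)))
    return found


if __name__ == "__main__":
    import sys

    # Usage: dmesg > dmesg.txt; python3 this_script.py dmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        markers = trace_markers(handle.read())
    for stamp, marker in markers:
        print(f"{stamp:12.6f}  {marker}")
    if markers:
        print("last marker reached:", markers[-1][1])
```

The last marker printed shows how far the probe path got before it
stopped or returned.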
I need your help. I tried with my laptop (ThinkPad E490, Intel Core
i3-8145U, UHD Graphics 620) but I can't reproduce your problem. So
please patch your 6.5.x kernel with this temporary patch [1], then
build a "bad kernel" with SIMPLEDRM enabled. And after booting your
machine with this "bad kernel", please give me the dmesg output. Thank
you very much.

[1] http://ddns.miaomiaomiao.top:9000/download/kernel/patch-6.5.9

I'm unable to download it. Can you please send it by e-mail?
I'm sorry, please download it from the attachment.

When applying this patch the first hunk (drivers/firmware/sysfb.c) fails for
me with 6.5.9. Attempting to load the 6.5.9 kernel without this patch
produces no dmesg output on my machine.
Did you copy-paste the patch? If you download it directly, it should
apply successfully, I think.

The patch downloaded from your URL applies successfully. However, I still
see no dmesg output using the patched 6.5.9 kernel. 'journalctl -k -b all'
shows no dmesg output from any 6.5.x boots, only from 6.4.12 boots.
Thank you for your testing. Since you cannot boot to the GUI
successfully as Jaak can, you may have trouble getting the dmesg output.
But you can try the "systemd.unit=multi-user.target" boot parameter.
That way you may be able to boot to the login: prompt and get the dmesg
output there. Or, if that still fails, you can boot 6.4.12 and use
'journalctl -k -b -1' to get the dmesg output of the previous boot.

Hi, Jaak,

Have you tested? I think you can successfully get a dmesg output with my patch.

Yes, I just tested it. Here are what I think are the relevant parts of the dmesg output produced with CONFIG_DRM_SIMPLEDRM enabled and the patch provided by Huacai:

...
[ 2.909625] sysfb 1
[ 2.909627] sysfb 2
...
[ 2.951477] ACPI: bus type drm_connector registered
[ 2.952096] i915 0000:00:02.0: [drm] VT-d active for gfx access
[ 2.952105] resource: resource sanity check: requesting [mem 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB [mem 0xe0000000-0xe012bfff]
[ 2.952111] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs
[ 2.952138] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[ 2.953204] Loading firmware: i915/kbl_dmc_ver1_04.bin
[ 2.953485] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
...
[ 4.142075] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 4.144269] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no)
[ 4.144414] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
[ 4.144580] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1
[ 4.144590] usbcore: registered new interface driver udl
[ 4.144603] T: probe 1
[ 4.144605] T: create 1
[ 4.144610] T: create 2
[ 4.144611] T: create 3a-1
[ 4.144613] T: create 3a-2
[ 4.144614] T: create 3a-3
[ 4.144616] T: create 3a-4
[ 4.144618] T: create 4
[ 4.144619] T: create 5
[ 4.144621] simple-framebuffer simple-framebuffer.0: [drm] display mode={"": 60 18432 640 640 640 640 480 480 480 480 0x40 0x0}
[ 4.144628] simple-framebuffer simple-framebuffer.0: [drm] framebuffer format=XR24 little-endian (0x34325258), size=640x480, stride=2560 byte
[ 4.144633] T: create 6b-1
[ 4.144635] T: create 6b-2
[ 4.144637] simple-framebuffer simple-framebuffer.0: [drm] using I/O memory framebuffer at [mem 0xe0000000-0xe012bfff flags 0x200]
[ 4.144643] T: create 6b-3
[ 4.144660] T: create 6b-4
[ 4.144662] T: create 7
[ 4.144673] T: create 8
[ 4.144676] T: create 9
[ 4.144678] T: create 10
[ 4.144681] T: create 11
[ 4.144685] T: create 12
[ 4.144689] T: probe 2
[ 4.144728] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 2
[ 4.144732] T: probe 3
[ 4.145905] Console: switching to colour frame buffer device 80x30
[ 4.150437] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
[ 4.150766] T: probe 4
[ 4.151218] loop: module loaded
[ 4.154434] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device
...
[ 44.630789] simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 1310720 bytes), total 32768 (slots), used 0 (slots)
...

The last message might be due to the display manager starting up.
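
The geometry reported by simpledrm in the log above can be cross-checked
against the BOOTFB resource range with a little arithmetic. This is only
a sanity-check sketch: the helper names are made up, and it assumes 4
bytes per pixel for the XR24 (XRGB8888) format.

```python
def stride_for(width, bytes_per_pixel):
    """Bytes per scanline for a tightly packed framebuffer."""
    return width * bytes_per_pixel


def framebuffer_bytes(height, stride):
    """Total bytes covered by height scanlines of the given stride."""
    return height * stride


def resource_length(start, end):
    """Length of an inclusive [start, end] memory resource."""
    return end - start + 1


def contains(outer, inner):
    """True if the inclusive range inner lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    bootfb = (0xE0000000, 0xE012BFFF)      # simpledrm / BOOTFB range
    i915_request = (0xE0000000, 0xEFFFFFFF)  # range i915 asked for

    print(stride_for(640, 4))                  # reported stride: 2560
    print(hex(framebuffer_bytes(480, 2560)))   # bytes needed for 640x480
    print(hex(resource_length(*bootfb)))       # bytes in the BOOTFB range
    print(contains(i915_request, bootfb))      # i915 range covers BOOTFB
```

The 640x480 mode at a stride of 2560 bytes needs exactly the 0x12c000
bytes of the BOOTFB range, and the range requested by i915 fully covers
it, which matches the "spans more than BOOTFB" resource sanity check
message.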

Hope it helps.

J



Evan


Huacai


Evan


Huacai


Jaak



Huacai


Huacai


Jaak


Huacai


Jaak


But I write this mail for a different reason:

I am having the same issue on a Lenovo Thinkpad P70 (Intel
Corporation HD Graphics 530 (rev 06), Intel(R) Core(TM) i7-6700HQ).
Upgrading from Linux 6.4.12 to 6.5 and later results in only a blank
screen after boot and a rapidly flashing device-access-status
indicator.

This additional report makes me wonder if we should revert the culprit
(60aebc9559492c ("drivers/firmware: Move sysfb_init() from
device_initcall to subsys_initcall_sync") [v6.5-rc1]). But I guess that
might lead to regressions for some users? But the patch description says
that this is not a common configuration, so can we maybe get away with that?
From my point of view, this is not a regression: 60aebc9559492c
doesn't cause a problem but exposes one. So we need to fix the real
problem (SIMPLEDRM shows a blank screen under some conditions). This
needs Jaak's or Evan's help.

Huacai

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

When SIMPLEDRM takes over the framebuffer, the screen is blank (I don't
know why). And before 60aebc9559492cea6a9625f ("drivers/firmware: Move
sysfb_init() from device_initcall to subsys_initcall_sync") there was
no platform device created for SIMPLEDRM at an early stage, so there
also seemed to be "no problem".
I don't understand the above. You mean that after that commit the
platform device is also missing, right?
No. The SIMPLEDRM driver needs a platform device to work, and that
commit causes the platform device to be created earlier. So, before that
commit, SIMPLEDRM didn't work, but the screen wasn't blank; after that
commit, SIMPLEDRM works, but the screen is blank.

Huacai

Confused...

--
An old man doll... just what I always wanted! - Clara