Re: [CRASH][BISECTED] 6.4.1 crash in boot

From: Mirsad Todorovac
Date: Wed Jul 05 2023 - 01:19:17 EST


On 7/5/23 04:09, Kees Cook wrote:
On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@xxxxxxxxxxxx> wrote:
On 7/4/23 23:36, Kees Cook wrote:
On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@xxxxxxxxxxxx> wrote:
On 7/4/23 01:09, Kees Cook wrote:
On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
Cool. xhci-hub is in your backtrace, and the above patch was made for
something very similar (though, again, I don't see why you're getting a
_crash_, it should _warn_ and continue normally). And, actually, also
include this patch:
https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@xxxxxxxxxx/

This is now in Linus's tree:
09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")

Please also still try with the first patch I mentioned, which is very similar:
https://lore.kernel.org/lkml/20230629190900.never.787-kees@xxxxxxxxxx/

Hi,

I have finally built with both patches (the recommended PSTORE settings were
already the default).

Were you able to find the crashes saved by pstore?

No, only lkdtm and invalid opcode crashes ...

P.S.

Actually, I have recovered some pstore records. Please find them in the attachment:

This second patch fixes the booting problem, but alas there is still a problem -

Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.

Glad I could be of help.

all Wayland and X11.org GUI applications fail to start, with errors like this one:

Jul 4 19:09:07 defiant kernel: [ 40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI

Hmm, is CONFIG_UBSAN_TRAP set?

marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
CONFIG_UBSAN_TRAP=y
Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

Will do that. Thanks for the hint.
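
For readers following along, here is a rough user-space analogy of why CONFIG_UBSAN_TRAP=y matters. As far as I understand it, with trapping enabled a failed sanitizer check executes a trap instruction (which shows up as "invalid opcode" on x86) instead of printing a report and continuing. This is only a sketch, not kernel code: demo_checked_read(), demo_mode and demo_reports are hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical modes standing in for CONFIG_UBSAN_TRAP=n / =y. */
enum demo_mode { DEMO_REPORT, DEMO_TRAP };

/* Counts how many out-of-range accesses were reported (report mode only). */
static int demo_reports;

/*
 * Read arr[idx] while checking idx against the *declared* array length,
 * the way a bounds sanitizer checks against the declared type. The caller
 * must guarantee the backing storage really covers idx.
 */
static int demo_checked_read(const int *arr, size_t declared_len,
			     size_t idx, enum demo_mode mode)
{
	if (idx >= declared_len) {
		if (mode == DEMO_TRAP)
			__builtin_trap();	/* dies here with no explanation */
		fprintf(stderr,
			"demo: index %zu is out of range for type 'int [%zu]'\n",
			idx, declared_len);
		demo_reports++;
	}
	return arr[idx];
}
```

In report mode the access is logged and execution continues, which is the behaviour expected from a warning. In trap mode the same access kills the task on the spot, which matches the invalid opcode oops above.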

marvin@defiant:~/linux/kernel/linux_torvalds$

Jul 4 19:09:07 defiant kernel: [ 40.529726] RIP: 0010:alloc_pid+0x46c/0x480

Hmm, is this patch in your kernel?
https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04

No, it wasn't. I had only these:

marvin@defiant:~/linux/kernel/linux_torvalds$ more ../kees-[12].patch
::::::::::::::
../kees-1.patch
::::::::::::::
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index b17e3a21b15f..82ec6af71a1d 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -376,7 +376,10 @@ struct usb_string_descriptor {
 	__u8 bLength;
 	__u8 bDescriptorType;
 
-	__le16 wData[1]; /* UTF-16LE encoded */
+	union {
+		__le16 legacy_padding;
+		__DECLARE_FLEX_ARRAY(__le16, wData); /* UTF-16LE encoded */
+	};
 } __attribute__ ((packed));
 
 /* note that "string" zero is special, it holds language codes that
::::::::::::::
../kees-2.patch
::::::::::::::
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index b17e3a21b15f..3ff98c7ba7e3 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -981,7 +981,11 @@ struct usb_ssp_cap_descriptor {
 #define USB_SSP_MIN_RX_LANE_COUNT (0xf << 8)
 #define USB_SSP_MIN_TX_LANE_COUNT (0xf << 12)
 	__le16 wReserved;
-	__le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
+	union {
+		__le32 legacy_padding;
+		/* list of sublink speed attrib entries */
+		__DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
+	};
 #define USB_SSP_SUBLINK_SPEED_SSID (0xf) /* sublink speed ID */
 #define USB_SSP_SUBLINK_SPEED_LSE (0x3 << 4) /* Lanespeed exponent */
 #define USB_SSP_SUBLINK_SPEED_LSE_BPS 0
marvin@defiant:~/linux/kernel/linux_torvalds$
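
As a side note on what both patches do: they swap a one-element trailing array for a flexible array wrapped in a union with a same-sized padding member, so the structure keeps its size while the array no longer claims to hold exactly one element. The following user-space sketch illustrates that under some assumptions: DEMO_FLEX_ARRAY is a stand-in modelled on my understanding of the kernel's __DECLARE_FLEX_ARRAY, it relies on GNU C extensions (gcc or clang), and all demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the kernel macro (an assumption, GNU C only): the empty
 * struct lets a flexible array live inside an anonymous struct, which
 * in turn may be a union member.
 */
#define DEMO_FLEX_ARRAY(type, name) \
	struct { struct { } __empty_##name; type name[]; }

/* Layout before the patch: trailing one-element array. */
struct demo_old_desc {
	uint8_t bLength;
	uint8_t bDescriptorType;
	uint16_t wData[1];
} __attribute__((packed));

/* Layout after the patch: flexible array plus padding of the old size. */
struct demo_new_desc {
	uint8_t bLength;
	uint8_t bDescriptorType;
	union {
		uint16_t legacy_padding;
		DEMO_FLEX_ARRAY(uint16_t, wData);
	};
} __attribute__((packed));

/* The conversion is meant to leave the size unchanged. */
_Static_assert(sizeof(struct demo_old_desc) == sizeof(struct demo_new_desc),
	       "union with legacy_padding must preserve the struct size");

static size_t demo_old_size(void) { return sizeof(struct demo_old_desc); }
static size_t demo_new_size(void) { return sizeof(struct demo_new_desc); }
static size_t demo_wdata_offset(void)
{
	return offsetof(struct demo_new_desc, wData);
}
```

Because the size and the offset of wData stay the same, existing users of sizeof() on the structure should see no difference, while indexing wData past element 0 is no longer an out-of-bounds access on a declared one-element array.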

---------------------------------------------------------

Now it works. The system boots and X apps run with the freshly pulled
torvalds tree plus kees-2.patch.

Perfect! Okay, so it looks like all the issues are known and fixed. I'll work with Greg to get the other ch9 patch landed.

Yes, maybe it should be tested more widely first. It was a non-obvious bug and
I couldn't see what went wrong ...

Praise God!

This is the git log --oneline:

d528014517f2 (HEAD, origin/master, origin/HEAD) Revert ".gitignore: ignore *.cover and *.mbx"
04f2933d375e Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue
03275585cabd afs: Fix accidental truncation when storing data
538140ca602b Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
94c76955e86a Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
ccf46d853183 Merge tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
b869e9f49964 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
406fb9eb198a Merge tag 'firewire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
f1962207150c module: fix init_module_from_file() error handling
40c565a429d7 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
f679e89acdd3 clk: tegra: Avoid calling an uninitialized function

So, the included patch is:

marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index 82ec6af71a1d..62d318377379 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -984,7 +984,11 @@ struct usb_ssp_cap_descriptor {
 #define USB_SSP_MIN_RX_LANE_COUNT (0xf << 8)
 #define USB_SSP_MIN_TX_LANE_COUNT (0xf << 12)
 	__le16 wReserved;
-	__le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
+	union {
+		__le32 legacy_padding;
+		/* list of sublink speed attrib entries */
+		__DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
+	};
 #define USB_SSP_SUBLINK_SPEED_SSID (0xf) /* sublink speed ID */
 #define USB_SSP_SUBLINK_SPEED_LSE (0x3 << 4) /* Lanespeed exponent */
 #define USB_SSP_SUBLINK_SPEED_LSE_BPS 0
marvin@defiant:~/linux/kernel/linux_torvalds$

This means the vanilla torvalds tree + https://lore.kernel.org/lkml/20230629190900.never.787-kees@xxxxxxxxxx/
works, but the vanilla torvalds tree without the patch still crashes.

Great, thanks again for testing it all!

Not at all, I'm glad I could be of assistance.

Best regards,
Mirsad Todorovac

-Kees


I am still rather new to the utilisation of the PSTORE subsystem.

Best regards,
Mirsad Todorovac