Re: [CRASH][BISECTED] 6.4.1 crash in boot

From: Kees Cook
Date: Tue Jul 04 2023 - 22:10:11 EST


On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@xxxxxxxxxxxx> wrote:
>On 7/4/23 23:36, Kees Cook wrote:
>> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@xxxxxxxxxxxx> wrote:
>>> On 7/4/23 01:09, Kees Cook wrote:
>>>> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>>>>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>>>>> something very similar (though, again, I don't see why you're getting a
>>>>> _crash_, it should _warn_ and continue normally). And, actually, also
>>>>> include this patch:
>>>>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@xxxxxxxxxx/
>>>>
>>>> This is now in Linus's tree:
>>>> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
>>>>
>>>> Please also still try with the first patch I mentioned, which is very similar:
>>>> https://lore.kernel.org/lkml/20230629190900.never.787-kees@xxxxxxxxxx/
>>>
>>> Hi,
>>>
>>> I have finally built with both patches (and recommended PSTORE settings were
>>> default already).
>>
>> Were you able to find the crashes saved by pstore?
>
>No, only lkdtm and invalid opcode crashes ...
>
>P.S.
>
>Actually, I have recovered some pstore records. Please find them in the attachment:
>
>>> This second patch fixes the booting problem, but alas there is still a problem -
>>
>> Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.
>
>Glad I could be of help.
>
>>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
>>>
>>> Jul 4 19:09:07 defiant kernel: [ 40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>>
>> Hmm, is CONFIG_UBSAN_TRAP set?
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
>CONFIG_UBSAN_TRAP=y

Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>>> Jul 4 19:09:07 defiant kernel: [ 40.529726] RIP: 0010:alloc_pid+0x46c/0x480
>>
>> Hmm, is this patch in your kernel?
>> https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04
>
>No, it wasn't. I had only these:
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ more ../kees-[12].patch
>::::::::::::::
>../kees-1.patch
>::::::::::::::
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index b17e3a21b15f..82ec6af71a1d 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -376,7 +376,10 @@ struct usb_string_descriptor {
> __u8 bLength;
> __u8 bDescriptorType;
>- __le16 wData[1]; /* UTF-16LE encoded */
>+ union {
>+ __le16 legacy_padding;
>+ __DECLARE_FLEX_ARRAY(__le16, wData); /* UTF-16LE encoded */
>+ };
> } __attribute__ ((packed));
> /* note that "string" zero is special, it holds language codes that
>::::::::::::::
>../kees-2.patch
>::::::::::::::
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index b17e3a21b15f..3ff98c7ba7e3 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -981,7 +981,11 @@ struct usb_ssp_cap_descriptor {
> #define USB_SSP_MIN_RX_LANE_COUNT (0xf << 8)
> #define USB_SSP_MIN_TX_LANE_COUNT (0xf << 12)
> __le16 wReserved;
>- __le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>+ union {
>+ __le32 legacy_padding;
>+ /* list of sublink speed attrib entries */
>+ __DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>+ };
> #define USB_SSP_SUBLINK_SPEED_SSID (0xf) /* sublink speed ID */
> #define USB_SSP_SUBLINK_SPEED_LSE (0x3 << 4) /* Lanespeed exponent */
> #define USB_SSP_SUBLINK_SPEED_LSE_BPS 0
>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>---------------------------------------------------------
>
>Now it works. Boot succeeds and X apps run with a fresh git pull of the
>torvalds tree plus the kees-2.patch.

Perfect! Okay, so it looks like all the issues are known and fixed. I'll work with Greg to get the other ch9 patch landed.
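
For anyone following along, here is a minimal user-space sketch of the pattern both ch9 patches use. It assumes GCC or Clang, since it relies on GNU extensions (an empty struct, and a flexible array reached through an anonymous struct inside a union). The names old_desc, new_desc, empty_wData and desc_units are hypothetical, made up for illustration. The anonymous inner struct is hand-written to resemble what I understand __DECLARE_FLEX_ARRAY() to expand to; it is not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old layout: a 1-element array stands in for a variable-length trailer.
 * Bounds checking sees wData as exactly one element long. */
struct old_desc {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	uint16_t wData[1];
} __attribute__((packed));

/* New layout: the union member legacy_padding keeps sizeof() unchanged,
 * while wData becomes a true flexible array with no fixed bound. */
struct new_desc {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	union {
		uint16_t legacy_padding;
		struct {
			struct { } empty_wData;	/* placeholder so the flex array is not alone */
			uint16_t wData[];
		};
	};
} __attribute__((packed));

/* The size and the offset of wData must not change, or the UAPI would break. */
_Static_assert(sizeof(struct old_desc) == sizeof(struct new_desc),
	       "size must be preserved");
_Static_assert(offsetof(struct old_desc, wData) == offsetof(struct new_desc, wData),
	       "wData offset must be preserved");

/* Number of 16-bit units that follow the 2-byte header, derived from bLength. */
static inline size_t desc_units(const struct new_desc *d)
{
	if (d->bLength < offsetof(struct new_desc, wData))
		return 0;
	return (d->bLength - 2u) / sizeof(d->wData[0]);
}
```

The two static assertions are the point of the union: existing users that take sizeof() of the struct see no change, while indexing wData past element 0 no longer looks like an out-of-bounds access to the compiler.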

>
>Praise God!
>
>This is the git log --oneline:
>
>d528014517f2 (HEAD, origin/master, origin/HEAD) Revert ".gitignore: ignore *.cover and *.mbx"
>04f2933d375e Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue
>03275585cabd afs: Fix accidental truncation when storing data
>538140ca602b Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
>94c76955e86a Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
>ccf46d853183 Merge tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>b869e9f49964 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
>406fb9eb198a Merge tag 'firewire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
>f1962207150c module: fix init_module_from_file() error handling
>40c565a429d7 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
>f679e89acdd3 clk: tegra: Avoid calling an uninitialized function
>
>So, the included patch is:
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index 82ec6af71a1d..62d318377379 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -984,7 +984,11 @@ struct usb_ssp_cap_descriptor {
> #define USB_SSP_MIN_RX_LANE_COUNT (0xf << 8)
> #define USB_SSP_MIN_TX_LANE_COUNT (0xf << 12)
> __le16 wReserved;
>- __le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>+ union {
>+ __le32 legacy_padding;
>+ /* list of sublink speed attrib entries */
>+ __DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>+ };
> #define USB_SSP_SUBLINK_SPEED_SSID (0xf) /* sublink speed ID */
> #define USB_SSP_SUBLINK_SPEED_LSE (0x3 << 4) /* Lanespeed exponent */
> #define USB_SSP_SUBLINK_SPEED_LSE_BPS 0
>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>This means vanilla torvalds tree + https://lore.kernel.org/lkml/20230629190900.never.787-kees@xxxxxxxxxx/
>works, but vanilla torvalds tree w/o patch still crashes.

Great, thanks again for testing it all!

-Kees

>
>I am still rather new to the utilisation of the PSTORE subsystem.
>
>Best regards,
>Mirsad Todorovac

--
Kees Cook