Re: [PATCH] mce: fix missing stack-dumping in mce_panic()

From: Miaohe Lin
Date: Sat Dec 03 2022 - 01:49:49 EST


On 2022/12/3 12:22, Tony Luck wrote:
> On Sat, Dec 03, 2022 at 10:19:32AM +0800, Miaohe Lin wrote:
>> So I think it's better to have at least one stack dump. Also, what commit
>> 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") and commit
>> 026ee1f66aaa ("panic: fix stack dump print on direct call to panic()") aim
>> to do is avoid nested stack dumps that scroll the original oops data off
>> an 80x50 screen, while still printing *at least one backtrace*. So this
>> patch acts more like a bugfix that ensures mce_panic() prints at least one
>> backtrace. What are your thoughts, Luck?
>
> I tried out your patch with the ras-tools test:

ras-tools is really convenient. :)

>
> # ./einj_mem_uc -f copyout
>
> which currently causes a panic from the "recoverable" machine check.
>
> Your patch worked fine:
>
> [ 112.457735] stack backtrace:
> [ 112.457736] CPU: 124 PID: 3401 Comm: einj_mem_uc Not tainted 6.1.0-rc7+ #41
> [ 112.457738] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYDCRB1.86B.0154.R04.1804231104 04/23/2018
> [ 112.457739] Call Trace:
> [ 112.457740] <#MC>
> [ 112.457742] dump_stack_lvl+0x5a/0x78
> [ 112.457746] dump_stack+0x10/0x16
> [ 112.457748] print_usage_bug.part.0+0x1ad/0x1c4
> [ 112.457755] lock_acquire.cold+0x16/0x47
> [ 112.457759] ? down_trylock+0x14/0x40
> [ 112.457762] ? panic+0x180/0x2b9
> [ 112.457766] _raw_spin_lock_irqsave+0x4e/0x70
> [ 112.457768] ? down_trylock+0x14/0x40
> [ 112.457771] down_trylock+0x14/0x40
> [ 112.457772] ? panic+0x180/0x2b9
> [ 112.457775] __down_trylock_console_sem+0x34/0xc0
> [ 112.457778] console_unblank+0x1d/0x90
> [ 112.457781] panic+0x180/0x2b9
> [ 112.457788] mce_panic+0x118/0x1e0
> [ 112.457794] do_machine_check+0x79a/0x890
> [ 112.457800] ? copy_user_enhanced_fast_string+0xa/0x50
> [ 112.457810] exc_machine_check+0x76/0xb0
> [ 112.457813] asm_exc_machine_check+0x1a/0x40
> [ 112.457816] RIP: 0010:copy_user_enhanced_fast_string+0xa/0x50
> [ 112.457819] Code: d1 f3 a4 31 c0 0f 01 ca c3 cc cc cc cc 8d 0c ca 89 ca eb 2c 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 01 cb 83 fa 40 72 48 89 d1 <f3> a4 31 c0 0f 01 ca c3 cc cc cc cc 89 ca eb 06 66 0f 1f 44 00 00
> [ 112.457820] RSP: 0018:ffffb140f789bbd8 EFLAGS: 00050206
> [ 112.457822] RAX: 0000000000001000 RBX: ffffb140f789be58 RCX: 0000000000000c00
> [ 112.457824] RDX: 0000000000001000 RSI: ffff9133eb3f0400 RDI: 000055c1a36986c0
> [ 112.457825] RBP: ffffb140f789bc68 R08: 00000000f789be01 R09: 0000000000001000
> [ 112.457827] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000001000
> [ 112.457828] R13: ffff9133eb3f0000 R14: 0000000000001000 R15: 0000000000000000
> [ 112.457837] </#MC>
> [ 112.457838] <TASK>
> [ 112.457838] ? _copy_to_iter+0xc3/0x6f0
> [ 112.457843] ? filemap_get_pages+0x9b/0x670
> [ 112.457851] copy_page_to_iter+0x7c/0x1f0
> [ 112.457854] ? find_held_lock+0x31/0x90
> [ 112.457858] filemap_read+0x1ec/0x390
> [ 112.457865] ? __fsnotify_parent+0x10f/0x310
> [ 112.457867] ? aa_file_perm+0x1ab/0x610
> [ 112.457875] generic_file_read_iter+0xf4/0x170
> [ 112.457879] ext4_file_read_iter+0x5b/0x1e0
> [ 112.457881] ? security_file_permission+0x4e/0x60
> [ 112.457886] vfs_read+0x208/0x2e0
> [ 112.457895] ksys_read+0x6d/0xf0
> [ 112.457900] __x64_sys_read+0x19/0x20
> [ 112.457902] do_syscall_64+0x38/0x90
> [ 112.457906] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
>
> Tested-by: Tony Luck <tony.luck@xxxxxxxxx>
>
> -Tony

Many thanks for testing, Tony!

Thanks,
Miaohe Lin