disabling secondary CPU hangs / system fails to suspend with kernel 4.19+

From: Thomas Müller
Date: Thu Mar 14 2019 - 11:29:48 EST


Hi,

Starting with kernel 4.19, my Lenovo ThinkPad X1 Carbon (5th generation) no longer suspends properly.

This is 100% reproducible and git bisect points to the following commit:
> [be45bf5395e0886a93fc816bbe41a008ec2e42e2] watchdog/softlockup: Fix cpu_stop_queue_work() double-queue bug
> be45bf5395e0886a93fc816bbe41a008ec2e42e2 is the first bad commit
> commit be45bf5395e0886a93fc816bbe41a008ec2e42e2
> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Fri Jul 13 12:42:08 2018 +0200
>
> watchdog/softlockup: Fix cpu_stop_queue_work() double-queue bug
>
> When scheduling is delayed for longer than the softlockup interrupt
> period it is possible to double-queue the cpu_stop_work, causing list
> corruption.
>
> Cure this by adding a completion to track the cpu_stop_work's
> progress.
>
> Reported-by: kernel test robot <lkp@xxxxxxxxx>
> Tested-by: Rong Chen <rong.a.chen@xxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Fixes: 9cf57731b63e ("watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work")
> Link: http://lkml.kernel.org/r/20180713104208.GW2494@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>
> :040000 040000 6aca2dbb84bc33fe442b18b3d0a135c27adff7b9 2710af12d32e4b98df07768716689b213bce45fc M kernel

The Bugzilla reports have some additional details:
* https://bugzilla.redhat.com/show_bug.cgi?id=1671504
* https://bugzilla.kernel.org/show_bug.cgi?id=202679
* https://bugzilla.kernel.org/show_bug.cgi?id=202137

I'm happy to provide additional information or test a patch or two (as long as it doesn't
eat up my notebook ;))


Best regards
Thomas