Re: simple repro case

From: Steven Rostedt
Date: Wed Jun 21 2017 - 10:25:05 EST


On Sat, 17 Jun 2017 08:11:49 -0400
Tejun Heo <tj@xxxxxxxxxx> wrote:

> Here's a simple rerpo. The test code runs whenever a CPU goes
> off/online. The test kthread is created on a different CPU and
> migrated to the target CPU while running. Without the previous patch
> applied, the kthread ends up running on the wrong CPU.
>

Hmm, I'm not able to trigger the warn_on, with this patch applied.

Adding a trace_printk("here!\n") just above the warn_on in
wq_worker_sleeping(), and doing the following:

# cd /sys/kernel/debug/tracing
# echo 1 > events/printk/console/enable
# echo 1 > events/sched/sched_migrate_task/enable
# echo 0 > /sys/devices/system/cpu/cpu2/online
# echo 1 > /sys/devices/system/cpu/cpu2/online
# cat trace
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d.s4 744.572889: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=0
kworker/0:2-406 [000] d..2 744.750944: wq_worker_sleeping: here
kworker/u8:3-111 [001] d..3 746.029980: sched_migrate_task: comm=bash pid=2390 prio=120 orig_cpu=1 dest_cpu=2
kworker/0:2-406 [000] d..2 746.798796: wq_worker_sleeping: here
<idle>-0 [001] d.s4 747.164692: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=1
<idle>-0 [000] d.s4 747.632662: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=0
kworker/0:2-406 [000] d..2 748.846646: wq_worker_sleeping: here
kworker/0:2-406 [000] d..2 750.894525: wq_worker_sleeping: here
<idle>-0 [002] d.s4 751.188404: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=2
kworker/2:3-118 [002] d..2 751.194212: wq_worker_sleeping: here
cpuhp/2-20 [002] d..1 751.204894: console: [ 751.018261] TEST: cpu 2 inactive, starting on 0 and migrating (active/online=0-1,3/0-3)
test-2633 [000] d.s3 751.216796: sched_migrate_task: comm=systemd-journal pid=1839 prio=120 orig_cpu=0 dest_cpu=1
systemd-journal-1839 [001] d..4 751.216851: sched_migrate_task: comm=jbd2/dm-0-8 pid=1754 prio=120 orig_cpu=0 dest_cpu=3
cpuhp/2-20 [002] d..1 751.318375: console: [ 751.131745] TEST: test_last_cpu=0 cpus_allowed=0
cpuhp/2-20 [002] d..1 751.324249: console: [ 751.137621] TEST: migrating to inactve cpu 2
cpuhp/2-20 [002] d..1 751.438368: console: [ 751.251738] TEST: test_last_cpu=0 cpus_allowed=2
bash-2390 [000] d..1 751.445862: console: [ 751.259232] smpboot: CPU 2 is now offline
<idle>-0 [000] d.h5 751.452370: sched_migrate_task: comm=systemd-journal pid=1839 prio=120 orig_cpu=1 dest_cpu=0
<idle>-0 [000] dNs3 751.452392: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=2 dest_cpu=0
bash-2390 [000] d.s2 751.464359: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=1
bash-2390 [000] d..2 751.466028: sched_migrate_task: comm=kworker/2:3 pid=118 prio=120 orig_cpu=2 dest_cpu=0
bash-2390 [000] d..4 751.466236: sched_migrate_task: comm=libvirtd pid=2197 prio=120 orig_cpu=2 dest_cpu=0
bash-2390 [000] d..3 751.466479: sched_migrate_task: comm=kworker/u8:5 pid=113 prio=120 orig_cpu=3 dest_cpu=1
kworker/u8:5-113 [001] d..3 751.466519: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=0 dest_cpu=1
kworker/u8:5-113 [001] d..3 751.466593: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=1 dest_cpu=0
systemd-udevd-2634 [001] d.s3 751.468358: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=3
<idle>-0 [000] d.s4 751.472372: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=3 dest_cpu=0
systemd-journal-1839 [000] d..3 751.530744: sched_migrate_task: comm=in:imjournal pid=2152 prio=120 orig_cpu=2 dest_cpu=0
in:imjournal-2152 [000] d..2 751.530859: sched_migrate_task: comm=rs:main Q:Reg pid=2360 prio=120 orig_cpu=1 dest_cpu=3
in:imjournal-2152 [000] d.s3 751.531370: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=1
sendmail-2334 [001] d.s3 752.169327: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=3
kworker/0:2-406 [000] d..2 752.942348: wq_worker_sleeping: here
<idle>-0 [000] d.s3 753.582220: sched_migrate_task: comm=khugepaged pid=48 prio=139 orig_cpu=2 dest_cpu=0
<idle>-0 [000] d.s4 753.772209: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=3 dest_cpu=0
kworker/u8:5-113 [001] d..3 754.947065: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=0 dest_cpu=3
bash-2390 [000] d.s3 754.956107: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=1
bash-2390 [000] d..1 754.959938: console: [ 754.773306] smpboot: Booting Node 0 Processor 2 APIC 0x4
migration/1-15 [001] d..4 754.968577: sched_migrate_task: comm=bash pid=2390 prio=120 orig_cpu=0 dest_cpu=1
cpuhp/2-20 [002] d..1 754.970747: console: [ 754.784112] TEST: cpu 2 inactive, starting on 0 and migrating (active/online=0-1,3/0-3)
cpuhp/2-20 [002] d..2 754.981195: sched_migrate_task: comm=test pid=2635 prio=120 orig_cpu=3 dest_cpu=0
kworker/0:2-406 [000] d..2 754.990133: wq_worker_sleeping: here
cpuhp/2-20 [002] d..1 755.086101: console: [ 754.899471] TEST: test_last_cpu=0 cpus_allowed=0
cpuhp/2-20 [002] d..1 755.091957: console: [ 754.905328] TEST: migrating to inactve cpu 2
cpuhp/2-20 [002] d..3 755.097453: sched_migrate_task: comm=test pid=2635 prio=120 orig_cpu=0 dest_cpu=2
cpuhp/2-20 [002] d..1 755.206090: console: [ 755.019460] TEST: test_last_cpu=2 cpus_allowed=2
cpuhp/2-20 [002] dN.3 755.213683: sched_migrate_task: comm=kworker/2:3 pid=118 prio=120 orig_cpu=0 dest_cpu=2
cpuhp/2-20 [002] dN.3 755.216126: sched_migrate_task: comm=bash pid=2390 prio=120 orig_cpu=1 dest_cpu=0
kworker/2:3-118 [002] d..2 755.216143: wq_worker_sleeping: here
bash-2390 [000] d..3 755.216386: sched_migrate_task: comm=kworker/u8:5 pid=113 prio=120 orig_cpu=1 dest_cpu=2
kworker/u8:5-113 [002] d..3 755.216425: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=3 dest_cpu=2
systemd-udevd-2636 [001] d.s4 755.217088: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=2
systemd-udevd-2636 [001] d..4 755.217613: sched_migrate_task: comm=libvirtd pid=2197 prio=120 orig_cpu=0 dest_cpu=2
systemd-udevd-2637 [000] d..5 755.218850: sched_migrate_task: comm=systemd-udevd pid=1877 prio=120 orig_cpu=3 dest_cpu=0
systemd-udevd-1877 [000] d..7 755.220192: sched_migrate_task: comm=systemd-udevd pid=2637 prio=120 orig_cpu=0 dest_cpu=1
systemd-udevd-1877 [000] d..7 755.220212: sched_migrate_task: comm=systemd-udevd pid=2636 prio=120 orig_cpu=1 dest_cpu=2
<idle>-0 [000] d.s4 755.223118: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=2 dest_cpu=0
<idle>-0 [002] d.s4 755.225101: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=2
systemd-journal-1839 [000] d..3 755.280473: sched_migrate_task: comm=in:imjournal pid=2152 prio=120 orig_cpu=0 dest_cpu=2
<idle>-0 [000] d.s8 755.722735: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=2 dest_cpu=0
<idle>-0 [000] d.s4 756.334014: sched_migrate_task: comm=kworker/u8:5 pid=113 prio=120 orig_cpu=2 dest_cpu=0
sshd-2389 [000] d..3 756.350661: sched_migrate_task: comm=kworker/u8:5 pid=113 prio=120 orig_cpu=0 dest_cpu=1
<idle>-0 [000] d.s4 756.831984: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=2 dest_cpu=0
<idle>-0 [001] d.s3 756.845975: sched_migrate_task: comm=jbd2/dm-0-8 pid=1754 prio=120 orig_cpu=3 dest_cpu=1
bash-2390 [000] d.s3 756.846974: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=0 dest_cpu=1
<idle>-0 [000] d.s5 756.875008: sched_migrate_task: comm=jbd2/dm-0-8 pid=1754 prio=120 orig_cpu=1 dest_cpu=0
kworker/0:2-406 [000] d..2 757.038044: wq_worker_sleeping: here
kworker/u8:5-113 [001] d..3 758.142638: sched_migrate_task: comm=sshd pid=2389 prio=120 orig_cpu=0 dest_cpu=2
bash-2645 [001] dNs4 758.143899: sched_migrate_task: comm=rcu_sched pid=8 prio=120 orig_cpu=1 dest_cpu=2

Only the kworker/0:2-406 and kworker/2:3-113 ran the warnon path, and
was fine.

-- Steve