Re: mmiotracer hangs the system

From: Andy Shevchenko
Date: Tue Aug 02 2016 - 11:06:09 EST


+Cc: Karol, Ingo

On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
<andy.shevchenko@xxxxxxxxx> wrote:
> Hi!
>
> I'm trying to use mmio tracer with recent kernels (in this particular
> case today's linux-next).

Tested on other board and found that v4.5 works while v4.5.7 doesn't.
Bisecting to

commit d62a28a60562a8ba82e67e13c268245f37e796cb
Author: Karol Herbst <nouveau@xxxxxxxxxxxxxx>
Date: Thu Mar 3 02:03:11 2016 +0100

x86/mm/kmmio: Fix mmiotrace for hugepages

commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.

Reverting _helps_ for x86 and x86_64 builds.

>
> # mount -t debugfs none /sys/kernel/debug/
>
> # echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
> [ 869.673145] in mmio_trace_init
> [ 869.714170] mmiotrace: Disabling non-boot CPUs...
> [ 869.729938] Cannot set affinity for irq 169
> [ 869.735765] smpboot: CPU 1 is now offline
> [ 869.746662] mmiotrace: CPU1 is down.
> [ 869.757896] smpboot: CPU 2 is now offline
> [ 869.773572] mmiotrace: CPU2 is down.
> [ 869.781768] smpboot: CPU 3 is now offline
> [ 869.789495] mmiotrace: CPU3 is down.
> [ 869.793515] mmiotrace: enabled.
>
> # echo 1 > /sys/kernel/debug/tracing/tracing_on
> [ 869.802634] in mmio_trace_start
>
> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/unbind
> [ 883.625744] mmiotrace: Unmapping ffffc90000854000.
> [ 883.633925] mmiotrace: Unmapping ffffc90000852000.
> [ 883.644580] mmiotrace: Unmapping ffffc90000850000.
>
> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/bind
> [ 889.525125] mmiotrace: ioremap_*(0x9242e200, 0x100) = ffffc90000856200
>
> [ 910.533911] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 910.540052] (detected by 0, t=21002 jiffies, g=348, c=347, q=0)
> [ 910.546790] All QSes seen, last rcu_sched kthread activity 21002
> (4295577777-4295556775), jiffies_till_next_fqs=3, root ->qsm
> ask 0x0
> [ 910.560142] sh R running task 12336 1289 1 0x20020008
> [ 910.568055] ffffffff81e422c0 ffff88017fc03e20 ffffffff81085296
> ffff88017fc18500
> [ 910.576366] ffffffff81e422c0 ffff88017fc03e88 ffffffff810b547d
> 0000000000000000
> [ 910.584675] ffffffff810f67ec 000000000000015c 0000000000000000
> 000000000000015c
> [ 910.592980] Call Trace:
> [ 910.595715] <IRQ> [<ffffffff81085296>] sched_show_task+0xb6/0x110
> [ 910.602748] [<ffffffff810b547d>] rcu_check_callbacks+0x84d/0x850
> [ 910.609573] [<ffffffff810f67ec>] ? __acct_update_integrals+0x2c/0xb0
> [ 910.616788] [<ffffffff810c9150>] ? tick_sched_do_timer+0x30/0x30
> [ 910.623613] [<ffffffff810ba34a>] update_process_times+0x2a/0x50
> [ 910.630343] [<ffffffff810c8bb1>] tick_sched_handle.isra.12+0x31/0x40
> [ 910.637560] [<ffffffff810c9188>] tick_sched_timer+0x38/0x70
> [ 910.643902] [<ffffffff810bacba>] __hrtimer_run_queues+0xda/0x250
> [ 910.650734] [<ffffffff810bb3f3>] hrtimer_interrupt+0xa3/0x190
> [ 910.657272] [<ffffffff8103ead3>] local_apic_timer_interrupt+0x33/0x50
> [ 910.664584] [<ffffffff8103f588>] smp_apic_timer_interrupt+0x38/0x50
> [ 910.671705] [<ffffffff8190dd6f>] apic_timer_interrupt+0x7f/0x90
> [ 910.678427] <EOI> [<ffffffff814a717f>] ? intel_lpss_probe+0x7f/0x5f0
> [ 910.685739] [<ffffffff814a716b>] ? intel_lpss_probe+0x6b/0x5f0
> [ 910.692364] [<ffffffff8170e5df>] ? raw_pci_write+0x1f/0x40
> [ 910.698610] [<ffffffff8136e825>] ? pci_bus_write_config_byte+0x55/0x70
> [ 910.706022] [<ffffffff813781b1>] ? pcibios_set_master+0x51/0x80
> [ 910.712753] [<ffffffff814a7836>] intel_lpss_pci_probe+0x76/0xb0
> [ 910.719479] [<ffffffff813797e0>] local_pci_probe+0x40/0xa0
> [ 910.725719] [<ffffffff811fce44>] ? sysfs_do_create_link_sd.isra.2+0x64/0xa0
> [ 910.733617] [<ffffffff8137ab46>] pci_device_probe+0xd6/0x120
> [ 910.740058] [<ffffffff8148679f>] driver_probe_device+0x21f/0x430
> [ 910.746883] [<ffffffff81484c4f>] bind_store+0x10f/0x160
> [ 910.752836] [<ffffffff81484150>] drv_attr_store+0x20/0x30
> [ 910.758983] [<ffffffff811fc312>] sysfs_kf_write+0x32/0x40
> [ 910.765129] [<ffffffff811fb863>] kernfs_fop_write+0x113/0x190
> [ 910.771663] [<ffffffff81185343>] __vfs_write+0x23/0x120
> [ 910.777607] [<ffffffff812cfd46>] ? security_file_permission+0x36/0xb0
> [ 910.784918] [<ffffffff810998dd>] ? percpu_down_read+0xd/0x50
> [ 910.791351] [<ffffffff81186403>] vfs_write+0xb3/0x1b0
> [ 910.797103] [<ffffffff81187711>] SyS_write+0x41/0xa0
> [ 910.802758] [<ffffffff81002b2e>] do_int80_syscall_32+0x4e/0xa0
> [ 910.809389] [<ffffffff8190f2aa>] entry_INT80_compat+0x2a/0x40
> [ 910.815925] rcu_sched kthread starved for 21002 jiffies! g348 c347
> f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
> [ 910.826351] rcu_sched R running task 14896 7 2 0x00000000
> [ 910.834260] ffff88017ab7bd88 ffff880179c17080 ffff88017ab34b00
> 0000000000000000
> [ 910.842550] ffff88017ab7c000 ffff88017ab7bdd0 00000000ffffffff
> 0000000000000000
> [ 910.850836] ffff88017fc0ec40 ffff88017ab7bda0 ffffffff81909370
> 000000010008feaa
> [ 910.859118] Call Trace:
> [ 910.861852] [<ffffffff81909370>] schedule+0x30/0x80
> [ 910.867408] [<ffffffff8190c3b9>] schedule_timeout+0x209/0x410
> [ 910.873938] [<ffffffff810b8760>] ? init_timer_key+0xa0/0xa0
> [ 910.880267] [<ffffffff81097aab>] ? prepare_to_swait+0x5b/0x80
> [ 910.886793] [<ffffffff810b3e09>] rcu_gp_kthread+0x479/0x800
> [ 910.893124] [<ffffffff810b3990>] ? call_rcu_sched+0x20/0x20
> [ 910.899458] [<ffffffff81079f54>] kthread+0xc4/0xe0
> [ 910.904917] [<ffffffff8190d3cf>] ret_from_fork+0x1f/0x40
> [ 910.910961] [<ffffffff81079e90>] ? kthread_worker_fn+0x160/0x160
>
>
> Is it bug in the driver or somewhere else?
>
> --
> With Best Regards,
> Andy Shevchenko



--
With Best Regards,
Andy Shevchenko