Re: R8169: Network lockups in 4.18.{8,9,10} (and 4.19 dev)

From: Chris Clayton
Date: Wed Oct 10 2018 - 04:51:35 EST


Sorry, I forgot that editing r8169.c and rebuilding would result in rc7+, so I tested the wrong kernel/module to get the
results I provided below. That, however, may make the results more interesting because they happened with a virgin rc7
kernel/module.

I'll test your proposals properly later.

Chris

On 10/10/2018 09:09, Chris Clayton wrote:
>
>
> On 10/10/2018 01:24, Maciej S. Szmigiero wrote:
>> On 09.10.2018 22:36, Heiner Kallweit wrote:
>>> On 09.10.2018 16:40, Chris Clayton wrote:
>>>> Thanks to Maciej and Heiner for their replies.
>>>>
>>>> On 09/10/2018 13:32, Maciej S. Szmigiero wrote:
>>>>> On 07.10.2018 21:36, Chris Clayton wrote:
>>>>>> Hi again,
>>>>>>
>>>>>> I didn't think there was anything in 4.19-rc7 to fix this regression, but tried it anyway. I can confirm that the
>>>>>> regression is still present and my network still fails when, after a resume from suspend (to ram or disk), I open my
>>>>>> browser or my mail client. In both those cases the failure is almost immediate - e.g. my home page doesn't get displayed
>>>>>> in the browser. Pinging one of my ISPs name servers doesn't fail quite so quickly but the reported time increases from
>>>>>> 14-15ms to more than 1000ms.
>>>>>
>>>>> You can try comparing chip registers (ethtool -d eth0) in the working
>>>>> state (before a suspend) and in the broken state (after a resume).
>>>>> Maybe there will be some obvious in the difference.
>>>>>
>>>>> The same goes for the PCI configuration (lspci -d :8168 -vv).
>>>>>
>>>> Maciej suggested comparing the output from lspci -vv for the ethernet device. They are identical.
>>>>
>>>> Both Maciej and Heiner suggested comparing the output from "ethtool -d" pre and post suspend. Again, they are identical.
>>>> Heiner specifically suggested looking at the RxConfig. The value of that is 0x0002870e both pre and post suspend.
>>>>
>>> Hmm, this is very weird, especially taking into account that in your original
>>> report you state that removing the call to rtl_init_rxcfg() from rtl_hw_start()
>>> fixes the issue. rtl_init_rxcfg() deals with the RxConfig register only and
>>> register values seem to be the same before and after resume. So how can the
>>> chip behave differently?
>>> So far my best guess is that some chip quirk causes it to accept writes to
>>> register RxConfig, but to misinterpret or ignore the written value.
>>> So far your report is the only one (affecting RTL8411), but we don't know
>>> whether other chip versions are affected too.
>>
>> Also, it is interesting that even if one removes a call to
>> rtl_init_rxcfg() from rtl_hw_start() the RxConfig register will still get
>> written to moments later by rtl_set_rx_mode().
>>
>> The only chip accesses in the meantime seems to be a write to TxConfig by
>> rtl_set_tx_config_registers() and then a read of RxConfig plus two writes
>> to MAR0 earlier in rtl_set_rx_mode().
>>
>> My proposals are:
>> 1) Try swapping "rtl_init_rxcfg(tp);" and "rtl_set_tx_config_registers(tp);"
>> in rtl_hw_start().
>> Maybe the chip does not like sometimes that RxConfig is written before
>> TxConfig.
>>
> After testing your first proposal, which made no difference, I founf the following in dmesg in the output from dmesg:
>
> [ 761.999468] ------------[ cut here ]------------
> [ 761.999471] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> [ 761.999483] WARNING: CPU: 0 PID: 8938 at net/sched/sch_generic.c:461 dev_watchdog+0x1e9/0x1f0
> [ 761.999484] Modules linked in: btusb btintel r8169 rfcomm bnep iptable_filter xt_conntrack iptable_nat ipt_MASQUERADE
> nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv4 uvcvideo videobuf2_vmalloc videobuf2_memops snd_hda_codec_via
> videobuf2_v4l2 snd_hda_codec_hdmi snd_hda_codec_generic videobuf2_common usbhid realtek coretemp snd_hda_intel hwmon
> snd_hda_codec x86_pkg_temp_thermal snd_hwdep libphy snd_hda_core [last unloaded: btintel]
> [ 761.999503] CPU: 0 PID: 8938 Comm: kworker/0:0 Not tainted 4.19.0-rc7 #328
> [ 761.999504] Hardware name: Notebook W65_67SZ /W65_67SZ
> , BIOS 1.03.05 02/26/2014
> [ 761.999508] Workqueue: events rtl_task [r8169]
> [ 761.999510] RIP: 0010:dev_watchdog+0x1e9/0x1f0
> [ 761.999512] Code: 00 48 63 4d e8 eb 99 4c 89 ef c6 05 b6 13 a6 00 01 e8 1b c7 fd ff 89 d9 4c 89 ee 48 c7 c7 40 53 e1
> 81 48 89 c2 e8 ae f4 a3 ff <0f> 0b eb c0 0f 1f 00 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00
> [ 761.999513] RSP: 0018:ffff88040f803e98 EFLAGS: 00010282
> [ 761.999514] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
> [ 761.999516] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff88040f8153d0
> [ 761.999517] RBP: ffff88040ca9a3b8 R08: ffffffff813565f0 R09: 000000000000034e
> [ 761.999517] R10: 0000000000000007 R11: 0000000000000000 R12: ffff88040ca9a39c
> [ 761.999518] R13: ffff88040ca9a000 R14: 0000000000000001 R15: ffff8803ea17cc80
> [ 761.999520] FS: 0000000000000000(0000) GS:ffff88040f800000(0000) knlGS:0000000000000000
> [ 761.999521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 761.999522] CR2: 00007f67280206b8 CR3: 000000000200a002 CR4: 00000000001606f0
> [ 761.999523] Call Trace:
> [ 761.999525] <IRQ>
> [ 761.999527] ? qdisc_reset+0xe0/0xe0
> [ 761.999529] ? qdisc_reset+0xe0/0xe0
> [ 761.999532] call_timer_fn+0x11/0x70
> [ 761.999534] expire_timers+0x8e/0xa0
> [ 761.999535] run_timer_softirq+0x7e/0x150
> [ 761.999538] ? __hrtimer_run_queues+0x12b/0x1a0
> [ 761.999541] ? recalibrate_cpu_khz+0x10/0x10
> [ 761.999543] ? ktime_get+0x32/0x90
> [ 761.999546] ? lapic_next_event+0x20/0x20
> [ 761.999549] __do_softirq+0xcc/0x1fc
> [ 761.999552] irq_exit+0x82/0xb0
> [ 761.999554] smp_apic_timer_interrupt+0x61/0x90
> [ 761.999556] apic_timer_interrupt+0xf/0x20
> [ 761.999557] </IRQ>
> [ 761.999560] RIP: 0010:rtl_slow_event_work+0x2a/0x1f0 [r8169]
> [ 761.999562] Code: 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 10 4c 8b 67 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08
> 31 c0 48 8b 07 66 8b 68 3e <66> 23 af da 0d 00 00 48 8b 07 66 89 68 3e 40 f6 c5 40 0f 85 3b 01
> [ 761.999563] RSP: 0018:ffffc900014d7e40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> [ 761.999564] RAX: ffffc900000b9000 RBX: ffff88040ca9a7c0 RCX: ffff88040f81f160
> [ 761.999565] RDX: ffff8803ea21b300 RSI: 0000000000000000 RDI: ffff88040ca9a7c0
> [ 761.999566] RBP: ffff88040ca90050 R08: 0000000000000000 R09: 000073746e657665
> [ 761.999567] R10: 8080808080808080 R11: ffff88040f81ea68 R12: ffff88040ca9a000
> [ 761.999568] R13: ffff88040ca9a000 R14: ffff88040f81f140 R15: 0000000000000000
> [ 761.999571] ? __switch_to_asm+0x34/0x70
> [ 761.999573] rtl_task+0x4f/0x70 [r8169]
> [ 761.999576] process_one_work+0x1bc/0x2f0
> [ 761.999577] worker_thread+0x28/0x3c0
> [ 761.999579] ? process_one_work+0x2f0/0x2f0
> [ 761.999581] kthread+0x109/0x120
> [ 761.999583] ? kthread_park+0x80/0x80
> [ 761.999585] ret_from_fork+0x35/0x40
> [ 761.999586] ---[ end trace fd5800440feffc06 ]---
>
> I haven't seen this before, but maybe it's a consequence of swapping the order of the two functions calls.
>
> I'll work on the second proposal later today.
>
> Chris
>> 2) Check the original value of RxConfig (after a resume) before
>> rtl_init_rxcfg() overwrites it (compile tested only):
>> --- r8169.c.ori
>> +++ r8169.c
>> @@ -5155,6 +5155,9 @@
>> /* Initially a 10 us delay. Turned it into a PCI commit. - FR */
>> RTL_R8(tp, IntrMask);
>> RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
>> +
>> + pr_notice("RxConfig before init was %.8x\n",
>> + (unsigned int)RTL_R32(tp, RxConfig));
>> rtl_init_rxcfg(tp);
>> rtl_set_tx_config_registers(tp);
>>
>>
>> This should be the value that you got when you removed the call to
>> rtl_init_rxcfg() for testing.
>> Now, knowing the "right" value you can experiment with what rtl_init_rxcfg()
>> writes (under the "default:" label for your NIC model).
>>
>> Hope this helps,
>> Maciej
>>