Re: [PATCH] tty: fix flush_to_ldisc() oops before tty_open is done

From: gregkh@xxxxxxxxxxxxxxxxxxx
Date: Fri Nov 03 2017 - 10:37:38 EST


On Fri, Nov 03, 2017 at 10:48:04AM +0000, taoyuhong wrote:
> Hi Alan
>
> The serial tty oops also happens on a real ARM computer.
>
> It is a Hikey960 board, with 8 cortex-a57 CPUs and pl011 serial hardware.
> Debian 8 is pre-installed, and I replaced the kernel with linux-stable 4.13.10.
>
> It is very easy to trigger the tty oops if there is massive input to the serial port.
> I made a script that sends "reboot" repeatedly to the Hikey960 board,
> without any break, so that if the system does not oops on tty open, it
> reboots and plays the next round.
>
> After many reboots, the tty oopsed during the system shutdown sequence.
> When I tried more, it also happened during system startup.
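
For reference, a minimal sketch of the kind of host-side sender described
above. The original script was not posted, so everything here is an
assumption: the helper name flood(), the payload framing ("reboot" plus a
newline), and the idea that the host's serial device has already been
configured (for example to 115200 baud, matching console=ttyAMA3,115200).

```python
import os
import time


def flood(fd, payload=b"reboot\n", rounds=1000, delay=0.0):
    """Write `payload` to file descriptor `fd` `rounds` times.

    Hypothetical helper, not the reporter's script. Returns the total
    number of bytes written. Partial writes are retried so that every
    round delivers the whole payload.
    """
    total = 0
    for _ in range(rounds):
        view = memoryview(payload)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
            total += written
        if delay:
            # Optional pause between rounds; the report used no break.
            time.sleep(delay)
    return total
```

On a host this could be pointed at the serial adapter, e.g.
fd = os.open("/dev/ttyUSB0", os.O_WRONLY | os.O_NOCTTY) followed by
flood(fd, rounds=100000), where /dev/ttyUSB0 is a placeholder device name.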

How are things crashing on startup when you are messing with data being
sent on shutdown?

Are you still sending data at startup time?

> This is what it looks like; here is the startup tty oops:
> --------------------------------------------------------------
> EFI stub: Booting Linux Kernel...
> EFI stub: Using DTB from configuration table
> EFI stub: Exiting boot services and installing virtual address map...
> [ 0.000000] Booting Linux on physical CPU 0x0
> [ 0.000000] Linux version 4.13.10-linaro-hikey (root@tayuhong-Aspire-E1-471G) (gcc version 6.3.0 20170406 (Ubuntu/Linaro 6.3.0-12ubuntu2)) 7
> [ 0.000000] Boot CPU: AArch64 Processor [410fd033]
> [ 0.000000] Machine model: HiKey Development Board
> [ 0.000000] efi: Getting EFI parameters from FDT:
> [ 0.000000] efi: EFI v2.40 by Linaro HiKey EFI Nov 28 2015 10:50:07
> [ 0.000000] efi:
> [ 0.000000] Reserved memory: created CMA memory pool at 0x0000000072c00000, size 128 MiB
> [ 0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
> [ 0.000000] psci: probing for conduit method from DT.
> [ 0.000000] psci: PSCIv1.0 detected in firmware.
> [ 0.000000] psci: Using standard PSCI v0.2 function IDs
> [ 0.000000] psci: Trusted OS migration not required
> [ 0.000000] percpu: Embedded 24 pages/cpu @ffffffc07fe6d000 s59544 r8192 d30568 u98304
> [ 0.000000] Detected VIPT I-cache on CPU0
> [ 0.000000] CPU features: enabling workaround for ARM erratum 845719
> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 507406
> [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/Image413 console=ttyAMA3,115200 root=/dev/disk/by-partlabel/system rw
> ... ...
> Begin: Running /scripts/local-bottom ... done.
> Begin: Running /scripts/init-bottom ... done.
>
> Welcome to Debian GNU/Linux 8 (jessie)!
>
> Expecting device dev-ttyAMA3.device...
> [ OK ] Reached target Remote File Systems (Pre).
> [ OK ] Set up automount Arbitrary Executable File Formats F...utomount Point.
> ... ....
> [ OK ] Started LSB: network benchmark.
> [ OK ] Started LSB: Advanced IEEE 802.11 management daemon.
> [ OK ] Reached target Multi-User System.
> [ OK ] Reached target Graphical Interface.
> Starting Update UTMP about System Runlevel Changes...
> [ OK ] Started Update UTMP about System Runlevel Changes.
> [ 7.753576] Unable to handle kernel paging request at virtual address 00002260
> [ 7.760855] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffc07bc31000
> [ 7.760859] [0000000000002260] *pgd=0000000000000000, *pud=0000000000000000
> [ 7.760873] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [ 7.760877] Modules linked in:
> [ 7.760886] CPU: 0 PID: 54 Comm: kworker/u16:2 Tainted: G W 4.13.10-linaro-hikey #4
> [ 7.760889] Hardware name: HiKey Development Board (DT)
> [ 7.760904] Workqueue: events_unbound flush_to_ldisc
> [ 7.760908] task: ffffffc07d08f080 task.stack: ffffffc07d17c000
> [ 7.760915] PC is at n_tty_receive_buf_common+0x64/0xb18
> [ 7.760920] LR is at n_tty_receive_buf_common+0x48/0xb18
> [ 7.760924] pc : [<ffffff80084dc5dc>] lr : [<ffffff80084dc5c0>] pstate: 80000145
> .... ....
> [ 7.761190] [<ffffff80084dc5dc>] n_tty_receive_buf_common+0x64/0xb18
> [ 7.761195] [<ffffff80084dd0a0>] n_tty_receive_buf2+0x10/0x18
> [ 7.761200] [<ffffff80084dfb50>] tty_ldisc_receive_buf+0x20/0x70
> [ 7.761206] [<ffffff80084e06e0>] tty_port_default_receive_buf+0x40/0x78
> [ 7.761211] [<ffffff80084dfd64>] flush_to_ldisc+0xb4/0xc8
> [ 7.761221] [<ffffff80080d345c>] process_one_work+0x1ac/0x318
> [ 7.761226] [<ffffff80080d3610>] worker_thread+0x48/0x420
> [ 7.761233] [<ffffff80080d92d4>] kthread+0xfc/0x128
> [ 7.761239] [<ffffff8008082ec0>] ret_from_fork+0x10/0x50
> [ 7.761245] Code: b900bfbf f9405ba0 d2844c01 8b010001 (c8dffc23)
> [ 7.761251] ---[ end trace 7165822c665b3e66 ]---
> --------------------------------------------------------------------------------------
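
As a reading aid for the "Internal error: Oops: 96000005" line above, here
is a small sketch that splits an AArch64 ESR value into its fields. The
field positions (EC in bits 31:26, IL in bit 25, WnR in bit 6, DFSC in bits
5:0) follow the Arm architecture definition of ESR_ELx for data aborts; the
function names and the small lookup tables are illustrative and cover only
a few encodings.

```python
# Exception classes relevant to data aborts (subset only).
EC_NAMES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Upper four bits of DFSC for the common MMU fault types (subset only).
FAULT_KINDS = {
    0b0001: "Translation fault",
    0b0010: "Access flag fault",
    0b0011: "Permission fault",
}


def describe_dfsc(dfsc):
    """Return a short description of a Data Fault Status Code."""
    kind = FAULT_KINDS.get(dfsc >> 2)
    if kind is None:
        return "other (0x%02x)" % dfsc
    return "%s, level %d" % (kind, dfsc & 0b11)


def decode_esr(esr):
    """Split a 32-bit ESR value from a data abort into its main fields."""
    ec = (esr >> 26) & 0x3F
    dfsc = esr & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": (esr >> 25) & 1,
        "access": "write" if (esr >> 6) & 1 else "read",
        "dfsc": dfsc,
        "fault": describe_dfsc(dfsc),
    }
```

decode_esr(0x96000005) reports EC 0x25, a read access, and a level 1
translation fault, which agrees with the zero *pgd entry printed for
address 0x2260.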
>
> I can't dump the Hikey's panic memory for now, so I am not sure this is the same tty driver issue found on the virtual machine.
> So I tried many times; the oops took place by chance, as it did on the virtual machine.
> I tried putting msleep() in tty_open; then, during system startup/shutdown, the serial tty on the Hikey can oops after a few input characters, as on the virtual machine.
> Besides, it takes much more time to trigger the tty oops on kernel 4.1, and I didn't see it happen on kernel 3.18.
>
> So I am afraid this is not a virtual machine problem.
> It may be because the serial port is not frequently used by general computer users, so this problem has not been discovered.

The serial port has been used very frequently for decades :)

I don't remember anymore; did you have a proposed patch/fix for this?

thanks,

greg k-h