Re[2]: pty_chars_in_buffer NULL pointer (kernel oops)

From: nuclearcat
Date: Sun Feb 27 2005 - 09:54:15 EST


Dear Marcelo,

You wrote Saturday, February 26, 2005, 1:04:32 AM:

Sorry about the delay; I had switched the kernel to non-SMP mode.
I cannot debug on this kernel (it is a loaded VPN server, and there is no
redundancy for now).
I have only a few old oopses, saved earlier (this one is from the old Red Hat kernel):
Feb 16 06:44:41 nss kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Feb 16 06:44:41 nss kernel: printing eip:
Feb 16 06:44:41 nss kernel: 00000000
Feb 16 06:44:41 nss kernel: *pde = 00000000
Feb 16 06:44:41 nss kernel: Oops: 0000
Feb 16 06:44:41 nss kernel: cls_u32 sch_sfq sch_cbq ip_nat_ftp ip_conntrack_ftp tun ipt_REJECT ipt_REDIRECT nls_iso8859-1 loop cipcb ip_gre ipip ppp_async pp
Feb 16 06:44:41 nss kernel: CPU: 3
Feb 16 06:44:41 nss kernel: EIP: 0060:[<00000000>] Not tainted
Feb 16 06:44:41 nss kernel: EFLAGS: 00010286
Feb 16 06:44:41 nss kernel:
Feb 16 06:44:41 nss kernel: EIP is at [unresolved] (2.4.20-20.9smp)
Feb 16 06:44:41 nss kernel: eax: d4b26000 ebx: ce7fe000 ecx: c01997c0 edx: ef7c6b80
Feb 16 06:44:41 nss kernel: esi: 00000000 edi: ce7fe000 ebp: e040a880 esp: cfdf7ee0
Feb 16 06:44:41 nss kernel: ds: 0068 es: 0068 ss: 0068
Feb 16 06:44:41 nss kernel: Process pptpctrl (pid: 15960, stackpage=cfdf7000)
Feb 16 06:44:41 nss kernel: Stack: c019d839 d4b26000 00000000 c019b2e6 ce7fe000 ce7fe974 cfdf7f48 ce7fe000
Feb 16 06:44:41 nss kernel: e040a880 00000004 00000010 c0197a15 ce7fe000 e040a880 00000000 00000000
Feb 16 06:44:41 nss kernel: e040a880 c01662b7 e040a880 00000000 cfdf6000 00000145 cfdf6000 00001962
Feb 16 06:44:41 nss kernel: Call Trace: [<c019d839>] pty_chars_in_buffer [kernel] 0x39 (0xcfdf7ee0))
Feb 16 06:44:41 nss kernel: [<c019b2e6>] normal_poll [kernel] 0x106 (0xcfdf7eec))
Feb 16 06:44:41 nss kernel: [<c0197a15>] tty_poll [kernel] 0x85 (0xcfdf7f0c))
Feb 16 06:44:41 nss kernel: [<c01662b7>] do_select [kernel] 0x247 (0xcfdf7f24))
Feb 16 06:44:41 nss kernel: [<c016663e>] sys_select [kernel] 0x34e (0xcfdf7f60))
Feb 16 06:44:41 nss kernel: [<c01098cf>] system_call [kernel] 0x33 (0xcfdf7fc0))
Feb 16 06:44:41 nss kernel:
Feb 16 06:44:41 nss kernel:
Feb 16 06:44:41 nss kernel: Code: Bad EIP value.



On the new kernel the oops contains no debug messages showing where the
problem is, but it looks very similar:

Feb 17 13:13:54 nss kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Feb 17 13:13:54 nss kernel: printing eip:
Feb 17 13:13:54 nss kernel: 00000000
Feb 17 13:13:54 nss kernel: *pde = 00000000
Feb 17 13:13:54 nss kernel: Oops: 0000
Feb 17 13:13:54 nss kernel: CPU: 3
Feb 17 13:13:54 nss kernel: EIP: 0010:[<00000000>] Not tainted
Feb 17 13:13:54 nss kernel: EFLAGS: 00010286
Feb 17 13:13:54 nss kernel: eax: ec32e000 ebx: f6891000 ecx: c01f56a0 edx: d6547980
Feb 17 13:13:54 nss kernel: esi: f6891000 edi: 00000000 ebp: f39c4c00 esp: d66bbed8
Feb 17 13:13:54 nss kernel: ds: 0018 es: 0018 ss: 0018
Feb 17 13:13:54 nss kernel: Process pptpctrl (pid: 10632, stackpage=d66bb000)
Feb 17 13:13:54 nss kernel: Stack: c01f9829 ec32e000 00000000 c01f7e66 f6891000 00000010 00000202 f68910c0
Feb 17 13:13:54 nss kernel: f6891000 f39c4c00 00000000 c01f3bb0 f6891000 f39c4c00 00000000 00000000
Feb 17 13:13:54 nss kernel: f39c4c00 00000004 00000010 c0153a87 f39c4c00 00000000 d66ba000 00000145
Feb 17 13:13:54 nss kernel: Call Trace: [<c01f9829>] [<c01f7e66>] [<c01f3bb0>] [<c0153a87>] [<c0153e0e>]
Feb 17 13:13:54 nss kernel: [<c010ae99>] [<c0108f67>]
Feb 17 13:13:54 nss kernel:
Feb 17 13:13:54 nss kernel: Code: Bad EIP value.


And the problem disappears on a non-SMP kernel.


> Hi,

> On Fri, Feb 18, 2005 at 10:56:53AM +0200, nuclearcat wrote:

>> Is the bug discussed at
>> http://kerneltrap.org/mailarchive/1/message/12508/thread
>> fixed in the 2.4.x tree? I have downloaded 2.4.29, and it seems it
>> is not fixed (the kernel on my VPN server still crashes almost at start).
>> I quickly grepped the pre and bk patches, but didn't find any fixes
>> related to tty/pty.

> Can you please post the oops? Have you done so already?

> What makes you think it is the same race discussed in the above thread?

> BTW, I fail to see any drivers/char/pty.c change related to the race which triggers
> the pty_chars_in_buffer->0 oops.

> Quoting the first message from thread you mention:
> "That last call trace entry is the call in pty_chars_in_buffer() to

> /* The ldisc must report 0 if no characters available to be read */
> count = to->ldisc.chars_in_buffer(to);
> "
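
For illustration only: an EIP of 00000000 with a return address inside
pty_chars_in_buffer is what one would expect if the chars_in_buffer
function pointer quoted above were NULL at the moment of the call. The
user-space C sketch below shows a guarded form of that indirect call. The
struct layouts and the names demo_chars_in_buffer and
guarded_chars_in_buffer are hypothetical simplifications, not the kernel's
definitions and not the patch from the thread. A NULL check alone would
not close a real SMP race without proper locking.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's tty structures. */
struct tty;

struct ldisc_ops {
    /* Callback that reports queued characters; may be NULL. */
    int (*chars_in_buffer)(struct tty *tty);
};

struct tty {
    struct ldisc_ops ldisc;
    struct tty *link;   /* other end of the pty pair */
    int pending;        /* characters queued on this end (demo only) */
};

/* Example line-discipline callback used only by this sketch. */
static int demo_chars_in_buffer(struct tty *tty)
{
    return tty->pending;
}

/*
 * Guarded form of the indirect call "to->ldisc.chars_in_buffer(to)".
 * The pointer is copied to a local first, so the value that is tested
 * is the same value that is called. Returns 0 when there is no peer
 * or no callback, instead of jumping to address 0.
 */
static int guarded_chars_in_buffer(struct tty *tty)
{
    struct tty *to = tty->link;
    int (*fn)(struct tty *);

    if (to == NULL)
        return 0;

    fn = to->ldisc.chars_in_buffer;
    if (fn == NULL)
        return 0;

    return fn(to);
}
```

With a peer whose callback is NULL, guarded_chars_in_buffer() returns 0;
with demo_chars_in_buffer installed it returns the peer's pending count.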

> Alan, Linus, what correction for the race which the above thread discusses has
> been deployed?

>> The patch from Linus provided in the thread works, but after a night I
>> checked the server and saw that the load average had jumped to 700.
>> Can anybody help with that? I am not a kernel guru who can provide a patch,
>> but from searching Google it seems to be a current problem for people who
>> run poptop VPN servers; it causes serious instability on those
>> servers.

> Can you compile a list of such v2.4 reports?
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/



--
With best regards,
GlobalProof Globax Division Manager,
Denys Fedoryshchenko
mailto:denys@xxxxxxxxxxxxxxx
