Re: 2.6.22.1 Oops in put_nfs_open_context

From: Andrew Morton
Date: Mon Aug 06 2007 - 14:01:55 EST


On Mon, 6 Aug 2007 11:08:13 +0100 "Dr. David Alan Gilbert" <linux@xxxxxxxxxxx> wrote:

> The oops below is from one of a pair of machines that run compiles;
> they're not managing to stay up for more than a day or two at a time, and
> this is the first time I've actually managed to capture an oops from one.
> They lock up to the point where they still ping, but they won't toggle
> capslock. A top left running on them showed pdflush using 99% CPU.
>
> Config at the bottom. The hardware is Supermicro X7DVA boards with
> 2x Xeon 5140s. (This Supermicro BIOS doesn't appear to have the PCI-Express
> coalesce option being discussed in another thread.)
>
> Dave
>
> Aug 3 19:15:41 fel kernel: [185427.633686] BUG: unable to handle kernel paging request at virtual address 00100104
> Aug 3 19:15:41 fel kernel: [185427.633691] printing eip:
> Aug 3 19:15:41 fel kernel: [185427.633693] c01fe613
> Aug 3 19:15:41 fel kernel: [185427.633694] *pde = 00000000
> Aug 3 19:15:41 fel kernel: [185427.633697] Oops: 0002 [#1]
> Aug 3 19:15:41 fel kernel: [185427.633705] SMP
> Aug 3 19:15:41 fel kernel: [185427.633712] Modules linked in: netconsole
> Aug 3 19:15:41 fel kernel: [185427.633721] CPU: 2
> Aug 3 19:15:41 fel kernel: [185427.633722] EIP: 0060:[<c01fe613>] Not tainted VLI
> Aug 3 19:15:41 fel kernel: [185427.633723] EFLAGS: 00010296 (2.6.22.1daveg #3)
> Aug 3 19:15:41 fel kernel: [185427.633735] EIP is at put_nfs_open_context+0x30/0x7d
> Aug 3 19:15:41 fel kernel: [185427.633739] eax: 00100100 ebx: d299b634 ecx: 00000001 edx: 00200200
> Aug 3 19:15:41 fel kernel: [185427.633744] esi: cd4c7240 edi: cd4c7260 ebp: e2a25d30 esp: e2a25d24
> Aug 3 19:15:42 fel kernel: [185427.633748] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> Aug 3 19:15:43 fel kernel: [185427.633752] Process ccache (pid: 8352, ti=e2a24000 task=f69c15b0 task.ti=e2a24000)
> Aug 3 19:15:43 fel kernel: [185427.633756] Stack: e469b680 d299b57c 00000029 e2a25d3c c02014dd 00000000 e2a25d68 c020475f
> Aug 3 19:15:43 fel kernel: [185427.633774] 00000001 00000000 00000000 ffffffff d299b4ac e469b680 d299b57c 00000000
> Aug 3 19:15:43 fel kernel: [185427.633791] 00000001 e2a25da0 c0205a2d 00000000 00000000 00000000 00000000 d299b4ac
> Aug 3 19:15:44 fel kernel: [185427.633809] Call Trace:
> Aug 3 19:15:44 fel kernel: [185427.633813] [<c0104034>] show_trace_log_lvl+0x19/0x2e
> Aug 3 19:15:46 fel kernel: [185427.633821] [<c01040fe>] show_stack_log_lvl+0xa1/0xa9
> Aug 3 19:15:47 fel kernel: [185427.633827] [<c0104301>] show_registers+0x1b8/0x289
> Aug 3 19:15:47 fel kernel: [185427.633833] [<c0104524>] die+0x10d/0x1d2
> Aug 3 19:15:47 fel kernel: [185427.633839] [<c0431938>] do_page_fault+0x44d/0x524
> Aug 3 19:15:47 fel kernel: [185427.633847] [<c043018a>] error_code+0x72/0x78
> Aug 3 19:15:47 fel kernel: [185427.633852] [<c02014dd>] nfs_release_request+0x20/0x2f
> Aug 3 19:15:47 fel kernel: [185427.633859] [<c020475f>] nfs_wait_on_requests_locked+0x6d/0xae
> Aug 3 19:15:47 fel kernel: [185427.633866] [<c0205a2d>] nfs_sync_mapping_wait+0x82/0x11b
> Aug 3 19:15:47 fel kernel: [185427.633872] [<c0205b23>] nfs_wb_all+0x5d/0x7b
> Aug 3 19:15:47 fel kernel: [185427.633878] [<c01fc93b>] nfs_rename+0x16b/0x275
> Aug 3 19:15:47 fel kernel: [185427.633884] [<c0168c99>] vfs_rename_other+0x65/0xaf
> Aug 3 19:15:47 fel kernel: [185427.633891] [<c0168dd0>] vfs_rename+0xed/0x20b
> Aug 3 19:15:47 fel kernel: [185427.633896] [<c016902f>] do_rename+0x141/0x182
> Aug 3 19:15:47 fel kernel: [185427.633902] [<c01690ab>] sys_renameat+0x3b/0x5d
> Aug 3 19:15:47 fel kernel: [185427.633907] [<c01690f5>] sys_rename+0x28/0x2a
> Aug 3 19:15:47 fel kernel: [185427.633913] [<c01032a6>] sysenter_past_esp+0x5f/0x85
> Aug 3 19:15:47 fel kernel: [185427.633919] =======================
> Aug 3 19:15:47 fel kernel: [185427.633922] Code: 89 c6 53 f0 ff 08 0f 94 c0 84 c0 74 66 8d 7e 20 39 7e 20 74 30 8b 46 08 8b 58 24 83 c3 6c 89 d8 e8 4c 17 23 00 8b 46 20 8b 57 04 <89> 50 04 89 02 89 d8 c7 46 20 00 01 10 00 c7 47 04 00 02 20 00
> Aug 3 19:15:47 fel kernel: [185427.634022] EIP: [<c01fe613>] put_nfs_open_context+0x30/0x7d SS:ESP 0068:e2a25d24

We did a list operation on an already-deleted list_head.
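
To illustrate that diagnosis, here is a minimal userspace sketch, assuming
the 2.6.22-era list_del() and the poison values from include/linux/poison.h.
After list_del(), an entry's next and prev hold LIST_POISON1 (0x00100100) and
LIST_POISON2 (0x00200200), which are the values in eax and edx in the oops. A
second list_del() on the same entry then stores to next->prev, which on i386
is 0x00100100 + 4, the faulting address 00100104. The helpers
list_entry_is_poisoned() and second_del_fault_addr_i386() are hypothetical
names used only for this illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Poison values as defined in include/linux/poison.h (2.6.22 era). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert item right after head. */
static void list_add(struct list_head *item, struct list_head *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

/* Mirrors the kernel's list_del(): unlink, then poison both pointers. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;	/* the store that faulted in the oops */
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Hypothetical helper: nonzero if entry has already been through list_del(). */
static int list_entry_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}

/*
 * Hypothetical helper: the first address a second list_del() would write
 * on i386, where pointers are 4 bytes and prev sits at offset 4.
 */
static uint32_t second_del_fault_addr_i386(void)
{
	return 0x00100100u + 4u;
}
```

In this sketch, list_entry_is_poisoned() returns nonzero once list_del() has
run on an entry, so a caller could detect the double delete before
dereferencing the poisoned pointers. It only models the symptom; it does not
identify which NFS path deleted the entry the first time.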

That code has changed a lot between 2.6.22 and 2.6.23-rc2. Hopefully for the
better, although I can't immediately find a commit in there which looks like
it addresses this bug.


