Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

From: Fengguang Wu
Date: Tue Nov 07 2017 - 11:46:51 EST


On Tue, Nov 07, 2017 at 08:25:03AM -0800, Linus Torvalds wrote:
On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:

FYI this happens in v4.14-rc8 -- it's not necessarily a new bug.

Probably not.

Looks like a use-after-free bug in vlan_device_event() judging by the
base pointer:

ECX: 6b6b6b6b

This is one of those circumstances where having the faddr2line output
for that EIP would make it much easier to see exactly which access it
is that causes problems. There's lots of inlining going on, so without
that it's a pain to figure out.
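
As a rough illustration of the use-after-free inference above: 0x6b is
the byte the slab allocator writes over freed objects when poisoning is
enabled (POISON_FREE in include/linux/poison.h), so a register holding
6b6b6b6b was most likely loaded from memory that had already been freed.
The sketch below is standalone userspace C, not kernel code, and the
helper name looks_like_freed_slab() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects when poisoning is enabled
 * (POISON_FREE in include/linux/poison.h). */
#define POISON_FREE 0x6bu

/* Hypothetical helper: returns true if every byte of a 32-bit register
 * value equals the free-poison byte, as with ECX: 6b6b6b6b. */
static bool looks_like_freed_slab(uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        if (((value >> (8 * i)) & 0xffu) != POISON_FREE)
            return false;
    }
    return true;
}
```

Calling looks_like_freed_slab(0x6b6b6b6b) returns true. The faulting
address 6b6b6ccf itself does not match, because it is the poisoned base
plus an offset.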

The code is

   0:   31 c0                   xor    %eax,%eax
   2:   8d 76 00                lea    0x0(%esi),%esi
   5:   89 c2                   mov    %eax,%edx
   7:   89 c3                   mov    %eax,%ebx
   9:   81 e2 ff 0f 00 00       and    $0xfff,%edx
   f:   89 d1                   mov    %edx,%ecx
  11:   c1 fb 0c                sar    $0xc,%ebx
  14:   c1 e9 09                shr    $0x9,%ecx
  17:   8d 0c d9                lea    (%ecx,%ebx,8),%ecx
  1a:   8b 4c 8e 10             mov    0x10(%esi,%ecx,4),%ecx
  1e:   85 c9                   test   %ecx,%ecx
  20:   74 34                   je     0x56
  22:   81 e2 ff 01 00 00       and    $0x1ff,%edx
  28:*  8b 14 91                mov    (%ecx,%edx,4),%edx     <-- trapping instruction
  2b:   85 d2                   test   %edx,%edx
  2d:   74 27                   je     0x56
  2f:   f6 82 30 01 00 00 01    testb  $0x1,0x130(%edx)
  36:   74 1e                   je     0x56

and just going by the constants in question (0xfff and 0x1ff), I
can see that it's one of

    vlan_group_for_each_dev(..) {
        ...
    }

things, but that's pretty much all I can tell.
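
To make the constant-matching above concrete, here is a minimal
userspace sketch of the index arithmetic the disassembly performs on
the loop counter. The shifts and masks are taken directly from the
instructions. Reading them as a 4096-entry VLAN ID space split into 8
parts of 512 entries is an interpretation, and the struct, macro and
function names are all hypothetical. The kernel loop counter is signed
(hence sar); unsigned is used here for simplicity.

```c
#include <stdint.h>

/* Constants visible in the disassembly; the macro names are assumptions. */
#define VID_MASK        0xfffu  /* and $0xfff: vid   = i % 4096 */
#define VID_SHIFT       12      /* sar $0xc:   proto = i / 4096 */
#define PART_SHIFT      9       /* shr $0x9:   part  = vid / 512 */
#define PART_MASK       0x1ffu  /* and $0x1ff: slot  = vid % 512 */
#define PARTS_PER_PROTO 8       /* lea (%ecx,%ebx,8): 8 parts per proto */

struct vlan_slot {
    unsigned proto;      /* first-level index (%ebx) */
    unsigned part;       /* which 512-entry part the vid falls in */
    unsigned flat_part;  /* part + proto * 8, index of the pointer loaded at 1a */
    unsigned slot;       /* index inside the part (%edx at the trap) */
};

/* Mirror of instructions 5..22 applied to loop counter i. */
static struct vlan_slot decode_index(unsigned i)
{
    struct vlan_slot s;
    unsigned vid = i & VID_MASK;

    s.proto = i >> VID_SHIFT;
    s.part = vid >> PART_SHIFT;
    s.flat_part = s.part + s.proto * PARTS_PER_PROTO;
    s.slot = vid & PART_MASK;
    return s;
}

/* The trapping load is (%ecx,%edx,4), so slot = (fault - base) / 4. */
static uint32_t slot_from_fault(uint32_t fault_addr, uint32_t base)
{
    return (fault_addr - base) / 4u;
}
```

With the poisoned base 0x6b6b6b6b from ECX and the fault address
0x6b6b6ccf from the subject line, slot_from_fault() gives 0x164 / 4 =
89. That only pins down the low nine bits of the VLAN ID; which part
and protocol the loop was on cannot be recovered from the fault
address alone.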

Apparently we'll get that faddr2line output soon. In the meantime, I
think this is a real bug report but I don't see enough information to
really go on.

Got it. I should be able to get faddr2line output tomorrow.

Of course, if it's bisectable, that would be great too.

It looks reproducible enough to be bisectable. I'll try.

Regards,
Fengguang