Re: general protection fault in show_timer

From: Dmitry Vyukov
Date: Thu Nov 30 2017 - 06:38:43 EST


On Thu, Nov 30, 2017 at 12:31 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Thu, Nov 30, 2017 at 12:08 PM, Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
>> On 11/30/17, syzbot
>> <bot+054c6cd125793643a90db21e4b9ddc71a881f797@xxxxxxxxxxxxxxxxxxxxxxxxx>
>> wrote:
>>> Hello,
>>>
>>> syzkaller hit the following crash on
>>> 43570f0383d6d5879ae585e6c3cf027ba321546f
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>>
>>> Unfortunately, I don't have any reproducer for this bug yet.
>>>
>>>
>>> kasan: CONFIG_KASAN_INLINE enabled
>>> kasan: GPF could be caused by NULL-ptr deref or user memory access
>>> general protection fault: 0000 [#1] SMP KASAN
>>> Dumping ftrace buffer:
>>> (ftrace buffer empty)
>>> Modules linked in:
>>> CPU: 1 PID: 22618 Comm: syz-executor4 Not tainted 4.15.0-rc1+ #199
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> Google 01/01/2011
>>> task: ffff8801c048c400 task.stack: ffff8801cd968000
>>> RIP: 0010:show_timer+0x1c7/0x2b0 fs/proc/base.c:2274
>>> RSP: 0018:ffff8801cd96f9e0 EFLAGS: 00010006
>>> RAX: dffffc0000000000 RBX: ffff8801cff22e40 RCX: ffffffff81ccb88e
>>> RDX: 0000000030a68524 RSI: ffffc90002dea000 RDI: 0000000185342920
>>> RBP: ffff8801cd96fa10 R08: ffffed003a0514c5 R09: ffffed003a0514c5
>>> R10: ffff8801c048c400 R11: ffffed003a0514c4 R12: 0000000040000000
>>> R13: ffff8801c79e7000 R14: ffffffff853419e0 R15: 0000000000000507
>>> FS: 00007f19f811e700(0000) GS:ffff8801db500000(0000)
>>> knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007f4ab586bdb8 CR3: 00000001d1788000 CR4: 00000000001426e0
>>> DR0: 0000000020000000 DR1: 0000000020000008 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>>> Call Trace:
>>> traverse+0x248/0xa00 fs/seq_file.c:111
>>> seq_read+0x96a/0x13d0 fs/seq_file.c:189
>>> do_loop_readv_writev fs/read_write.c:673 [inline]
>>> do_iter_read+0x3db/0x5b0 fs/read_write.c:897
>>> vfs_readv+0x121/0x1c0 fs/read_write.c:959
>>> do_preadv+0x11b/0x1a0 fs/read_write.c:1043
>>> SYSC_preadv fs/read_write.c:1093 [inline]
>>> SyS_preadv+0x30/0x40 fs/read_write.c:1088
>>> entry_SYSCALL_64_fastpath+0x1f/0x96
>>> RIP: 0033:0x4529d9
>>> RSP: 002b:00007f19f811dc58 EFLAGS: 00000212 ORIG_RAX: 0000000000000127
>>> RAX: ffffffffffffffda RBX: 00007f19f811d950 RCX: 00000000004529d9
>>> RDX: 0000000000000001 RSI: 00000000205e2ff0 RDI: 0000000000000013
>>> RBP: 00007f19f811d940 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0001000000000000 R11: 0000000000000212 R12: 00000000004b7346
>>> R13: 00007f19f811dac8 R14: 00000000004b7351 R15: 0000000000000000
>>> Code: 89 c7 4c 0f 44 f1 41 83 e4 fb 4d 63 e4 e8 a2 2f a3 ff 4a 8d 3c e5 20
>>> 29 34 85 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00
>>> 0f 85 a2 00 00 00 4a 8b 14 e5 20 29 34 85 4c 89 ef
>>> RIP: show_timer+0x1c7/0x2b0 fs/proc/base.c:2274 RSP: ffff8801cd96f9e0
>>
>> This looks like ASAN problem
>>
>> ffffffff81a5f1d9: e8 72 74 b7 ff call
>> ffffffff815d6650 <__sanitizer_cov_trace_pc>
>> ffffffff81a5f1de: 4a 8d 3c e5 a0 9e b4 lea rdi,[r12*8-0x7b4b6160]
>> ffffffff81a5f1e5: 84
>> ffffffff81a5f1e6: 48 b8 00 00 00 00 00 movabs rax,0xdffffc0000000000
>> ffffffff81a5f1ed: fc ff df
>> ffffffff81a5f1f0: 48 89 fa mov rdx,rdi
>> ffffffff81a5f1f3: 48 c1 ea 03 shr rdx,0x3
>> ffffffff81a5f1f7: 80 3c 02 00 ===>cmp BYTE PTR
>> [rdx+rax*1],0x0 <====
>>
>> This is code injected by KASAN_INLINE
>>
>> timer_show() looks seemingly fine:
>> "timer" pointer is valid otherwise code would oopsed earlier
>> tp->ns is valid otherwise it'd oopsed inside pid_nr_ns()
>
>
> This is not KASAN bug. Kernel tries to dereference 0x0000000185342920.
> Failure mode is just different.


Looking at the code and the disassembly:


seq_printf(m, "notify: %s/%s.%d\n",
ffffffff81ccb88e: 4a 8d 3c e5 20 29 34 lea -0x7acbd6e0(,%r12,8),%rdi
ffffffff81ccb895: 85
ffffffff81ccb896: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
ffffffff81ccb89d: fc ff df
ffffffff81ccb8a0: 48 89 fa mov %rdi,%rdx
ffffffff81ccb8a3: 48 c1 ea 03 shr $0x3,%rdx
ffffffff81ccb8a7: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
ffffffff81ccb8ab: 0f 85 a2 00 00 00 jne ffffffff81ccb953 <show_timer+0x273>
ffffffff81ccb8b1: 4a 8b 14 e5 20 29 34 mov -0x7acbd6e0(,%r12,8),%rdx
ffffffff81ccb8b8: 85
ffffffff81ccb8b9: 4c 89 ef mov %r13,%rdi
ffffffff81ccb8bc: 45 89 f8 mov %r15d,%r8d
ffffffff81ccb8bf: 4c 89 f1 mov %r14,%rcx
ffffffff81ccb8c2: 48 c7 c6 a0 1a 34 85 mov $0xffffffff85341aa0,%rsi
ffffffff81ccb8c9: e8 22 5c eb ff callq ffffffff81b814f0 <seq_printf>
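
To make the arithmetic explicit, here is a small userspace sketch of what the lea/shr/cmpb sequence above computes. The constants are copied from the disassembly and the register dump; the function and macro names are made up for illustration and do not exist in the kernel. It assumes 4-level paging (48-bit virtual addresses).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants taken from the disassembly in this report. */
#define NSTR_BASE        0xffffffff85342920ULL /* -0x7acbd6e0, sign-extended */
#define KASAN_SHADOW_OFF 0xdffffc0000000000ULL /* the movabs constant in %rax */

/* lea -0x7acbd6e0(,%r12,8),%rdi: address of nstr[index], wraps mod 2^64. */
static uint64_t nstr_slot_addr(uint64_t index)
{
	return NSTR_BASE + index * 8;
}

/* shr $0x3,%rdx; cmpb $0x0,(%rdx,%rax,1): the shadow byte the check reads. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> 3) + KASAN_SHADOW_OFF;
}

/* With 48-bit virtual addresses, bits 63..47 must be all 0 or all 1. */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}
```

With R12 = 0x40000000, nstr_slot_addr() gives 0x0000000185342920 (the RDI value in the dump), and kasan_shadow_addr() of that is 0xdffffc0030a68524, which is non-canonical. That would explain why the report is a general protection fault on the shadow check rather than a page fault on the load itself.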


static int show_timer(struct seq_file *m, void *v)
{
        struct k_itimer *timer;
        struct timers_private *tp = m->private;
        int notify;
        static const char * const nstr[] = {
                [SIGEV_SIGNAL] = "signal",
                [SIGEV_NONE] = "none",
                [SIGEV_THREAD] = "thread",
        };

        timer = list_entry((struct list_head *)v, struct k_itimer, list);
        notify = timer->it_sigev_notify;

        seq_printf(m, "ID: %d\n", timer->it_id);
        seq_printf(m, "signal: %d/%p\n",
                   timer->sigq->info.si_signo,
                   timer->sigq->info.si_value.sival_ptr);
        seq_printf(m, "notify: %s/%s.%d\n",
                   nstr[notify & ~SIGEV_THREAD_ID],
                   (notify & SIGEV_THREAD_ID) ? "tid" : "pid",
                   pid_nr_ns(timer->it_pid, tp->ns));
        seq_printf(m, "ClockID: %d\n", timer->it_clock);

        return 0;
}


It seems that notify is equal to 0x0000000040000000 (R12 in the register
dump holds the masked index), and this makes
nstr[notify & ~SIGEV_THREAD_ID] a totally wild access: nstr has only
three entries.
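
For illustration only, here is a userspace sketch of a lookup that refuses out-of-range values. It is not a proposed patch and says nothing about where the kernel should validate it_sigev_notify. The EX_ constants mirror the Linux UAPI values of SIGEV_SIGNAL, SIGEV_NONE, SIGEV_THREAD and SIGEV_THREAD_ID, and ex_notify_name() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Linux UAPI values, copied under local names so the sketch builds anywhere. */
enum {
	EX_SIGEV_SIGNAL    = 0,
	EX_SIGEV_NONE      = 1,
	EX_SIGEV_THREAD    = 2,
	EX_SIGEV_THREAD_ID = 4,
};

/* Same shape as nstr[] in show_timer(): three entries, indices 0..2. */
static const char * const ex_nstr[] = {
	[EX_SIGEV_SIGNAL] = "signal",
	[EX_SIGEV_NONE]   = "none",
	[EX_SIGEV_THREAD] = "thread",
};

/* Hypothetical helper: returns NULL instead of reading past the table. */
static const char *ex_notify_name(int notify)
{
	unsigned int idx = (unsigned int)notify & ~(unsigned int)EX_SIGEV_THREAD_ID;

	if (idx >= sizeof(ex_nstr) / sizeof(ex_nstr[0]))
		return NULL;
	return ex_nstr[idx];
}
```

With this check, ex_notify_name(0x40000000) returns NULL, while SIGEV_THREAD | SIGEV_THREAD_ID still maps to "thread".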