Re: strace lockup when tracing exec in go

From: Mike Galbraith
Date: Thu Sep 22 2016 - 00:15:15 EST


On Wed, 2016-09-21 at 17:29 +0200, Michal Hocko wrote:
> [I am CCing the strace mailing list because, even if this turns out to be a
> kernel bug, strace seems to be doing something unexpected - more on that
> below]
>
> Hi,
> Aleksa has reported the following lockup when stracing this go
> program:
>
> % cat exec.go
> package main
>
> import (
>         "os"
>         "syscall"
> )
>
> func main() {
>         syscall.Exec("/bin/echo", []string{"/bin/echo", "Hello"}, os.Environ())
> }
> $ go version
> go version go1.6.3 linux/amd64
> $ go build -o exec exec.go
>
> $ strace -f ./exec
> [snip]
> [pid 10349] select(0, NULL, NULL, NULL, {0, 100}
> [pid 10346] <... select resumed> ) = 0 (Timeout)
> [pid 10346] select(0, NULL, NULL, NULL, {0, 20}
> [pid 10345] execve("/bin/echo", ["/bin/echo", "Hello"], [/* 95 vars */]
> [pid 10346] <... select resumed> ) = 0 (Timeout)
> [pid 10349] <... select resumed> ) = 0 (Timeout)
>
> execve will never finish unless the strace process gets killed with
> SIGKILL. The following was debugged on a 3.12 kernel; the current
> kernel does not seem to trigger the issue as easily, but I believe the
> same problem is present there as well.
>
> Further investigation has shown that the tracer (strace) is stuck
> waiting for cred_guard_mutex:
> [<0000000000000000>] mm_access+0x22/0xa0
> [<0000000000000000>] process_vm_rw_core.isra.1+0x112/0x6c0
> [<0000000000000000>] process_vm_rw+0xab/0x110
> [<0000000000000000>] SyS_process_vm_readv+0x15/0x20
> [<0000000000000000>] system_call_fastpath+0x16/0x1b
> [<0000000000000000>] 0x7f186f031c3a
> [<0000000000000000>] 0xffffffffffffffff
>
> while the traced process (3 threads) is trying to perform the exec.
> At this point 2 of the threads are already dead (in fact zombies),
> waiting in their final schedule.
> Call Trace:
> [] do_exit+0x6f7/0xa70
> [] do_group_exit+0x39/0xa0
> [] get_signal_to_deliver+0x1e8/0x5c0
> [] do_signal+0x42/0x670
> [] do_notify_resume+0x78/0xc0
> [] int_signal+0x12/0x17
> [<00007f3a33f3ffb9>] 0x7f3a33f3ffb8
>
> and one is
> Call Trace:
> [] flush_old_exec+0xdf/0x890
> [] load_elf_binary+0x307/0xda0
> [] search_binary_handler+0xae/0x1f0
> [] do_execve_common.isra.26+0x64e/0x810
> [] SyS_execve+0x31/0x50
> [] stub_execve+0x69/0xa0
> [<00007f3a33f16527>] 0x7f3a33f16526
>
> waiting for notify_count to drop down to 0
> while (sig->notify_count) {
> 	__set_current_state(TASK_KILLABLE);
> 	spin_unlock_irq(lock);
> 	schedule(); <<<<<<<<<
> 	if (unlikely(__fatal_signal_pending(tsk)))
> 		goto killed;
> 	spin_lock_irq(lock);
> }
>
> this, however, never happens, because both dead threads are zombies
> waiting to be reaped by the parent, which is what would run
> release_task->__exit_signal for them. The tracer, in turn, is blocked
> on the lock held by exec (taken in prepare_bprm_creds). AFAICS this is
> still the case in the current kernel, so the same lockup should be
> possible there as well. So is this a bug, or does something else make
> sure that it cannot happen in newer kernels?
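
The tracer side of that is easy enough to see: process_vm_readv() ends
up in mm_access(), which takes the exec'ing task's cred_guard_mutex
killably, and prepare_bprm_creds() holds that same mutex for the whole
duration of execve(). Roughly (a paraphrase from memory of the
3.12-era source, not a verbatim copy):

	struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
	{
		struct mm_struct *mm;
		int err;

		/* prepare_bprm_creds() holds this across all of execve() */
		err = mutex_lock_killable(&task->signal->cred_guard_mutex);
		if (err)
			return ERR_PTR(err);

		mm = get_task_mm(task);
		if (mm && mm != current->mm &&
				!ptrace_may_access(task, mode)) {
			mmput(mm);
			mm = ERR_PTR(-EACCES);
		}
		mutex_unlock(&task->signal->cred_guard_mutex);

		return mm;
	}

That killable sleep is also why a SIGKILL aimed at strace is enough to
unwedge everything; nothing milder will interrupt it.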

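And on the exec side, if memory serves, the only thing that ever
decrements notify_count is __exit_signal(), which runs from
release_task(), i.e. only once a zombie actually gets reaped.
Something like (again paraphrased, not verbatim):

	/* kernel/exit.c, called from release_task() */
	static void __exit_signal(struct task_struct *tsk)
	{
		struct signal_struct *sig = tsk->signal;
		...
		/*
		 * If there is any task waiting for the group exit
		 * then notify it:
		 */
		if (sig->notify_count > 0 && !--sig->notify_count)
			wake_up_process(sig->group_exit_task);
		...
	}

So the zombies sit there until their reaper gets around to them, the
reaper is parked on cred_guard_mutex, and the mutex holder is the
thread waiting for notify_count to hit zero.
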
Reproduced with master as of today...

crash> bt 5138
PID: 5138 TASK: ffff8803fb4e4c80 CPU: 6 COMMAND: "strace"
#0 [ffff88038e323b68] __schedule at ffffffff81624660
#1 [ffff88038e323bb0] schedule at ffffffff81624b35
#2 [ffff88038e323bc8] schedule_preempt_disabled at ffffffff81624e0e
#3 [ffff88038e323bd8] __mutex_lock_killable_slowpath at ffffffff81625f9d
#4 [ffff88038e323c30] mutex_lock_killable at ffffffff8162605a
#5 [ffff88038e323c48] mm_access at ffffffff8105bbf7
#6 [ffff88038e323c70] process_vm_rw_core at ffffffff811823d3
#7 [ffff88038e323dc0] process_vm_rw at ffffffff8118287d
#8 [ffff88038e323f38] sys_process_vm_readv at ffffffff811829b9
#9 [ffff88038e323f50] entry_SYSCALL_64_fastpath at ffffffff81628736
RIP: 00007faee6b2ac3a RSP: 00007ffdc95b62b8 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00000000019e7030 RCX: 00007faee6b2ac3a
RDX: 0000000000000001 RSI: 00007ffdc95b62d0 RDI: 0000000000001418
RBP: 0000000000001414 R8: 0000000000000001 R9: 0000000000000000
R10: 00007ffdc95b62e0 R11: 0000000000000246 R12: 00007faee73e66c0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: 0000000000000136 CS: 0033 SS: 002b
crash> bt 5140
PID: 5140 TASK: ffff8803e703b300 CPU: 4 COMMAND: "exec"
#0 [ffff88038e0b3ce8] __schedule at ffffffff81624660
#1 [ffff88038e0b3d30] schedule at ffffffff81624b35
#2 [ffff88038e0b3d48] flush_old_exec at ffffffff811b1f26
#3 [ffff88038e0b3d88] load_elf_binary at ffffffff811fd70f
#4 [ffff88038e0b3e60] search_binary_handler at ffffffff811b13ac
#5 [ffff88038e0b3e98] do_execveat_common at ffffffff811b2f8f
#6 [ffff88038e0b3f00] sys_execve at ffffffff811b33ba
#7 [ffff88038e0b3f28] do_syscall_64 at ffffffff8100194e
RIP: 000000000045c1ef RSP: 000000c820043e00 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 000000c82000e6c0 RCX: 000000000045c1ef
RDX: 000000c820076380 RSI: 000000c82000e6c0 RDI: 000000c82000a400
RBP: 000000c820076380 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000006b
R13: 00000000004c4b40 R14: 0000000000003fff R15: 0000000000000039
ORIG_RAX: 000000000000003b CS: 0033 SS: 002b
crash> bt 5141
PID: 5141 TASK: ffff8803b0221980 CPU: 0 COMMAND: "exec"
#0 [ffff88038e33fc60] __schedule at ffffffff81624660
#1 [ffff88038e33fca8] schedule at ffffffff81624b35
#2 [ffff88038e33fcc0] do_exit at ffffffff81061292
#3 [ffff88038e33fd28] do_group_exit at ffffffff8106247f
#4 [ffff88038e33fd58] get_signal at ffffffff8106c7fc
#5 [ffff88038e33fdf0] do_signal at ffffffff81018468
#6 [ffff88038e33fef8] exit_to_usermode_loop at ffffffff81059820
#7 [ffff88038e33ff28] do_syscall_64 at ffffffff81001a35
RIP: 00000000004538d3 RSP: 000000c820041f38 RFLAGS: 00000246
RAX: 0000000000000000 RBX: 000000c820041f70 RCX: 00000000004538d3
RDX: 0000000000000000 RSI: 000000c820041f70 RDI: 0000000000000000
RBP: ffffffffffffffff R8: 000000c820032000 R9: 000000c820000480
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000042aa30
R13: 00000000004c4bc4 R14: 0000000000000000 R15: 0000000000000008
ORIG_RAX: 0000000000000083 CS: 0033 SS: 002b
crash>