Re: [PATCH] exit: dump thread info on global init exit

From: chenqiwu
Date: Thu Nov 09 2023 - 21:27:28 EST


On Thu, Nov 09, 2023 at 12:01:30PM +0100, Oleg Nesterov wrote:
> I've just noticed we discuss this offlist. Add lkml...
>
> On 11/09, chenqiwu wrote:
> >
> > On Wed, Nov 08, 2023 at 10:57:32AM +0100, Oleg Nesterov wrote:
> >
> > > > + if (mmap_read_lock_killable(mm)) {
> > >
> > > why do you need _killable ?
> > >
> > I'm not sure which type lock (killable or unkillable) should be used here
>
> killable should be used to allow to kill the task which waits for this lock.
> Who can kill the global init? Yes it is possible (but very unlikely) that
> fatal_signal_pending() is true, but I don't think this was your concern.
>
> > if there is a lock contention, perhaps using down_read_trylock is better.
>
> Perhaps. If we have another bug mmap_read_lock() can hang forever.
>
Yes, but we really don't want to hang here forever if we cannot get the mmap read
lock. I think using down_read_trylock should be better, I wiil resend the patch as
V2 cc LKML to discuss the thing.

> > > > +static void dump_thread_info(struct task_struct *tsk)
> > > > +{
> > > > + struct pt_regs *regs = task_pt_regs(tsk);
> > > > +
> > > > + if (user_mode(regs))
> > > > + dump_thread_maps_info(tsk);
> > > > + show_regs(regs);
> > >
> > > This looks confusing to me...
> > >
> > > How can user_mode() return false in this case? And even if this is
> > > possible, then show_regs() should depend on user_mode() as well?
> > > I must have missed something.
> > >
> > Sure, the last global init thread cannot be exited in non-user mode.
>
> Forgot to mention... panic() should dump the regs, so I think show_regs()
> is not needed?
>
In fact, panic don't dump the regs. For example, the current kill init panic log from Andriod system:
[ 4.706445][T400001] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f0000
[ 4.706459][T400001] CPU: 4 PID: 1 Comm: init Tainted: G S W 5.10.136-android12-9-00005-gf9a66cbe7091-ab9177899 #1
[ 4.706464][T400001] Hardware name: MTxxxx/TCZA (DT)
[ 4.706469][T400001] Call trace:
[ 4.706482][T400001] dump_backtrace.cfi_jt+0x0/0x8
[ 4.706493][T400001] dump_stack_lvl+0xc4/0x140
[ 4.706504][T400001] panic+0x178/0x464
[ 4.706511][T400001] do_exit+0xb30/0xf9c
[ 4.706517][T400001] do_group_exit+0x130/0x1c8
[ 4.706523][T400001] do_group_exit+0x0/0x1c8
[ 4.706529][T400001] __do_sys_exit_group+0x0/0x18
[ 4.706535][T400001] __se_sys_exit_group+0x0/0x14
[ 4.706543][T400001] el0_svc_common+0xd4/0x270
[ 4.706551][T400001] el0_svc+0x28/0x88
[ 4.706559][T400001] el0_sync_handler+0x8c/0xf0
[ 4.706567][T400001] el0_sync+0x1b4/0x1c0
[ 4.706576][T400001] SMP: stopping secondary CPUs

I try to create an NULL pointer in Andriod init code with my patch. From the test log,
we can see the user registers and decode pc/lr with section info easily:
[ 4.535666] <6>.(6)[1:init]init-1: Dump maps info start
[ 4.536327] <6>.(6)[1:init]5582f61000-558308e000 r-xp 000ab000 /system/bin/init
[ 4.537252] <6>.(6)[1:init]7fa7001000-7fa7002000 r-xp 00001000 /system/lib64/libsquashfs_utils.so
[ 4.538370] <6>.(6)[1:init]7fa7048000-7fa7066000 r-xp 00008000 /system/lib64/liblzma.so
[ 4.539378] <6>.(6)[1:init]7fa7081000-7fa7082000 r-xp 00001000 /system/lib64/libpackagelistparser.so
[ 4.540590] <6>.(6)[1:init]7fa70e3000-7fa7111000 r-xp 00023000 /system/lib64/libpcre2.so
[ 4.541608] <6>.(6)[1:init]7fa7153000-7fa716f000 r-xp 00013000 /system/lib64/libjsoncpp.so
[ 4.542644] <6>.(6)[1:init]7fa7182000-7fa7183000 r-xp 00002000 /system/lib64/libcgrouprc.so
[ 4.543700] <6>.(6)[1:init]7fa71c4000-7fa71c8000 r-xp 00004000 /system/lib64/libsparse.so
[ 4.544725] <6>.(6)[1:init]7fa7207000-7fa7212000 r-xp 00007000 /system/lib64/libfec.so
[ 4.545719] <6>.(6)[1:init]7fa7241000-7fa7242000 r-xp 00001000 /system/lib64/libcrypto_utils.so
[ 4.546809] <6>.(6)[1:init]7fa72d9000-7fa734c000 r-xp 00059000 /system/lib64/libcrypto.so
[ 4.547848] <6>.(6)[1:init]7fa734c000-7fa7397000 --xp 000cc000 /system/lib64/libcrypto.so
[ 4.548873] <6>.(6)[1:init]7fa7397000-7fa7399000 r-xp 00117000 /system/lib64/libbase.so
[ 4.571715] <6>.(6)[1:init]7fa858a000-7fa85a3000 r-xp 0000a000 /system/lib64/libbacktrace.so
[ 4.573583] <6>.(6)[1:init]7fa9897000-7fa997c000 r-xp 00037000 /system/bin/linker64
[ 4.574543] <6>.(6)[1:init]init-1: Dump maps info end
[ 4.575180] <6>.(6)[1:init]CPU: 6 PID: 1 Comm: init Tainted: G S W 4.19.191-02983-ga8a102dc29ef-dirty #4
[ 4.576516] <6>.(6)[1:init]Hardware name: MTxxxxV/CZ (DT)
[ 4.577198] <6>.(6)[1:init]pstate: 60001000 (nZCv daif -PAN -UAO)
[ 4.577962] <6>.(6)[1:init]pc : 0000005582fa54b4
[ 4.578542] <6>.(6)[1:init]lr : 0000005582fa54b0
[ 4.579123] <6>.(6)[1:init]sp : 0000007fd11ba7b0
[ 4.579709] <6>.(6)[1:init]x29: 0000007fd11ba7c0 x28: 63726f662e746f6f
[ 4.580540] <6>.(6)[1:init]x27: 0000007fa8712010 x26: 0000005582ed76e0
[ 4.581371] <6>.(6)[1:init]x25: 6c616d726f6e5f65 x24: 6264696f72646e61
[ 4.582201] <6>.(6)[1:init]x23: 313d746f6f625f6c x22: b400007eb700cdf7
[ 4.583030] <6>.(6)[1:init]x21: 0000007fa8712010 x20: 0000000000000002
[ 4.583867] <6>.(6)[1:init]x19: 0000007fa998b7f8 x18: 0000007fa8852000
[ 4.584697] <6>.(6)[1:init]x17: 0000007fa743cb70 x16: 0000007fa7ea97d0
[ 4.585528] <6>.(6)[1:init]x15: 0000000000000010 x14: 0000000000000000
[ 4.586358] <6>.(6)[1:init]x13: 7265746e696f7020 x12: 0000fffffffff3ff
[ 4.587187] <6>.(6)[1:init]x11: 000000000000000d x10: 0000000000000160
[ 4.588025] <6>.(6)[1:init]x9 : 591ec1e81c3d3cdb x8 : 591ec1e81c3d3cdb
[ 4.588855] <6>.(6)[1:init]x7 : 000000000000000a x6 : 000000000000000a
[ 4.589686] <6>.(6)[1:init]x5 : 0000007fd11ba1e8 x4 : 0000000000000128
[ 4.590516] <6>.(6)[1:init]x3 : 0000007fd11ba718 x2 : 0000007e5700c190
[ 4.591345] <6>.(6)[1:init]x1 : 0000007fa74d87cc x0 : 0000007fa74d87cc
[ 5.076262] <6>-(6)[1:init]Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
[ 5.076262] <6>
[ 5.077606] <6>-(6)[1:init]SMP: stopping secondary CPUs
[ 5.087741] <6>-(6)[1:init]Kernel Offset: 0x2f87c00000 from 0xffffff8008000000
[ 5.088647] <6>-(6)[1:init]PHYS_OFFSET: 0xffffffdd40000000
[ 5.089336] <6>-(6)[1:init]Modules linked in:
[ 5.094708] <6>-(6)[1:init]CPU: 6 PID: 1 Comm: init Tainted: G S W 4.19.191-02983-ga8a102dc29ef-dirty #4
[ 5.096037] <6>-(6)[1:init]Hardware name: MTxxxxV/CZ (DT)
[ 5.096715] <6>-(6)[1:init]Call trace:
[ 5.097191] <6>-(6)[1:init] dump_backtrace+0x0/0x158
[ 5.097815] <6>-(6)[1:init] show_stack+0x14/0x1c
[ 5.098398] <6>-(6)[1:init] dump_stack+0xb8/0xf0
[ 5.098980] <6>-(6)[1:init] mrdump_common_die+0x170/0x214
[ 5.099658] <6>-(6)[1:init] ipanic+0xa8/0xd4
[ 5.100197] <6>-(6)[1:init] atomic_notifier_call_chain+0x4c/0x84
[ 5.100952] <6>-(6)[1:init] panic+0x15c/0x2a0
[ 5.101501] <6>-(6)[1:init] dump_thread_info+0x0/0x230
[ 5.102146] <6>-(6)[1:init] do_group_exit+0x88/0x9c
[ 5.102761] <6>-(6)[1:init] get_signal+0x7d0/0x980
[ 5.103364] <6>-(6)[1:init] do_notify_resume+0x164/0x2448
[ 5.104042] <6>-(6)[1:init] work_pending+0x8/0x10

Qiwu