Re: Broken/incomplete `/proc/vmcore`

From: Bhupesh Sharma
Date: Mon Aug 19 2019 - 15:22:39 EST


Hi Paul,

On Mon, Aug 19, 2019 at 1:59 PM Paul Menzel <pmenzel@xxxxxxxxxxxxx> wrote:
>
> Dear Linux folks,
>
>
> With Linux 4.19.57 (configuration attached), after crashing the system
> and booting the same kernel as the crash kernel, the resulting
> `/proc/vmcore` seems to be incomplete.
>
> GDB commands that work with `/proc/kcore` do not work with
> `/proc/vmcore`, and the addresses are not there.
>
> In the running system, iterating through the tasks works.
>
> ```
> macro define offsetof(type, member) ((size_t)(&((type *)0)->member))
> macro define container_of(ptr,type,member) ((type *)((size_t)ptr-offsetof(type,member)))
> ```
>
> ### /proc/kcore ###
>
> ```
> Core was generated by `BOOT_IMAGE=/boot/bzImage-4.19.57.mx64.286 root=LABEL=root ro crashkernel=512M c'.
> #0 0x0000000000000000 in irq_stack_union ()
> (gdb) source gdb-macros.txt
> (gdb) set $t=&init_task
> (gdb) print $t->tasks
> $1 = {next = 0xffff889ffbb0f080, prev = 0xffff88bff9b09300}
> (gdb) print $t->pid
> $2 = 0
> (gdb) set $t=container_of($t->tasks->next,struct task_struct,tasks)
> (gdb) print $t->tasks
> $3 = {next = 0xffff889ffbb0e340, prev = 0xffffffff82411a80 <init_task+768>}
> (gdb) print $t->pid
> $4 = 1
> (gdb) set $t=container_of($t->tasks->next,struct task_struct,tasks)
> (gdb) print $t->tasks
> $5 = {next = 0xffff889ffbb530c0, prev = 0xffff889ffbb0f080}
> (gdb) print $t->pid
> $6 = 2
> ```
>
> ### /proc/vmcore ###
>
> After triggering the crash via SysRq, the values in `/proc/vmcore` are incorrect.
>
> ```
> (gdb) set $t=&init_task
> (gdb) print $t->tasks
> $1 = {next = 0xffff889ffbb0f080, prev = 0xffff88bff9b09300}
> (gdb) print $t->pid
> $2 = 0
> (gdb) set $t=container_of($t->tasks->next,struct task_struct,tasks)
> (gdb) print $t->tasks
> $3 = {next = 0x0 <irq_stack_union>, prev = 0x0 <irq_stack_union>}
> (gdb) print $t->pid
> $4 = 0
> ```
>
> We can reproduce this in a virtual machine and on a big server.

Looking at the attached config file, the underlying arch seems to be
x86_64, but a few details are missing from your email that would help
in suggesting solutions:

1. Can you please share the bootargs provided to the kdump kernel?
2. Please share the kexec-tools version that you are using:
   $ kexec --version
3. Do you notice any specific warning/error messages on the console
   when the second (kdump) kernel executes? Better still, please share
   a snippet of the second kernel's console messages; it will
   further help in suggesting debug points for this issue.
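
If it is easier, the details requested above could be collected in one
go with something like the sketch below. It is only a rough sketch: the
function name collect_kdump_info is made up for illustration, and it
assumes a POSIX shell with /proc and /sys mounted and that dmesg is
readable by the calling user. Running it in the first kernel shows the
crashkernel= reservation; running it again from a shell in the kdump
kernel (if one is reachable) shows the bootargs that kernel received.

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): print the details
# requested above for whichever kernel is currently running.

collect_kdump_info() {
    echo "== kernel command line =="
    # /proc/cmdline holds the bootargs of the currently running kernel
    cat /proc/cmdline 2>/dev/null || echo "(unavailable)"

    echo "== kexec-tools version =="
    if command -v kexec >/dev/null 2>&1; then
        kexec --version
    else
        echo "(kexec not found in PATH)"
    fi

    echo "== crash kernel state =="
    # kexec_crash_loaded is 1 when a crash kernel is loaded;
    # kexec_crash_size is the reserved crash memory in bytes
    for f in /sys/kernel/kexec_crash_loaded /sys/kernel/kexec_crash_size; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        else
            echo "$f: (not readable)"
        fi
    done

    echo "== kernel messages mentioning kdump, kexec, crashkernel or vmcore =="
    dmesg 2>/dev/null | grep -i -E 'kdump|kexec|crashkernel|vmcore' \
        || echo "(no matching messages, or dmesg not permitted)"
}

collect_kdump_info
```

Redirecting the output to a file (for example
`sh collect.sh > kdump-info.txt`, where both file names are arbitrary)
makes it easy to attach to a reply.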

Thanks,
Bhupesh