Re: [PATCH v3 1/3] kexec: Move vmcoreinfo out of the kernel's .bss section

From: Michael Holzheu
Date: Wed Mar 22 2017 - 16:48:45 EST


Am Wed, 22 Mar 2017 12:30:04 +0800
schrieb Dave Young <dyoung@xxxxxxxxxx>:

> On 03/21/17 at 10:18pm, Eric W. Biederman wrote:
> > Dave Young <dyoung@xxxxxxxxxx> writes:
> >

[snip]

> > > I think makedumpfile is using it, but I also vote to remove the
> > > CRASHTIME. It is better not to do this while crashing and a makedumpfile
> > > userspace patch is needed to drop the use of it.
> > >
> > >>
> > >> As we are looking at reliability concerns removing CRASHTIME should make
> > >> everything in vmcoreinfo a boot time constant. Which should simplify
> > >> everything considerably.
> > >
> > > It is a nice improvement..
> >
> > We also need to take a close look at what s390 is doing with vmcoreinfo.
> > As apparently it is reading it in a different kind of crashdump process.
>
> Yes, need careful review from s390 and maybe ppc64 especially about
> patch 2/3, better to have comments from IBM about s390 dump tool and ppc
> fadump. Added more cc.

On s390 we have at least an issue with patch 1/3. For stand-alone dump
and also because we create the ELF header for kdump in the new
kernel we save the pointer to the vmcoreinfo note in the old kernel on a
defined memory address in our absolute zero lowcore.

This is done in arch/s390/kernel/setup.c:

static void __init setup_vmcoreinfo(void)
{
mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note());
}

Since with patch 1/3 paddr_vmcoreinfo_note() returns NULL at this point in
time we have a problem here.

To solve this - I think - we could move the initialization to
arch/s390/kernel/machine_kexec.c:

void arch_crash_save_vmcoreinfo(void)
{
VMCOREINFO_SYMBOL(lowcore_ptr);
VMCOREINFO_SYMBOL(high_memory);
VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS);
mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note());
}

Probably related to this is my observation that patch 3/3 leads to
an empty VMCOREINFO note for kdump on s390. The note is there ...

# readelf -n /var/crash/127.0.0.1-2017-03-22-21:14:39/vmcore | grep VMCORE
VMCOREINFO 0x0000068e Unknown note type: (0x00000000)

But it contains only zeros.

Unfortunately I have not yet understood the reason for this.

Michael