Re: [BUGFIX][PATCH] Freezer, CPU hotplug, x86 Microcode: Fix task freezing failures

From: Srivatsa S. Bhat
Date: Sun Oct 02 2011 - 16:04:52 EST


Hi,

On 10/03/2011 01:20 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, Oct 03, 2011 at 12:35:04AM +0530, Srivatsa S. Bhat wrote:
>> So, this patch addresses this issue by ensuring that microcode is not freed
>> from kernel memory, nor invalidated when a CPU goes offline. Thus once the
>> kernel gets the microcode during boot-up, it will never have to depend on
>> userspace ever again to get microcode, since it never releases the copy it
>> already has. So every run of the microcode callback for CPU online event will
>> now succeed irrespective of whether userspace is frozen or not. As a result,
>> this fixes the task freezing failure encountered while running CPU hotplug
>> stress test along with suspend/resume operations simultaneously.
>
> I'm not familiar with how microcode is supposed to be managed but is
> it impossible for the newly hotplugged CPU (an actual hot unplug /
> plug) may not like the microcode loaded for the previous CPU? Isn't
> that why CPU_DEAD was invalidating the microcode?
>

Actually, looking at the code, I found that a copy of the microcode
is maintained by the kernel for every CPU.
The relevant lines are:

arch/x86/include/asm/microcode.h:

struct ucode_cpu_info {
	struct cpu_signature	cpu_sig;
	int			valid;
	void			*mc;
};
extern struct ucode_cpu_info ucode_cpu_info[];


arch/x86/kernel/microcode_core.c:

struct ucode_cpu_info ucode_cpu_info[NR_CPUS];


So when a CPU goes offline and comes back online, I don't see why the kernel
should not reuse the copy of the microcode that it already has. The microcode
itself will not change in the meantime: if the kernel had freed its copy, it
would only end up requesting the very same microcode from userspace again.

So my guess is that the kernel used to invalidate the microcode on the
CPU_DEAD notification simply to free its copy as a memory optimization
(on the assumption that the microcode is no longer needed in kernel memory,
at least for now).
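
To make the argument concrete, here is a minimal userspace sketch of the two
behaviours. It is not the kernel code: every sketch_* name is hypothetical,
the structure is a cut-down stand-in for struct ucode_cpu_info (cpu_sig is
omitted), and the "userspace request" is simulated by a function that simply
fails while a frozen flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* hypothetical; stands in for NR_CPUS */

/* Cut-down stand-in for struct ucode_cpu_info (assumption, cpu_sig omitted). */
struct sketch_ucode_info {
	int	valid;
	void	*mc;
};

static struct sketch_ucode_info sketch_info[SKETCH_NR_CPUS];
static int sketch_userspace_frozen;	/* 1 while tasks are frozen */
static int sketch_userspace_requests;	/* successful round trips to userspace */

/* Simulated userspace request: it cannot be served while userspace is frozen. */
static void *sketch_request_from_userspace(void)
{
	static const char blob[] = "microcode";
	void *mc;

	if (sketch_userspace_frozen)
		return NULL;

	mc = malloc(sizeof(blob));
	if (mc) {
		memcpy(mc, blob, sizeof(blob));
		sketch_userspace_requests++;
	}
	return mc;
}

/* CPU online: reuse the cached copy if there is one, else ask userspace. */
static int sketch_cpu_online(unsigned int cpu)
{
	struct sketch_ucode_info *uci;

	assert(cpu < SKETCH_NR_CPUS);
	uci = &sketch_info[cpu];

	if (!uci->mc)
		uci->mc = sketch_request_from_userspace();
	if (!uci->mc)
		return -1;	/* no cached copy and userspace unavailable */

	uci->valid = 1;
	return 0;
}

/*
 * CPU dead: keep_copy == 0 models the old behaviour (free and invalidate),
 * keep_copy != 0 models the behaviour proposed in the patch (keep both).
 */
static void sketch_cpu_dead(unsigned int cpu, int keep_copy)
{
	struct sketch_ucode_info *uci;

	assert(cpu < SKETCH_NR_CPUS);
	uci = &sketch_info[cpu];

	if (keep_copy)
		return;

	free(uci->mc);
	uci->mc = NULL;
	uci->valid = 0;
}
```

In this model, bringing a CPU online, taking it down with keep_copy set, and
bringing it online again while sketch_userspace_frozen is 1 succeeds without a
second userspace request. With keep_copy cleared, the second online attempt
fails, which is the situation that trips up the freezer.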

This is my understanding. Please enlighten me if I am wrong.

--
Regards,
Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
Linux Technology Center,
IBM India Systems and Technology Lab