Re: [PATCH v2] x86/mce: Distribute the clear operation of mces_seen to Per-CPU rather than only monarch CPU

From: Hidetoshi Seto
Date: Tue May 20 2014 - 23:36:26 EST


(2014/05/21 12:19), Chen Yucong wrote:
> On Wed, 2014-05-21 at 11:43 +0900, Hidetoshi Seto wrote:
>> (2014/05/21 11:03), Chen Yucong wrote:
>>> On Wed, 2014-05-21 at 10:40 +0900, Hidetoshi Seto wrote:
>>>> (2014/05/20 11:11), Chen Yucong wrote:
>>>>> mces_seen is a per-CPU variable which should, as far as possible, only be accessed by
>>>>> its own CPU. So the clear operation of mces_seen should also be local to each CPU rather
>>>>> than done by the monarch CPU.
>>>>
>>>> I don't think it should be local.
>>>> Originally what we want to have here is memory to save mces_seen for each online cpu,
>>>> such as a global array like mces_seen[cpus]. But at the same time we don't want to
>>>> preallocate an array big enough for the max possible cpus. So we use per-cpu storage instead.
>>>>
>>> But mces_seen will just be updated by each CPU itself rather than by the monarch CPU.
>>> It is only read by the monarch CPU.
>>
>> Because mce status registers are per-cpu and monarch cannot access subjects' registers
>> directly,
> Right. This is one reason why we need to distribute the clear operation
> to Per-CPU. And in fact it exactly assigns per-cpu property to
> mces_seen.
>
>> all subjects read their own status for monarch, store the status to memory for monarch,
>> and then monarch gathers all the statuses to make a decision for all.
>
> mce_reign, which is only called by the monarch CPU, can be used to panic
> the system as quickly as possible if there is a truly data corrupting error.
> But the monarch CPU doesn't have to help all other CPUs clear mces_seen.
> One advantage of per-CPU data is the isolation of error propagation; that
> being so, why don't we clear mces_seen on each CPU?

What kind of error propagation are you expecting or concerned about here?
Could you explain the problem in more detail?
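
To make sure we are talking about the same flow, here is how I picture the
current arrangement, as a simplified userspace sketch and not the actual
mce.c code. NCPUS, seen[], subject_store() and monarch_reign() are names made
up for this illustration, and a plain integer stands in for the saved status
and its severity.

```c
#include <stddef.h>

#define NCPUS 4                 /* hypothetical CPU count for this sketch */

/* Stand-in for the per-cpu mces_seen slots: one severity value per CPU. */
static int seen[NCPUS];

/* Each CPU (subject or monarch) stores the worst status it read locally. */
static void subject_store(int cpu, int severity)
{
    seen[cpu] = severity;
}

/*
 * The monarch gathers every slot, picks the worst severity to decide
 * for all CPUs, and then clears all slots itself. The clear at the
 * end is the step the patch proposes to move to each CPU.
 */
static int monarch_reign(void)
{
    int worst = 0;

    for (size_t cpu = 0; cpu < NCPUS; cpu++) {
        if (seen[cpu] > worst)
            worst = seen[cpu];
    }

    for (size_t cpu = 0; cpu < NCPUS; cpu++)
        seen[cpu] = 0;

    return worst;
}
```

In this picture each slot has a single writer and the monarch is the only
reader, so it is not obvious to me what moving the clear would isolate.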


Thanks,
H.Seto
