Re: [PATCH 1/2] x86/MCE: Extend size of the MCE Records pool

From: Naik, Avadhut
Date: Fri Feb 09 2024 - 14:47:43 EST


Hi,

On 2/8/2024 12:39, Luck, Tony wrote:
>> Will change it to (2 * sizeof(struct mce)) though. Feels more
>> accurate. Thanks for the suggestion!
>
> Thanks.
>
>> Do you have any additional concerns/comments on this patchset?
>
> Overall this is an excellent addition. Reserved space to log errors does need to scale
> up with the CPU count.
>
> I think part 1 (unconditional increase based on CPU count) is a "must have" enhancement.
> With the change to CPU_GEN_MEMSZ #define:
>
> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
>
>
> I'm less enthusiastic about part 2 adding a command line option to override the code in
> part 1 with a bigger (or smaller?) amount. Can you describe some situation where a user
> would need to use this?
>
I added the command-line option to ensure that we have covered all bases and are not enforcing
this memory footprint on all users.

A system with 512 logical CPUs will, by the currently proposed logic, have 32 pages allocated
for the pool ((512 * 256) / 4096). Some users may feel that this is more than their systems need
and that they can make do with, say, 16 pages. The command-line option gives them the flexibility
to do so without having to change kernel code, rebuild and redeploy.

Conversely, some users who want to err on the side of caution might feel that the above 32 pages
are not enough for the pool and may want to allocate more, say, 48 pages. The command-line
option, again, gives them the flexibility to do so.
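
To make the arithmetic concrete, here is a rough sketch of the sizing described above. It is
not the code from the patch: the function names, the macro names, the round-up to a whole page
and the "0 means no override" convention are all assumptions made for illustration. The 256
bytes per CPU and the 4096-byte page size are the figures used in the 512-CPU example.

```c
/* Illustrative values only; names are hypothetical, not from the patch. */
#define SKETCH_PAGE_SIZE      4096u  /* page size used in the example above */
#define SKETCH_BYTES_PER_CPU  256u   /* per-CPU reservation used in the example above */

/* Default pool size in pages, scaled by logical CPU count.
 * Rounding up to a whole page is an assumption of this sketch. */
static unsigned int sketch_default_pool_pages(unsigned int nr_cpus)
{
	unsigned long bytes = (unsigned long)nr_cpus * SKETCH_BYTES_PER_CPU;

	return (unsigned int)((bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}

/* Hypothetical override: a non-zero value from the command line wins,
 * 0 means no value was given and the CPU-based default is used. */
static unsigned int sketch_pool_pages(unsigned int nr_cpus,
				      unsigned int override_pages)
{
	return override_pages ? override_pages
			      : sketch_default_pool_pages(nr_cpus);
}
```

With these assumptions, 512 CPUs and no override give 32 pages, while an override of 16 or 48
simply replaces that default.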

Sounds reasonable?

--
Thanks,
Avadhut Naik

> -Tony