Re: [PATCH] EDAC, ghes: use CPER module handles to locate DIMMs

From: Tyler Baicar
Date: Thu Aug 30 2018 - 12:46:14 EST


On Thu, Aug 30, 2018 at 12:32 PM, James Morse <james.morse@xxxxxxx> wrote:
> Hi Fan,
>
> On 30/08/18 15:40, wufan wrote:
>>>> @@ -327,12 +349,20 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
>>>> p += sprintf(p, "bit_pos:%d ", mem_err->bit_pos);
>>>> if (mem_err->validation_bits & CPER_MEM_VALID_MODULE_HANDLE) {
>>>> const char *bank = NULL, *device = NULL;
>>>> + int index = -1;
>>>> +
>>>> dmi_memdev_name(mem_err->mem_dev_handle, &bank, &device);
>>>
>>>> + p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
>>>> + mem_err->mem_dev_handle);
>>>> if (bank != NULL && device != NULL)
>>>> p += sprintf(p, "DIMM location:%s %s ", bank, device);
>>>> - else
>>>> - p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
>>>> - mem_err->mem_dev_handle);
>>>
>>> Why do we now print the handle every time? The handle is pretty
>>> meaningless, it can only be used to find the location-strings, if we get those
>>> we print them instead.
>>
>> For ghes_edac the bank/device is informational, and nothing would go wrong
>> if the bank/device numbers are the same as another entry. But the handle
>> is now critical for DIMM lookup, thus pull it out.
>
> Is printing the handle to the kernel log critical?
>

I don't see why we would need this print. The bank/device
print is enough to map what is shown in dmesg to an SMBIOS
entry if that's really needed.
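
To illustrate the point, here is a rough userspace sketch of the pre-patch behaviour: the handle is only printed as a fallback when no locator strings are found. Nothing here is kernel code. The memdev_entry struct, the table contents, and the memdev_name() and format_dimm() helpers are all made up for illustration, with memdev_name() standing in for dmi_memdev_name().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an SMBIOS Memory Device entry (assumption). */
struct memdev_entry {
	unsigned short handle;
	const char *bank;
	const char *device;
};

/* Made-up table contents, for illustration only. */
static const struct memdev_entry memdev_table[] = {
	{ 0x0010, "BANK 0", "DIMM 0" },
	{ 0x0011, "BANK 0", "DIMM 1" },
};

/*
 * Illustrative analogue of dmi_memdev_name(): fill in the locator
 * strings for a handle, leaving them untouched if the handle is unknown.
 */
static void memdev_name(unsigned short handle, const char **bank,
			const char **device)
{
	size_t i;

	for (i = 0; i < sizeof(memdev_table) / sizeof(memdev_table[0]); i++) {
		if (memdev_table[i].handle == handle) {
			*bank = memdev_table[i].bank;
			*device = memdev_table[i].device;
			return;
		}
	}
}

/*
 * Mirrors the pre-patch logic: print the location strings when they
 * exist, and fall back to the raw handle only when they do not.
 */
static int format_dimm(char *buf, size_t len, unsigned short handle)
{
	const char *bank = NULL, *device = NULL;

	memdev_name(handle, &bank, &device);
	if (bank != NULL && device != NULL)
		return snprintf(buf, len, "DIMM location:%s %s ", bank, device);

	return snprintf(buf, len, "DIMM DMI handle: 0x%.4x ",
			(unsigned int)handle);
}
```

With the made-up table above, format_dimm() for handle 0x0011 yields the location strings, and an unknown handle such as 0x0042 falls back to the handle print.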

Thanks,
Tyler