Re: Coverity: kfd_parse_subtype_cache(): Memory - corruptions

From: Felix Kuehling
Date: Fri Nov 04 2022 - 16:41:19 EST


On 2022-11-04 15:41, coverity-bot wrote:
Hello!

This is an experimental semi-automated report about issues detected by
Coverity from a scan of next-20221104 as part of the linux-next scan project:
https://scan.coverity.com/projects/linux-next-weekly-scan

You're getting this email because you were associated with the identified
lines of code (noted below) that were touched by commits:

Fri Dec 8 23:08:59 2017 -0500
3a87177eb141 ("drm/amdkfd: Add topology support for dGPUs")

Coverity reported the following:

*** CID 1527133: Memory - corruptions (OVERRUN)
drivers/gpu/drm/amd/amdkfd/kfd_crat.c:1113 in kfd_parse_subtype_cache()
1107 props->cache_size = cache->cache_size;
1108 props->cacheline_size = cache->cache_line_size;
1109 props->cachelines_per_tag = cache->lines_per_tag;
1110 props->cache_assoc = cache->associativity;
1111 props->cache_latency = cache->cache_latency;
1112
vvv CID 1527133: Memory - corruptions (OVERRUN)
vvv Overrunning array "cache->sibling_map" of 32 bytes by passing it to a function which accesses it at byte offset 63 using argument "64UL". [Note: The source code implementation of the function has been overridden by a builtin model.]
1113 memcpy(props->sibling_map, cache->sibling_map,
1114 sizeof(props->sibling_map));
1115
1116 /* set the sibling_map_size as 32 for CRAT from ACPI */
1117 props->sibling_map_size = CRAT_SIBLINGMAP_SIZE;
1118

If this is a false positive, please let us know so we can mark it as
such, or teach the Coverity rules to be smarter. If not, please make
sure fixes get into linux-next. :) For patches fixing this, please
include these lines (but double-check the "Fixes" first):

Reported-by: coverity-bot <keescook+coverity-bot@xxxxxxxxxxxx>
Addresses-Coverity-ID: 1527133 ("Memory - corruptions")
Fixes: 3a87177eb141 ("drm/amdkfd: Add topology support for dGPUs")

I'm not sure why this suddenly appeared after 5 years, but the read
over-run looks legit:


I think this was introduced by a more recent patch that was in fact meant to fix an array overrun on HW that is outgrowing the CRAT sibling map size:

commit 0938fbeb6f53fc44bc9b19784dee28496e68ba0c
Author: Ma Jun <Jun.Ma2@xxxxxxx>
Date:   Wed Nov 2 15:53:26 2022 +0800

    drm/amdkfd: Fix the warning of array-index-out-of-bounds

    For some GPUs with more CUs, the original sibling_map[32]
    in struct crat_subtype_cache is not enough
    to save the cache information when create the VCRAT table,
    so skip filling the struct crat_subtype_cache info instead
    fill struct kfd_cache_properties directly to fix this problem.

    Signed-off-by: Ma Jun <Jun.Ma2@xxxxxxx>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

I added Ma Jun to the email.

Regards,
  Felix



struct crat_subtype_cache {
	...
	uint8_t sibling_map[CRAT_SIBLINGMAP_SIZE];

#define CRAT_SIBLINGMAP_SIZE 32


struct kfd_cache_properties {
	...
	uint8_t sibling_map[CACHE_SIBLINGMAP_SIZE];

#define CACHE_SIBLINGMAP_SIZE 64
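
To make the mismatch concrete: the memcpy() length is taken from the
64-byte destination, while the source array holds only 32 bytes, so the
copy reads 32 bytes past the end of cache->sibling_map. Below is a minimal
sketch of one way to bound the copy by the source size instead. The structs
are reduced stand-ins that keep only the fields discussed here, and
copy_sibling_map() is a hypothetical helper for illustration, not an
existing kernel function and not necessarily the fix that gets merged:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CRAT_SIBLINGMAP_SIZE  32
#define CACHE_SIBLINGMAP_SIZE 64

/* Reduced stand-ins for the kernel structs (assumption: all other fields omitted). */
struct crat_subtype_cache {
	uint8_t sibling_map[CRAT_SIBLINGMAP_SIZE];
};

struct kfd_cache_properties {
	uint8_t sibling_map[CACHE_SIBLINGMAP_SIZE];
	uint32_t sibling_map_size;
};

/* The bounded copy below is only safe if the destination is at least as large. */
static_assert(CACHE_SIBLINGMAP_SIZE >= CRAT_SIBLINGMAP_SIZE,
	      "destination must hold the whole CRAT sibling map");

/* Hypothetical helper: copy only as many bytes as the source actually has. */
static void copy_sibling_map(struct kfd_cache_properties *props,
			     const struct crat_subtype_cache *cache)
{
	/* Clear the full destination so the unused upper half is well defined. */
	memset(props->sibling_map, 0, sizeof(props->sibling_map));

	/* Length comes from the source (32 bytes), not the destination (64). */
	memcpy(props->sibling_map, cache->sibling_map,
	       sizeof(cache->sibling_map));

	/* Record how many bytes of the map are valid for a CRAT from ACPI. */
	props->sibling_map_size = CRAT_SIBLINGMAP_SIZE;
}
```

With this shape, bytes 0..31 of props->sibling_map mirror the CRAT entry
and bytes 32..63 stay zero, which matches sibling_map_size being set to
CRAT_SIBLINGMAP_SIZE.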

Thanks for your attention!