Re: [PATCH] fs/resctrl: fix domid loss precision issue

From: Rex Nie
Date: Mon Mar 11 2024 - 05:38:46 EST


Hello,
Please kindly check my inline reply. Thanks.
Best regards
Rex Nie

> -----Original Message-----
> From: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
> Sent: March 11, 2024 16:24
> To: Rex Nie <rex.nie@xxxxxxxxxxxxxxx>
> Cc: james.morse@xxxxxxx; fenghua.yu@xxxxxxxxx;
> reinette.chatre@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Liming Wu
> <liming.wu@xxxxxxxxxx>
> Subject: Re: [PATCH] fs/resctrl: fix domid loss precision issue
>
> Hello,
>
> On 2024-03-11 at 14:48:22 +0800, Rex Nie wrote:
> >The statement below, from the mkdir_mondata_subdir() function, loses
> >precision because it assigns an int to a 14-bit bitfield:
> > priv.u.domid = d->id;
> >
> >This causes the issue below if cache_id > 0x3fff, for example:
>
> Is there some reason for cache_id ever being this high?
>
> I thought the max for cache_id was the amount of L3 caches on a system. And I
> only observed it going up to 3 on some server platforms. So not nearly in the
> range of 0x3fff or 16k.
>
That is exactly right on x86 platforms, but cache_id on Arm platforms is different.
According to the ACPI specification for MPAM, the cache ID is used as the locator for the cache MSC. See the RD_PPTT_CACHE_ID definition from edk2-platforms:
#define RD_PPTT_CACHE_ID(PackageId, ClusterId, CoreId, CacheType) \
( \
(((PackageId) & 0xF) << 20) | (((ClusterId) & 0xFF) << 12) | \
(((CoreId) & 0xFF) << 4) | ((CacheType) & 0xF) \
)
So it can exceed 0x3fff on Arm platforms.

Source of RD_PPTT_CACHE_ID in edk2-platforms: https://github.com/tianocore/edk2-platforms/blob/master/Platform/ARM/SgiPkg/Include/SgiAcpiHeader.h#L202

> >/sys/fs/resctrl/mon_groups/p1/mon_data/mon_L3_1048564 # cat llc_occupancy
>
> How did you get this file to appear? Could you maybe show what your
> mon_data directory looks like?
>
I found this issue on the Arm FVP N1 platform and on my N2 platform.

Below are the steps on Arm FVP N1:
mount -t resctrl resctrl /sys/fs/resctrl
cd /sys/fs/resctrl/mon_data

/sys/fs/resctrl/mon_data # ls -l
total 0
dr-xr-xr-x 2 0 0 0 Mar 11 09:24 mon_L3_1048564

/sys/fs/resctrl/mon_data # cd mon_L3_1048564
/sys/fs/resctrl/mon_data/mon_L3_1048564 # cat llc_occupancy
cat: read error: No such file or directory

Arm FVP MPAM: https://neoverse-reference-design.docs.arm.com/en/latest/mpam/mpam-resctrl.html#memory-system-resource-partitioning-and-monitoring-mpam

> >cat: read error: No such file or directory
> >
> >This is the call trace when cat llc_occupancy:
> >rdtgroup_mondata_show()
> > domid = md.u.domid
> > d = resctrl_arch_find_domain(r, domid)
> >
> >d is NULL here because of the lost precision
> >
> >Signed-off-by: Rex Nie <rex.nie@xxxxxxxxxxxxxxx>
> >Signed-off-by: Liming Wu <liming.wu@xxxxxxxxxx>
> >---
> > fs/resctrl/internal.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> >index 7a6f46b4edd0..096317610949 100644
> >--- a/fs/resctrl/internal.h
> >+++ b/fs/resctrl/internal.h
> >@@ -94,7 +94,7 @@ union mon_data_bits {
> > struct {
> > unsigned int rid : 10;
> > enum resctrl_event_id evtid : 8;
> >- unsigned int domid : 14;
> >+ u32 domid;
> > } u;
> > };
> >
> >--
> >2.34.1
> >
>
> --
> Kind regards
> Maciej Wieczór-Retman