Re: [RFC PATCH] x86/fpu/xstate: Add more diagnostic information on inconsistent xstate sizes

From: Chang S. Bae
Date: Tue Apr 11 2023 - 12:29:56 EST


On 4/10/2023 1:43 PM, Fenghua Yu wrote:
On 4/7/23 11:22, Chang S. Bae wrote:
On 4/5/2023 11:39 AM, Fenghua Yu wrote:

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..5f27fcdc6c90 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -602,8 +602,37 @@ static bool __init paranoid_xstate_size_valid(unsigned int kernel_size)
          }
      }
      size = xstate_calculate_size(fpu_kernel_cfg.max_features, compacted);
-    XSTATE_WARN_ON(size != kernel_size,
-               "size %u != kernel_size %u\n", size, kernel_size);
+    if (size != kernel_size) {
+        u64 xcr0, ia32_xss;
+
+        XSTATE_WARN_ON(1, "size %u != kernel_size %u\n",
+                   size, kernel_size);
+
+        /* Show more information to help diagnose the size issue. */
+        pr_info("x86/fpu: max_features=0x%llx\n",
+            fpu_kernel_cfg.max_features);
+        print_xstate_offset_size();
+        pr_info("x86/fpu: total size: %u bytes\n", size);
+        xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+        if (compacted) {
+            rdmsrl(MSR_IA32_XSS, ia32_xss);

This shouldn't be directly read here because of the LBR state component.

See the function comment:

  * Independent XSAVE features allocate their own buffers and are not
  * covered by these checks. Only the size of the buffer for task->fpu
  * is checked here.

But, isn't that max_features bitmask pretty much about it?

How about getting IA32_XSS from xfeatures_mask_supervisor()? That's how kernel_size is obtained in get_xsave_compacted_size(): by setting IA32_XSS without the independent features.

I think what is tested here is whether the kernel code and the microcode calculate the same size from the same input, which is the max_features bitmask.

We know that the kernel code calculates the size from that bitmask and also programs XCR0 and IA32_XSS from it. So showing the bitmask looked sufficient to me, no?
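
To make the "same input" point concrete, here is a rough userspace sketch, not kernel code. It models the compacted-format size calculation for a given feature bitmask. The names compacted_size(), struct xcomp and sample_comps are made up for illustration, and the component sizes in the table are placeholder assumptions rather than values read from CPUID leaf 0xD. Bit 15 stands in for the independent LBR component.

```c
#include <stdbool.h>
#include <stdint.h>

#define FXSAVE_SIZE     512u  /* legacy x87/SSE area */
#define XSAVE_HDR_SIZE   64u  /* XSAVE header */
#define NR_COMPS         16   /* hypothetical table length for this sketch */

/* Hypothetical stand-in for the per-component data from CPUID leaf 0xD. */
struct xcomp {
	uint32_t size;     /* component size in bytes */
	bool     align64;  /* component must start on a 64-byte boundary */
};

/* Placeholder sizes, assumed for illustration only. */
static const struct xcomp sample_comps[NR_COMPS] = {
	[2]  = { 256, false },  /* AVX-like user component */
	[10] = {   8, false },  /* small supervisor component */
	[15] = { 808, false },  /* LBR-like independent component */
};

/*
 * Compacted format: the legacy area and header are always present, then
 * each enabled extended component is packed in bit order, honoring the
 * optional 64-byte alignment.
 */
static uint32_t compacted_size(uint64_t mask, const struct xcomp *comps,
			       int ncomps)
{
	uint32_t offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;

	for (int i = 2; i < ncomps; i++) {
		if (!(mask & (1ULL << i)))
			continue;
		if (comps[i].align64)
			offset = (offset + 63u) & ~63u;
		offset += comps[i].size;
	}
	return offset;
}
```

With max_features = 0x407 (bits 0, 1, 2 and 10) the sketch gives 840 bytes. Feeding it the same mask with bit 15 set gives a larger size, which is why a raw IA32_XSS value that includes an independent feature does not describe the buffer being checked.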

I still expect some acknowledgment of what is coded here for the kernel calculation details.

Thanks,
Chang