Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled

From: Kalra, Ashish
Date: Fri Oct 14 2022 - 16:00:30 EST

Next message: Sean Christopherson: "Re: [PATCH v5 0/7] KVM: x86: Hyper-V invariant TSC control feature"
Previous message: Carsten Langer: "Re: Possible regression: unable to mount CIFS 1.0 shares from older machines since 76a3c92ec9e0668e4cd0e9ff1782eb68f61a179c"
In reply to: Borislav Petkov: "Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello Boris,

On 10/13/2022 10:15 AM, Borislav Petkov wrote:

On Mon, Jun 20, 2022 at 11:05:01PM +0000, Ashish Kalra wrote:

+static void snp_leak_pages(unsigned long pfn, unsigned int npages)

That function name looks wrong.

+{
+ WARN(1, "psc failed, pfn 0x%lx pages %d (leaking)\n", pfn, npages);
+ while (npages--) {
+ memory_failure(pfn, 0);

^^^^^^^^^^^^^^^^^^^^^^

Why?

The page was in FW state and we couldn't transition it back to HV/Shared
state, any access to this page can cause RMP #PF.

* This function is called by the low level machine check code
* of an architecture when it detects hardware memory corruption
* of a page. It tries its best to recover, which includes
* dropping pages, killing processes etc.

I don't think you wanna do that.

It looks like you want to prevent the page from being used again but not
mark it as PG_hwpoison and whatnot. PG_reserved perhaps?

* PG_reserved is set for special pages. The "struct page" of such a
* page should in general not be touched (e.g. set dirty) except by its
* owner.

If it is "still" accessed/touched then it can cause RMP #PF.
On the other hand,

* PG_hwpoison... Accessing is
* not safe since it may cause another machine check. Don't touch!

That sounds exactly the state we want these page(s) to be in ?

Another possibility is PG_error.

+ dump_rmpentry(pfn);
+ pfn++;
+ }
+}
+
+static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
+{
+ struct sev_data_snp_page_reclaim data;
+ int ret, err, i, n = 0;
+
+ for (i = 0; i < npages; i++) {
+ memset(&data, 0, sizeof(data));
+ data.paddr = pfn << PAGE_SHIFT;

Oh wow, this is just silly. A struct for a single u64. Just use a

u64 paddr;

Ok.

directly. But we had this topic already...

+
+ if (locked)

Ew, that's never a good design - conditional locking.

There is a previous suggestion to change `sev_cmd_mutex` to some sort of nesting lock type to clean up this if (locked) code, though AFAIK, there is no support for nesting lock types.

+ ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+ else
+ ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);

<---- newline here.

+ if (ret)
+ goto cleanup;
+
+ ret = rmp_make_shared(pfn, PG_LEVEL_4K);
+ if (ret)
+ goto cleanup;
+
+ pfn++;
+ n++;
+ }
+
+ return 0;
+
+cleanup:
+ /*
+ * If failed to reclaim the page then page is no longer safe to
+ * be released, leak it.
+ */
+ snp_leak_pages(pfn, npages - n);

So this looks real weird: we go and reclaim pages, we hit an error
during reclaiming a page X somewhere in-between and then we go and mark
the *remaining* pages as not to be used?!

Why?

Why not only that *one* page which failed and then we continue with the
rest?!

I agree and will change to this approach.

+ return ret;
+}
+
+static inline int rmp_make_firmware(unsigned long pfn, int level)
+{
+ return rmp_make_private(pfn, 0, level, 0, true);
+}

That's a silly wrapper used only once. Just do at the callsite:

/* Mark this page as belonging to firmware */
rc = rmp_make_private(pfn, 0, level, 0, true);

Ok.

+
+static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
+ bool need_reclaim)

Tangential to the above, this is just nuts with those bool arguments.
Just look at the callsites: do you understand what they do?

snp_set_rmp_state(paddr, npages, true, locked, false);

what does that do? You need to go up to the definition of the function,
count the arguments and see what that "true" arg stands for.

I totally agree, this is simply unreadable.

And this has been mentioned previously too ...
This function can do a lot and when I read the call sites its hard to
see what its doing since we have a combination of arguments which tell
us what behavior is happening ...

What you should do instead is, have separate helpers which do only one
thing:

rmp_mark_pages_firmware();
rmp_mark_pages_shared();
rmp_mark_pages_...

and then have the *callers* issue snp_reclaim_pages() when needed. So you'd have

rmp_mark_pages_firmware();
rmp_mark_pages_shared()

and __snp_free_firmware_pages() would do

rmp_mark_pages_shared();
snp_reclaim_pages();

Actually, this only needs to call snp_reclaim_pages().

and so on.

And then if you need locking, the callers can decide which sev_do_cmd
variant to issue.

And then if you have common code fragments which you can unify into a
bigger helper function, *then* you can do that.

Instead of multiplexing it this way. Which makes it really hard to
follow what the code does.

Sure i will do this cleanup.

+ unsigned long pfn = __sme_clr(paddr) >> PAGE_SHIFT; /* Cbit maybe set in the paddr */

No side comments pls.

+ int rc, n = 0, i;
+
+ for (i = 0; i < npages; i++) {
+ if (to_fw)
+ rc = rmp_make_firmware(pfn, PG_LEVEL_4K);
+ else
+ rc = need_reclaim ? snp_reclaim_pages(pfn, 1, locked) :
+ rmp_make_shared(pfn, PG_LEVEL_4K);
+ if (rc)
+ goto cleanup;
+
+ pfn++;
+ n++;
+ }
+
+ return 0;
+
+cleanup:
+ /* Try unrolling the firmware state changes */
+ if (to_fw) {
+ /*
+ * Reclaim the pages which were already changed to the
+ * firmware state.
+ */
+ snp_reclaim_pages(paddr >> PAGE_SHIFT, n, locked);
+
+ return rc;
+ }
+
+ /*
+ * If failed to change the page state to shared, then its not safe
+ * to release the page back to the system, leak it.
+ */
+ snp_leak_pages(pfn, npages - n);
+
+ return rc;
+}

...

+void snp_free_firmware_page(void *addr)
+{
+ if (!addr)
+ return;
+
+ __snp_free_firmware_pages(virt_to_page(addr), 0, false);
+}
+EXPORT_SYMBOL(snp_free_firmware_page);

EXPORT_SYMBOL_GPL() ofc.

Thanks,
Ashish

Next message: Sean Christopherson: "Re: [PATCH v5 0/7] KVM: x86: Hyper-V invariant TSC control feature"
Previous message: Carsten Langer: "Re: Possible regression: unable to mount CIFS 1.0 shares from older machines since 76a3c92ec9e0668e4cd0e9ff1782eb68f61a179c"
In reply to: Borislav Petkov: "Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]