Re: [PATCH v8 9/9] sparc64: Add support for ADI (Application Data Integrity)

From: Khalid Aziz
Date: Fri Oct 13 2017 - 12:20:20 EST


On 10/13/2017 08:14 AM, Khalid Aziz wrote:
On 10/12/2017 02:27 PM, Anthony Yznaga wrote:

On Oct 12, 2017, at 7:44 AM, Khalid Aziz <khalid.aziz@xxxxxxxxxx> wrote:


On 10/06/2017 04:12 PM, Anthony Yznaga wrote:
On Sep 25, 2017, at 9:49 AM, Khalid Aziz <khalid.aziz@xxxxxxxxxx> wrote:

This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. ADI is not enabled by [...]

I still don't believe migration is properly supported. Your
implementation is relying on a fault happening on a page while its
migration is in progress so that do_swap_page() will be called, but
I don't see how do_swap_page() will be called if a fault does not
happen until after the migration has completed.

User pages are on the LRU list, and for mapped pages on the LRU list, migrate_pages() ultimately calls try_to_unmap_one(), which makes a migration swap entry for the page being migrated. This forces a page fault upon access on the destination node, and the page is swapped back in from swap cache. The fault is forced by the migration swap entry; it is not an accidental event. If a page fault happens on the destination node while migration is in progress, do_swap_page() waits until migration is done. Please take a look at the code in __unmap_and_move().

I looked at the code again, and I now believe ADI tags are never restored for migrated pages. Here's why:


I will take a look at it again. I have run extensive tests migrating pages of a process across multiple NUMA nodes over and over again and ADI tags were never lost, so this does work. I won't rule out the possibility of having missed a code path where tags are not restored and I will look for it.

Anthony,

I just ran my migration test again which:

- malloc's 16 GB of memory
- Assigns a rotating ADI tag to every 64 bytes of the malloc'd buffer
- Writes a pattern to the entire buffer
- Verifies the pattern it wrote using ADI-tagged addresses
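
For anyone who wants to reason about those four steps without ADI hardware, here is a rough, portable model of the test logic. It is only a sketch and not the actual test program: the 64-byte block size comes from the description above, but the 4-bit tag in address bits 63:60, the rotation through tags 1..14, the pattern function and all helper names are assumptions made for illustration (the real ADI parameters are the ones the kernel returns to userspace). A shadow array stands in for the hardware tag storage, so nothing here traps on a mismatch; verify() simply counts bad blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters, for illustration only. */
#define ADI_BLKSZ     64u   /* bytes covered by one version tag */
#define ADI_TAG_SHIFT 60    /* tag assumed to live in address bits 63:60 */
#define ADI_NTAGS     14u   /* rotate through tags 1..14 */

/* Rotating tag for the block that contains byte offset 'off'. */
static unsigned tag_for_offset(size_t off)
{
    return (unsigned)((off / ADI_BLKSZ) % ADI_NTAGS) + 1u;
}

/* Place a tag in the upper address bits, replacing any tag already there. */
static uint64_t tagged_addr(uint64_t addr, unsigned tag)
{
    const uint64_t tag_mask = UINT64_C(0xf) << ADI_TAG_SHIFT;
    return (addr & ~tag_mask) | ((uint64_t)(tag & 0xfu) << ADI_TAG_SHIFT);
}

/* Extract the tag carried by a tagged address. */
static unsigned tag_of(uint64_t taddr)
{
    return (unsigned)(taddr >> ADI_TAG_SHIFT) & 0xfu;
}

/* Hypothetical data pattern: any deterministic function of the offset. */
static uint8_t pattern_byte(size_t off)
{
    return (uint8_t)((off * 31u + 7u) & 0xffu);
}

/* Steps 2 and 3: record a tag per block in the shadow array, write pattern. */
static void tag_and_fill(uint8_t *buf, uint8_t *tags, size_t len)
{
    assert(len % ADI_BLKSZ == 0);
    for (size_t off = 0; off < len; off++) {
        if (off % ADI_BLKSZ == 0)
            tags[off / ADI_BLKSZ] = (uint8_t)tag_for_offset(off);
        buf[off] = pattern_byte(off);
    }
}

/* Step 4: count blocks whose stored tag or data no longer matches. */
static size_t verify(const uint8_t *buf, const uint8_t *tags, size_t len)
{
    size_t bad = 0;
    for (size_t off = 0; off < len; off += ADI_BLKSZ) {
        uint64_t taddr = tagged_addr((uint64_t)(uintptr_t)(buf + off),
                                     tag_for_offset(off));
        int ok = tag_of(taddr) == tags[off / ADI_BLKSZ];
        for (size_t i = 0; ok && i < ADI_BLKSZ; i++)
            ok = buf[off + i] == pattern_byte(off + i);
        bad += !ok;
    }
    return bad;
}
```

In this model, a migration that copied the data but dropped the tags would show up as a non-zero return from verify(), since the shadow tags would no longer match the expected rotation. On real hardware the mismatch would be reported by the access through the tagged address instead.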

While this test was running, I had a script migrate the test program's pages across two NUMA nodes every 30 seconds using the migratepages command. I did not see an ADI tag mismatch over multiple runs of this test. This test shows migration is working.

Can you give me a test that shows the failure you think we should see? I will debug it.

Thanks,
Khalid