Re: [PATCH 2/2] selftests/hmm-tests: Add test for dirty bits

From: Mika Penttilä
Date: Mon Aug 15 2022 - 00:06:39 EST

On 15.8.2022 7.05, Mika Penttilä wrote:


On 15.8.2022 6.21, Alistair Popple wrote:

Mika Penttilä <mpenttil@xxxxxxxxxx> writes:

On 15.8.2022 5.35, Alistair Popple wrote:
Mika Penttilä <mpenttil@xxxxxxxxxx> writes:

Hi Alistair!

On 12.8.2022 8.22, Alistair Popple wrote:
[...]

+    buffer->ptr = mmap(NULL, size,
+               PROT_READ | PROT_WRITE,
+               MAP_PRIVATE | MAP_ANONYMOUS,
+               buffer->fd, 0);
+    ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+    /* Initialize buffer in system memory. */
+    for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+        ptr[i] = 0;
+
+    ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
+
+    /* Fault pages back in from swap as clean pages */
+    for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+        tmp += ptr[i];
+
+    /* Dirty the pte */
+    for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+        ptr[i] = i;
+

The anon pages are quite likely in memory at this point, and dirty in the pte.

Why would the pte be dirty? I just confirmed using some modified pagemap
code that on my system at least this isn't the case.

+    /*
+     * Attempt to migrate memory to device, which should fail because
+     * hopefully some pages are backed by swap storage.
+     */
+    ASSERT_TRUE(hmm_migrate_sys_to_dev(self->fd, buffer, npages));

And the pages are marked dirty now as well. But could you elaborate in more
detail on how and where the above fails? I couldn't immediately see it...

Not if you don't have patch 1 of this series applied. If the
trylock_page() in migrate_vma_collect_pmd() succeeds (which it almost
always does) it will have cleared the pte without setting PageDirty.


Ah yes, but I meant with patch 1 applied. The comment "Attempt to migrate
memory to device, which should fail because hopefully some pages are backed by
swap storage" indicates that hmm_migrate_sys_to_dev() would fail, and the
ASSERT_TRUE there means the call is expected to fail.

So I understand the data loss but where is the hmm_migrate_sys_to_dev() failing,
with or without patch 1 applied?

Oh right. hmm_migrate_sys_to_dev() will fail because the page is in the
swap cache, and migrate_vma_*() doesn't currently support migrating
pages with a mapping.
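
As a rough illustration of the point above, here is a small userspace toy
model in C. It is not kernel code: struct toy_page and the toy_* helpers are
hypothetical names made up for this sketch, and the rule that one skipped page
makes the whole call return nonzero is an assumption chosen to mirror the
ASSERT_TRUE() in the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a page; not the kernel's struct page. */
struct toy_page {
	bool file_backed;   /* page belongs to a file mapping */
	bool in_swap_cache; /* anonymous page that currently sits in the swap cache */
};

/* In this model a page "has a mapping" if it is file backed or in the swap cache. */
static bool toy_page_has_mapping(const struct toy_page *p)
{
	return p->file_backed || p->in_swap_cache;
}

/* Count the pages the toy collect step would accept for migration. */
static size_t toy_collect(const struct toy_page *pages, size_t n)
{
	size_t collected = 0;

	for (size_t i = 0; i < n; i++)
		if (!toy_page_has_mapping(&pages[i]))
			collected++;
	return collected;
}

/* Assumption: return nonzero if any page was skipped, 0 if all were collected. */
static int toy_migrate_sys_to_dev(const struct toy_page *pages, size_t n)
{
	return toy_collect(pages, n) == n ? 0 : -1;
}
```

With one page of three still in the swap cache, toy_collect() accepts two
pages and toy_migrate_sys_to_dev() returns nonzero, which is the outcome the
test comment hopes for.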


Ok, I forgot we also skip page cache pages, not just file pages...

I meant we also skip swap cache pages, not just file pages.

So now we have a dirty page without PageDirty set and without a dirty
pte. If this page gets swapped back to disk and is still in the swap
cache, data will be lost because reclaim will see a clean page and won't
write it out again.
At least that's my understanding - please let me know if you see
something that doesn't make sense.
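
The sequence described above can be sketched as a tiny userspace state
machine in C. This is only a model of the reasoning, not kernel code: struct
toy_state and the toy_* functions are hypothetical, and the transfer_dirty
flag stands in for what patch 1 is assumed to do (carry the pte dirty bit over
to the page when the pte is cleared).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one anonymous page and its copy in swap. */
struct toy_state {
	int  mem;        /* contents of the page in RAM */
	int  swap;       /* contents of the copy on swap */
	bool pte_dirty;  /* dirty bit in the page table entry */
	bool page_dirty; /* models the PageDirty flag */
};

/* Fault the page back in from swap as a clean page. */
static int toy_fault_in(struct toy_state *s)
{
	s->mem = s->swap;
	return s->mem;
}

/* A CPU write changes the data and dirties only the pte. */
static void toy_write(struct toy_state *s, int v)
{
	s->mem = v;
	s->pte_dirty = true;
}

/* Collect step of a migration that later fails: the pte is cleared. */
static void toy_migrate_collect(struct toy_state *s, bool transfer_dirty)
{
	if (transfer_dirty && s->pte_dirty)
		s->page_dirty = true; /* assumed behaviour with patch 1 */
	s->pte_dirty = false;
}

/* Reclaim writes the page out only if it looks dirty, then frees it. */
static void toy_reclaim(struct toy_state *s)
{
	if (s->pte_dirty || s->page_dirty)
		s->swap = s->mem;
	s->pte_dirty = false;
	s->page_dirty = false;
	s->mem = -1; /* page contents in RAM are gone */
}

/* Run the whole sequence and return what is read back after reclaim. */
static int toy_run(bool transfer_dirty)
{
	struct toy_state s = { .mem = -1, .swap = 0 };

	toy_fault_in(&s);
	toy_write(&s, 42);
	toy_migrate_collect(&s, transfer_dirty);
	toy_reclaim(&s);
	return toy_fault_in(&s);
}
```

In this model toy_run(false) reads back the stale 0 from swap, while
toy_run(true) reads back 42, which corresponds to the ASSERT_EQ(ptr[i], i)
check at the end of the test.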

+
+    ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30));
+
+    /* Check we still see the updated data after restoring from swap. */
+    for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+        ASSERT_EQ(ptr[i], i);
+
+    hmm_buffer_free(buffer);
+    destroy_cgroup();
+}
+
    /*
     * Read anonymous memory multiple times.
     */


--Mika