Re: [PATCH v2 1/1] fs/splice: add missing callback for inaccessible pages

From: Dave Hansen
Date: Thu Apr 30 2020 - 18:07:26 EST


I was also wondering if Claudio was right about the debug patch having
races. I went to go look how the s390 code avoids races when pages go
from accessible->inaccessible.

Because, if if all of the traps are in place to transform pages from
inaccessible->accessible, the code *after* those traps is still
vulnerable. What *keeps* pages accessible?

The race avoidance is this, basically:

down_read(&gmap->mm->mmap_sem);
lock_page(page);
ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
...
> expected = expected_page_refs(page);
> if (!page_ref_freeze(page, expected))
> return -EBUSY;
> set_bit(PG_arch_1, &page->flags);
> rc = uv_call(0, (u64)uvcb);
> page_ref_unfreeze(page, expected);

... up_read(mmap_sem) / unlock_page() / unlock pte

I'm assuming that after the uv_call(), the page is inaccessible and I/O
devices will go boom if they touch the page.

The page_ref_freeze() ensures that references come between the
freeze/unfreeze are noticed, but it doesn't actually *stop* new ones for
users that hold references already. For the page cache, especially,
someone could do:

page = find_get_page();
arch_make_page_accessible();
lock_page();
... make_secure_pte();
unlock_page();
get_page();
// ^ OK because I have a ref
// do DMA on inaccessible page

Because the make_secure_pte() code isn't looking for a *specific*
'expected' value, it has no way of noticing that the extra ref snuck in
there.

I _think_ expected actually needs to be checked for having a specific
(low) value so that if there's a *possibility* of a reference holder
acquiring additional references, the page is known to be off-limits.
mm/migrate.c has a few examples of this, but I'm not quite sure how
bulletproof they are. Some of it appears to just be optimizations.