Re: [GIT PULL] x86/shstk for 6.4

From: Dave Hansen
Date: Fri May 12 2023 - 13:34:53 EST


On 5/8/23 16:47, Linus Torvalds wrote:
> - we would probably be *much* better off with a "if (mm->count == 1)"
> test that goes off and does *not* do the atomic case (which also
> doesn't have any worries about dirty bits). I'll take a well-predicted
> conditional branch over an atomic op any day of the week

Were you really thinking of mm->count==1 (the actual struct field is
mm->mm_count), or did you mean mm->mm_users==1? I _think_ the only clue
that things like ptrace() and kthread_use_mm() are poking around in the
page tables is mm->mm_users>1. They don't bump mm->mm_count.

Most mmget() (or get_task_mm()->atomic_inc(&mm->mm_users)) callers are
obviously harmless for our purposes, like proc_pid_statm().

There are others, like idxd_copy_cr()->copy_to_user(), that are less
obviously OK. They're OK if they fault during a fork() because the
fault will end up stuck in mmap_read_lock(mm), waiting for fork() to
release its write lock.

But the worry is if those kthread_use_mm() users *don't* fault:

CPU-1                           CPU-2
fork()
// mm->mm_users==1
ptep_set_wrprotect()
^ non-atomic
                                kthread_use_mm()
                                // mm->mm_users==2
                                copy_to_user()
                                // page walker sets Dirty=1

There's always a race there because mm->mm_users can always get bumped
after the fork()er checks it.

Is there something that fixes this race that I'm missing?

We can probably do something more comprehensive to make sure that
mm->mm_users isn't bumped during fork(), but it'll be a bit more
complicated than just checking mm->mm_users in fork().