RE: Domain faults when CONFIG_CPU_SW_DOMAIN_PAN is enabled

From: Peter Rosin
Date: Thu Dec 03 2015 - 16:39:33 EST


Russell King wrote:
> On Thu, Dec 03, 2015 at 04:41:18PM +0000, Russell King - ARM Linux wrote:
> > On Thu, Dec 03, 2015 at 04:12:06PM +0000, Peter Rosin wrote:
> > > * uaccess_with_memcpy.c:__copy_to_user() has a mode in which it copies
> > > "non-atomically" (if faulthandler_disabled() returns 0). If a fault
> > > happens during __copy_to_user, what prevents some other thread from
> > > clobbering DACR?
> >
> > See the second point above. Moreover, if we sleep in down_read(),
> > then __switch_to() reads the current DACR value and saves it in the
> > thread information, and will restore that value when resuming the
> > thread - even if the thread has been migrated to a different CPU.
>
> I thought this was correct, but it isn't - that's what my original solution
> did, but I think when Will reviewed it, we decided it wasn't necessary -
> and it isn't necessary for every single case with the exception of this
> one. This is exactly what's going wrong: the down_read() in these paths
> calls into the scheduler, which switches away. When we come back, the
> DACR value is reset by the other thread to 0x51.
>
> There's a few ways to solve this:
>
> 1. Make the thread switching code save and restore the DACR register as
> it would do for domains. This imposes an overhead on every single
> context switch whether or not we happen to be in this _single_
> troublesome code. (Patch attached - as there's several, I'm attaching
> them.)
>
> 2. Add additional code to the uaccess-with-memcpy stuff to reset the
> DACR value prior to using memcpy() or memset(). (Patch attached.)

I took both patches for a quick spin (a dozen boots and one hour uptime
after that for each patch) and no incidents. I have not gathered data,
but the crash on boot feels like it's quite a bit above 50% when there
is a problem so this feels good (I used 5 clean reboots when I bisected
and that worked).

Reported-by: Peter Rosin <peda@xxxxxxxxxx>
Tested-by: Peter Rosin <peda@xxxxxxxxxx>

(and please don't forget to cc stable)

> 3. Make uaccess-with-memcpy depend on !CPU_SW_DOMAINS_PAN (suggested by
> Will)
>
> 4. Delete the uaccess-with-memcpy code (also suggested by Will.)
>
> I think the best thing I can do is say... "Discuss amongst yourselves" :)

I have no personal preference on what should be done, I only had
copy_to_user_memcpy enabled since that was what Atmel fed me.

Cheers,
Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/