Re: [PATCH] futex: Check for uaddr alignment as early as possible

From: Ingo Molnar
Date: Tue Dec 12 2017 - 05:23:22 EST



* Darren Hart <dvhart@xxxxxxxxxxxxx> wrote:

> From: "Darren Hart (VMware)" <dvhart@xxxxxxxxxxxxx>
>
> uaddr alignment is currently tested by get_futex_key(). We can catch
> misalignment earlier in sys_futex and return -EINVAL sooner. This
> simplifies get_futex_key() a little, but more importantly exits the
> kernel as soon as an invalid parameter is detected.
>
> Passes all selftests/futex testcases on a dual socket Xeon E5-2670, 16
> physical cores total, 32 threads total.
>
> Signed-off-by: Darren Hart (VMware) <dvhart@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Darren Hart <dvhart@xxxxxxxxxxxxx>
> ---
> kernel/futex.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 76ed592..c3ee6c4 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -509,8 +509,6 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
>  	 * The futex address must be "naturally" aligned.
>  	 */
>  	key->both.offset = address % PAGE_SIZE;
> -	if (unlikely((address % sizeof(u32)) != 0))
> -		return -EINVAL;
>  	address -= key->both.offset;
>
>  	if (unlikely(!access_ok(rw, uaddr, sizeof(u32))))
> @@ -3525,6 +3523,11 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
>  	u32 val2 = 0;
>  	int cmd = op & FUTEX_CMD_MASK;
>
> +	/* Only allow for aligned uaddr variables */
> +	if (unlikely((unsigned long)uaddr % sizeof(u32) != 0 ||
> +		     (unsigned long)uaddr2 % sizeof(u32) != 0))
> +		return -EINVAL;
> +
>  	if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI ||
>  		      cmd == FUTEX_WAIT_BITSET ||
>  		      cmd == FUTEX_WAIT_REQUEUE_PI)) {
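
For reference, the check added in the second hunk can be sketched in user
space roughly as follows. This is only an illustration, not kernel code: the
helper name futex_uaddrs_aligned() is hypothetical, and uint32_t stands in for
the kernel's u32. As in the hunk, both addresses are tested before cmd is
looked at, so uaddr2 is checked even for commands that may not use it. Whether
that has anything to do with the regression reported below is not established
here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the test the patch adds to sys_futex.
 * Returns 1 if both addresses are 4-byte aligned, 0 otherwise (the case in
 * which the patched syscall would return -EINVAL).
 */
static int futex_uaddrs_aligned(const void *uaddr, const void *uaddr2)
{
	/* Mirrors the patch: both pointers are tested, whatever the command. */
	if ((uintptr_t)uaddr % sizeof(uint32_t) != 0 ||
	    (uintptr_t)uaddr2 % sizeof(uint32_t) != 0)
		return 0;

	return 1;
}
```

With a naturally aligned uaddr, the helper still returns 0 whenever the
second argument happens to hold a misaligned value.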

Yeah, so I applied this yesterday, then -tip started regressing sporadically
during distro networking bring-up, and it took me half a day of debugging to
statistically bisect it back to this patch :-/

So it's apparently broken, but I don't yet see how.

Thanks,

Ingo