Re: [PATCH] x86/mm: Simplify redundant overlap calculation

From: Dave Hansen
Date: Tue Jan 23 2024 - 12:15:49 EST


On 1/23/24 08:54, David Binderman wrote:
>> Remove the second condition.  It is exactly the same as the first.
> I don't think the first condition is sufficient. I suspect something like
>
>        return (r2_start <= r1_start && r1_start <= r2_end) ||
>                (r2_start <= r1_end && r1_end <= r2_end);
>
> Given the range [r2_start .. r2_end], then if r1_start or r1_end
> are in that range, you have overlap.
>
> Unless you know different.

First of all, I've gotten these bounds checks wrong in code more times
than I can count. I have zero trust that I'll get them right. :)

But the compiler seems to know different at least:

int overlaps1(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start) ||
	       (r2_start <= r1_end && r2_end >= r1_start);
}

int overlaps2(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start);
}

Results in:

0000000000001180 <overlaps1>:
    1180:	f3 0f 1e fa          	endbr64
    1184:	48 39 cf             	cmp    %rcx,%rdi
    1187:	49 89 d0             	mov    %rdx,%r8
    118a:	0f 96 c2             	setbe  %dl
    118d:	31 c0                	xor    %eax,%eax
    118f:	4c 39 c6             	cmp    %r8,%rsi
    1192:	0f 93 c0             	setae  %al
    1195:	21 d0                	and    %edx,%eax
    1197:	c3                   	ret
    1198:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
    119f:	00

00000000000011a0 <overlaps2>:
    11a0:	f3 0f 1e fa          	endbr64
    11a4:	48 39 cf             	cmp    %rcx,%rdi
    11a7:	49 89 d0             	mov    %rdx,%r8
    11aa:	0f 96 c2             	setbe  %dl
    11ad:	31 c0                	xor    %eax,%eax
    11af:	4c 39 c6             	cmp    %r8,%rsi
    11b2:	0f 93 c0             	setae  %al
    11b5:	21 d0                	and    %edx,%eax
    11b7:	c3                   	ret
    11b8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
    11bf:	00

I also wrote a quick program to throw random numbers into both versions
and see if they differ. They never did, which they obviously can't if
they're the exact same instructions.
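
For anyone who wants to repeat the experiment, a test along these lines
should do. This is only a sketch, not the program mentioned above:
count_mismatches(), its seed parameter and the small value ranges are
made up for illustration. The two functions are the ones shown earlier.
The second clause of overlaps1() is just the first clause with each
comparison written the other way around (r2_start <= r1_end is
r1_end >= r2_start, and r2_end >= r1_start is r1_start <= r2_end), so
no mismatch is expected.

```c
#include <stdio.h>
#include <stdlib.h>

/* Original check with both clauses, copied from the message. */
int overlaps1(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start) ||
	       (r2_start <= r1_end && r2_end >= r1_start);
}

/* Simplified check with only the first clause. */
int overlaps2(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start);
}

/*
 * Hypothetical helper (not from the original message): feed the same
 * pseudo-random ranges to both versions and count how often the
 * results differ.  Values are kept small so that overlapping and
 * non-overlapping ranges both show up frequently.
 */
unsigned long count_mismatches(unsigned long iterations, unsigned int seed)
{
	unsigned long i, mismatches = 0;

	srand(seed);
	for (i = 0; i < iterations; i++) {
		unsigned long r1_start = rand() % 64;
		unsigned long r1_end   = r1_start + rand() % 16;
		unsigned long r2_start = rand() % 64;
		unsigned long r2_end   = r2_start + rand() % 16;

		if (overlaps1(r1_start, r1_end, r2_start, r2_end) !=
		    overlaps2(r1_start, r1_end, r2_start, r2_end)) {
			printf("mismatch: [%lu..%lu] vs [%lu..%lu]\n",
			       r1_start, r1_end, r2_start, r2_end);
			mismatches++;
		}
	}

	return mismatches;
}
```

Calling count_mismatches() from a small main() with any iteration count
and seed and checking that it returns 0 is enough to reproduce the
comparison.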