Re: [ofa-general] [PATCH 2.6.30] RDMA/cxgb3: Remove modulo math.

From: David Miller
Date: Wed Feb 11 2009 - 03:01:11 EST


From: Roland Dreier <rdreier@xxxxxxxxx>
Date: Tue, 10 Feb 2009 23:20:39 -0800

> > unsigned long page_size[4];
> >
> > int main(int argc)
> > {
> >         unsigned long long x = argc;
> >
> >         return x % (1UL << (12 + page_size[argc]));
> > }
> >
> > I get a call to __umoddi3:
>
> You're not testing the same thing. The original code was:
>
> wqe->recv.sgl[i].to = cpu_to_be64(((u32) wr->sg_list[i].addr) %
> (1UL << (12 + page_size[i])));
>
> and it's not that easy to see with all the parentheses, but the
> expression being done is (u32) % (unsigned long). So rather than
> unsigned long long in your program, you should have just done unsigned
> (u32 is unsigned int on all Linux architectures). In that case gcc does
> not generate a call to any library function in any of the versions I
> have handy, although gcc 4.1 does do a div instead of an and. (And I
> don't think any 32-bit architecture requires a library function for
> (unsigned) % (unsigned), so the code should be OK.)
>
> Your example shows that gcc is missing a strength reduction opportunity
> in not handling (u64) % (unsigned long) on 32 bit architectures, but I
> guess it is a more difficult optimization to do, since gcc has to know
> that it can simply zero the top 32 bits.

Indeed, I get the divide if I use "unsigned int" for "x".

I still think you should make this change, since many systems out
there are still paying for the expensive divide.

main:
sethi %hi(page_size), %g1
or %g1, %lo(page_size), %g1
mov %o0, %g3
sll %o0, 2, %g4
ld [%g1+%g4], %g2
mov 1, %g1
add %g2, 12, %g2
sll %g1, %g2, %g1
wr %g0, %g0, %y
nop
nop
nop
udiv %o0, %g1, %o0
smul %o0, %g1, %o0
jmp %o7+8
sub %g3, %o0, %o0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/