Re: [PATCH] oprofile per-cpu buffer overrun

From: Anton Blanchard
Date: Mon Jan 26 2004 - 01:00:27 EST



> It helps if the buffer size is a power of two, of course, but integer
> modulus is pretty quick.

If it's something that's hit really hard, the way the modulus is done can
make a difference. We've seen it in various places (e.g. disabling Tigon 1
support in acenic changes the driver from using a variable to a
compile-time constant, and you can actually see it in the profile):

quickest == power-of-2 compile-time constant (results in a mask)
quickish == other compile-time constant (results in the
            multiply-by-inverse trick)
slow     == run-time value (results in a divide)

Of course you can do what the printk code does: use a variable, but
enforce a power-of-2 value and & it with (size - 1).
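
A minimal sketch of that last option, for illustration only. The struct
and function names here (ring, ring_init, ring_next) are hypothetical and
are not taken from printk or oprofile; the only point is that a run-time
size can still wrap with a mask once it is known to be a power of 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring buffer; names are illustrative only. */
struct ring {
	size_t size;	/* number of slots, enforced to be a power of 2 */
	size_t mask;	/* size - 1, cached so wrapping is a single AND */
	size_t head;	/* running count of slots handed out */
};

/* Nonzero if n has exactly one bit set. */
static int is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Returns 0 on success, -1 if size is not a power of 2. */
static int ring_init(struct ring *r, size_t size)
{
	if (!is_pow2(size))
		return -1;
	r->size = size;
	r->mask = size - 1;
	r->head = 0;
	return 0;
}

/* Return the next slot index; same value as head % size, but no divide. */
static size_t ring_next(struct ring *r)
{
	assert(is_pow2(r->size));	/* debug check on the invariant */
	return r->head++ & r->mask;
}
```

The size check happens once at init time, so the hot path pays only for
the AND.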

Anton

--

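unsigned long i, j, k;

/* Power-of-2 constant: the compiler emits a mask. */
void quickest(void)
{
	j = i % 2;
}

/* Other constant: the compiler multiplies by an inverse. */
void quickish(void)
{
	j = i % 3;
}

/* Run-time divisor: the compiler has to emit a divide. */
void slow(void)
{
	j = i % k;
}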

--

quickest:
	lis 9,i@ha
	lwz 0,i@l(9)
	lis 9,j@ha
	rlwinm 0,0,0,31,31
	stw 0,j@l(9)
	blr
quickish:
	lis 9,i@ha
	lis 0,0xaaaa
	lwz 11,i@l(9)
	ori 0,0,43691
	lis 9,j@ha
	mulhwu 0,11,0
	srwi 0,0,1
	mulli 0,0,3
	subf 11,0,11
	stw 11,j@l(9)
	blr
slow:
	lis 9,i@ha
	lis 11,k@ha
	lwz 10,i@l(9)
	lwz 9,k@l(11)
	divwu 0,10,9
	mullw 0,0,9
	lis 9,j@ha
	subf 10,0,10
	stw 10,j@l(9)
	blr
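
--

For reference, the constant built in quickish (lis 0,0xaaaa followed by
ori 0,0,43691) is 0xAAAAAAAB, which is ceil(2^33 / 3). The sequence can
be written back out in C roughly as below. This is a sketch that assumes
a 32-bit unsigned operand, as in the 32-bit PowerPC output above; the
function name mod3_by_inverse is made up for illustration.

```c
#include <stdint.h>

/*
 * i % 3 without a divide instruction, for 32-bit unsigned i.
 * 0xAAAAAAAB == ceil(2^33 / 3), so (i * 0xAAAAAAAB) >> 33 equals i / 3
 * for every 32-bit i.
 */
static uint32_t mod3_by_inverse(uint32_t i)
{
	/* mulhwu: high 32 bits of the 64-bit product */
	uint32_t hi = (uint32_t)(((uint64_t)i * 0xAAAAAAABu) >> 32);
	/* srwi 0,0,1: one more shift gives the quotient i / 3 */
	uint32_t q = hi >> 1;
	/* mulli + subf: remainder = i - q * 3 */
	return i - q * 3;
}
```

The multiply and shift replace the divwu seen in slow; the remainder is
then recovered with one more multiply and a subtract, the same way the
divide version does it.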