Block IO: more io-cpu-affinity results

From: Alan D. Brunelle
Date: Tue Apr 15 2008 - 08:47:47 EST


On a 4-way IA64 box we are seeing definite improvements in overall
system responsiveness with the patch series currently in Jens'
io-cpu-affinity branch of his block IO git repository. In this
microbenchmark, I peg 4 processes to 4 separate processors: 2 are doing
CPU-intensive work (sqrts) and 2 are doing IO-intensive work (4KB direct
reads from RAID array cache - thus limiting physical disk accesses).
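
For readers who want to try something similar, here is a minimal sketch of
the "peg a process to one CPU and count sqrts" half of such a test. It is
not the benchmark used for the numbers in this message: the names
pin_to_cpu and sqrt_worker are hypothetical, and it assumes Linux, since
os.sched_setaffinity is only available there.

```python
import math
import os
import time


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU (Linux-only call)."""
    os.sched_setaffinity(0, {cpu})


def sqrt_worker(seconds):
    """Hypothetical CPU-bound loop: count sqrt calls made in `seconds`."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        math.sqrt(count + 1.0)
        count += 1
    return count


if __name__ == "__main__":
    # Pick a CPU this process is already allowed to run on.
    cpu = min(os.sched_getaffinity(0))
    pin_to_cpu(cpu)
    n = sqrt_worker(0.1)
    print(f"cpu {cpu}: {n} sqrts in 0.1 s")
```

A real run would start one such process per CPU and pair the CPU-bound
workers with separate direct-IO readers, which this sketch leaves out.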

There are 2 variables: whether rq_affinity is on or off for the devices
under test for the IO-intensive procs (rq=1 or rq=0 below), and whether
each IO-intensive proc is pegged onto the same CPU that is handling IRQs
for its device (local=1) or onto a different CPU (local=0). The results
are averaged over 4-minute runs per permutation.

When each IO-intensive proc is pegged onto the CPU that is handling
IRQs for its device, we see no real difference between rq_affinity on
and off:

rq=0 local=1 66.616 (M sqrt/sec) 12.312 (K ios/sec)
rq=1 local=1 66.616 (M sqrt/sec) 12.314 (K ios/sec)

Both see 66.616 million sqrts per second and roughly 12,300 IOs per second.

However, when we move the 2 IO-intensive procs onto CPUs that are not
handling their devices' IRQs, we see a definite improvement with
rq_affinity on - both in the amount of CPU-intensive work we can do
(about 4%) and in the number of IOs per second achieved (about 1%):

rq=0 local=0 61.929 (M sqrt/sec) 11.911 (K ios/sec)
rq=1 local=0 64.386 (M sqrt/sec) 12.026 (K ios/sec)
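
The percentages quoted above can be rechecked from the four result lines.
The snippet below is only that arithmetic, with the figures copied from
this message; the dictionary layout and the gain_pct name are made up for
illustration.

```python
# Figures copied from the results in this message:
# (rq, local) -> (M sqrt/sec, K ios/sec)
results = {
    (0, 1): (66.616, 12.312),
    (1, 1): (66.616, 12.314),
    (0, 0): (61.929, 11.911),
    (1, 0): (64.386, 12.026),
}


def gain_pct(local):
    """Percent change going from rq=0 to rq=1 for a given local setting."""
    off_sqrt, off_ios = results[(0, local)]
    on_sqrt, on_ios = results[(1, local)]
    return (100.0 * (on_sqrt - off_sqrt) / off_sqrt,
            100.0 * (on_ios - off_ios) / off_ios)


for local in (1, 0):
    sqrt_gain, ios_gain = gain_pct(local)
    print(f"local={local}: sqrt {sqrt_gain:+.2f}%  ios {ios_gain:+.2f}%")
```

For local=0 this works out to just under 4% more sqrts and just under 1%
more IOs with rq_affinity on, matching the rounded figures in the text.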

Alan
