Zach Brown wrote on Thursday, November 30, 2006 1:45 PM:
At that time, a patch was written for the raw device to demonstrate that
large performance headroom is achievable (about a 20% speedup on a
micro-benchmark and about 2% on a db transaction processing benchmark) with a
tight I/O submission processing loop.

Where exactly does the benefit come from? icache misses? "atomic"
ops leading to pipeline flushes?

It benefits from a shorter path length. It takes much less time to process
one I/O request, in both the submit and completion paths. I always think in
terms of how many instructions, or clock ticks, it takes to convert a user
request into a bio and submit it and, in the return path, to process the bio
callback function and do the appropriate I/O completion (sync or async).
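
For anyone who wants a rough feel for per-request cost from user space, here
is a minimal sketch. It is not the raw device patch and it does not isolate
the bio submit or completion paths; it only times the whole read call, round
trip, against a block that should already be in the page cache, so device
latency stays out of the number. The helper name avg_ns_per_read, the 4096
byte block size and the iteration count are all assumptions made for
illustration, and interpreter overhead is included in the result, so read it
as an upper bound rather than a path length measurement.

```python
import os
import tempfile
import time


def avg_ns_per_read(fd, block_size, iterations):
    """Return the average wall-clock nanoseconds per os.pread() call on fd."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        # Always read the same block at offset 0 so the data stays cached
        # and the loop measures software overhead, not the device.
        os.pread(fd, block_size, 0)
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


def main():
    block_size = 4096  # assumed block size, chosen for illustration
    with tempfile.TemporaryFile() as f:
        f.write(b"\0" * block_size)
        f.flush()
        ns = avg_ns_per_read(f.fileno(), block_size, 100_000)
    print(f"average cost per read: {ns:.0f} ns")


if __name__ == "__main__":
    main()
```

Running the same loop before and after a kernel change gives a crude
per-request delta. Counting instructions or cycles along the submit and
completion paths themselves needs a profiler or hardware counters, which this
sketch does not attempt.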