BUT... here is what I need to understand, the filesize has a drastic
effect on performance. If I am doing random reads from a 20GB file
(system only has 2GB ram, so caching is not a factor), I get
performance about where I want it to be: about 5.7 - 6ms per block. But
if that file is 2TB then the time almost doubles, to 11ms. Why is this?
No other factors changed, only the filesize.
I am assuming that somewhere along the way, the kernel now has to do an
additional read from the disk for some metadata for xfs... perhaps the
btree for the file doesn't fit in the kernel's memory? so it actually
needs to do 2 seeks, one to find out where to go on disk then one to
get the data. Is that the case? If so, how can I remedy this? How can I
tell the kernel to keep all of the files xfs data in memory?