Re: Strange IDE performance change in 2.6.1-rc1 (again)

From: Ed Sweetman
Date: Sat Jan 10 2004 - 11:03:56 EST


Paolo Ornati wrote:
On Friday 09 January 2004 20:15, Ram Pai wrote:

On Thu, 2004-01-08 at 17:17, Andrew Morton wrote:

Ram Pai <linuxram@xxxxxxxxxx> wrote:

Well this is my theory, somebody should validate it,

One megabyte seems like far too little memory to be triggering the
effect which you describe. But yes, the risk is certainly there.

You could verify this with:

I cannot exactly reproduce what Paolo Ornati is seeing.

Paolo: please validate the following:

1) Check whether you also see a regression with files, by replacing the
cat command in your script with
dd if=big_file of=/dev/null bs=1M count=256

2) and if you do, check if you see a bunch of 'eek' with Andrew's
following patch. (NOTE: without reverting the changes
in filemap.c)

-------------------------------------------------------------------------

--- 25/mm/filemap.c~a Thu Jan 8 17:15:57 2004
+++ 25-akpm/mm/filemap.c Thu Jan 8 17:16:06 2004
@@ -629,8 +629,10 @@ find_page:
 			handle_ra_miss(mapping, ra, index);
 			goto no_cached_page;
 		}
-		if (!PageUptodate(page))
+		if (!PageUptodate(page)) {
+			printk("eek!\n");
 			goto page_not_up_to_date;
+		}
 page_ok:
 		/* If users can be writing to this page using arbitrary
 		 * virtual addresses, take care about potential aliasing

-------------------------------------------------------------------------


OK, this patch seems to be against the -mm tree, so I applied it by hand (on a vanilla 2.6.1-rc1).

For my tests I've used this script:

#!/bin/sh

RA_VALS="256 128 64"
FILE="/big_file"
SIZE=`stat -c '%s' $FILE`
NR_TESTS="3"
LINUX=`uname -r`

echo "HD test for Penguin $LINUX"

killall5
sync
sleep 3

for ra in $RA_VALS; do
    hdparm -a $ra /dev/hda
    for i in `seq $NR_TESTS`; do
        echo "_ _ _ _ _ _ _ _ _"
        ./fadvise $FILE 0 $SIZE dontneed
        time dd if=$FILE of=/dev/null bs=1M count=256
    done
    echo "________________________________"
done
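
The script depends on a ./fadvise helper that is not shown here. Judging by its arguments (file, offset, length, "dontneed"), it is presumably a small wrapper around posix_fadvise(2) with POSIX_FADV_DONTNEED, which asks the kernel to drop the file's cached pages before each timed read. If the helper is missing, the shell only prints an error and dd still runs, possibly against a warm page cache. A minimal guard could look like the sketch below; the function name and the HELPER variable are made up for illustration and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical guard: refuse to benchmark when the cache-dropping helper is missing.
# HELPER defaults to the ./fadvise path used by the script; both names are illustrative.
HELPER="${HELPER:-./fadvise}"

require_helper() {
    # Succeeds only if the given path exists and is executable.
    if [ ! -x "$1" ]; then
        echo "missing helper: $1 (page cache would not be dropped)" >&2
        return 1
    fi
    return 0
}

# Usage inside the test script, before the loops:
#   require_helper "$HELPER" || exit 1
```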


RESULTS (2.6.0 / 2.6.1-rc1)

HD test for Penguin 2.6.0

/dev/hda:
setting fs readahead to 256
readahead = 256 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.427s
user 0m0.002s
sys 0m1.722s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.963s
user 0m0.000s
sys 0m1.760s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.291s
user 0m0.001s
sys 0m1.713s
________________________________

/dev/hda:
setting fs readahead to 128
readahead = 128 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.910s
user 0m0.003s
sys 0m1.882s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.693s
user 0m0.003s
sys 0m1.860s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.733s
user 0m0.004s
sys 0m1.922s
________________________________

/dev/hda:
setting fs readahead to 64
readahead = 64 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.107s
user 0m0.000s
sys 0m2.026s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.227s
user 0m0.004s
sys 0m1.984s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m9.152s
user 0m0.002s
sys 0m2.013s
________________________________


HD test for Penguin 2.6.1-rc1

/dev/hda:
setting fs readahead to 256
readahead = 256 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.984s
user 0m0.002s
sys 0m1.751s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.704s
user 0m0.002s
sys 0m1.766s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.886s
user 0m0.002s
sys 0m1.731s
________________________________

/dev/hda:
setting fs readahead to 128
readahead = 128 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.120s
user 0m0.001s
sys 0m1.830s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.596s
user 0m0.005s
sys 0m1.764s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.481s
user 0m0.002s
sys 0m1.727s
________________________________

/dev/hda:
setting fs readahead to 64
readahead = 64 (on)
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.361s
user 0m0.006s
sys 0m1.782s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.655s
user 0m0.002s
sys 0m1.778s
_ _ _ _ _ _ _ _ _
256+0 records in
256+0 records out

real 0m11.369s
user 0m0.004s
sys 0m1.798s
________________________________


As you can see, on 2.6.0 performance improves as readahead is lowered from 256 to 64 (64 seems to be the best value), while on 2.6.1-rc1 performance barely changes.
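
To make the comparison easier to read, the three "real" times per readahead setting can be averaged. The sketch below assumes the `time` output has been saved or pasted as text; the function name mean_real is made up for illustration.

```shell
#!/bin/sh
# mean_real: read `time` output on stdin and print the mean of the "real" lines in seconds.
mean_real() {
    awk '
        $1 == "real" {
            split($2, t, "m")        # "0m11.427s" -> t[1]="0", t[2]="11.427s"
            sub(/s$/, "", t[2])      # drop the trailing "s"
            total += t[1] * 60 + t[2]
            n++
        }
        END { if (n > 0) printf "%.3f\n", total / n }
    '
}

# The three 2.6.0 runs at readahead 256 from the results above:
mean_real <<'EOF'
real 0m11.427s
real 0m11.963s
real 0m11.291s
EOF
```

Applied to the numbers above, the means for readahead 256 / 128 / 64 are 11.560 s / 9.779 s / 9.162 s on 2.6.0 and 11.858 s / 11.399 s / 11.462 s on 2.6.1-rc1, so at readahead 64 the 2.6.1-rc1 runs take about 2.3 s longer.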

I noticed that on 2.6.0, with readahead set to 256, the HD LED blinks during the data transfer, while with lower values (128 / 64) it stays on.
On 2.6.1-rc1, by contrast, the HD LED blinks with almost any value (I must set it to 8 to see it stay on steadily).

ANSWERS:

1) YES... I see a regression with files ;-(

2) YES, I also see a bunch of "eek!" (a mountain of "eek!")

Bye



I'm using 2.6.0-mm1 and I see no difference from setting readahead to anything on my extent-enabled partitions. So it appears that the filesystem plays a big part in your numbers here, not just HDD attributes or settings.

The partition FILE is on is an ext3 partition with extents enabled. Despite not having fadvise (what is this anyway?), the numbers are all real and no error occurred. Extents totally rock for this type of data access, as you can see below.

Stick to non-fs tests if you want to benchmark fs-independent code. Not everyone is going to be able to come up with the same results as you, and as such a possible fix could actually be detrimental, and we'd be stuck in a loop of "ide regression" mails.
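
One way to run such a non-fs test, as a rough sketch: read sequentially from the block device itself instead of from a file, so that no filesystem layout is involved. The function name raw_read is made up for illustration, and /dev/hda is simply the device named in the script above.

```shell
#!/bin/sh
# raw_read: sequential read from a path; with a block device as the source,
# no filesystem is involved. Function name and default count are illustrative.
raw_read() {
    src=$1
    count=${2:-256}              # number of 1 MiB blocks, matching the script above
    dd if="$src" of=/dev/null bs=1M count="$count"
}

# Usage (needs read access to the device):
#   time raw_read /dev/hda 256
```

Reads from a block device still go through the page cache, so a repeated run may be served from memory. `hdparm -t /dev/hda` is another device-level timing that avoids the filesystem.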







HD test for Penguin 2.6.0-mm1-extents

/dev/hda:
setting fs readahead to 8192
readahead = 8192 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.098793 seconds (244300323 bytes/sec)

real 0m1.100s
user 0m0.005s
sys 0m1.096s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102250 seconds (243534086 bytes/sec)

real 0m1.104s
user 0m0.000s
sys 0m1.104s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.096914 seconds (244718759 bytes/sec)

real 0m1.098s
user 0m0.001s
sys 0m1.097s
________________________________

/dev/hda:
setting fs readahead to 256
readahead = 256 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104646 seconds (243005877 bytes/sec)

real 0m1.106s
user 0m0.001s
sys 0m1.105s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100904 seconds (243831834 bytes/sec)

real 0m1.102s
user 0m0.000s
sys 0m1.103s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102060 seconds (243576076 bytes/sec)

real 0m1.104s
user 0m0.002s
sys 0m1.101s
________________________________

/dev/hda:
setting fs readahead to 128
readahead = 128 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100799 seconds (243855121 bytes/sec)

real 0m1.102s
user 0m0.000s
sys 0m1.102s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.101516 seconds (243696385 bytes/sec)

real 0m1.103s
user 0m0.002s
sys 0m1.101s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.100963 seconds (243818758 bytes/sec)

real 0m1.102s
user 0m0.000s
sys 0m1.103s
________________________________

/dev/hda:
setting fs readahead to 64
readahead = 64 (on)
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104634 seconds (243008498 bytes/sec)

real 0m1.106s
user 0m0.002s
sys 0m1.105s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.102107 seconds (243565703 bytes/sec)

real 0m1.104s
user 0m0.003s
sys 0m1.100s
_ _ _ _ _ _ _ _ _
/tester: line 18: ./fadvise: No such file or directory
256+0 records in
256+0 records out
268435456 bytes transferred in 1.104429 seconds (243053595 bytes/sec)

real 0m1.106s
user 0m0.000s
sys 0m1.106s
________________________________