Poor performance on cp on kernel 2.6.27.7-9-xen (openSUSE 11.1)

From: Michael Monnerie
Date: Wed Feb 04 2009 - 20:34:40 EST


Dear list, (I'm not a subscriber, so please CC me, thanks!)

I've run into a performance problem that I can't explain. I have a
machine (hardware specs at the end) with two test disks, sdb and sdd.
Nobody else is logged in, and there is no network activity (apart from
my ssh session for a console) and no other disk activity. Both disks
hold empty XFS filesystems, and I create a big file with
dd if=/dev/zero of=/disk1/bigfile bs=1024k count=10000
which is quite fast:
10485760000 bytes (10 GB) copied, 15,2056 s, 690 MB/s
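
A quick arithmetic check of that figure, as a minimal Python sketch. It assumes dd reports decimal megabytes (10^6 bytes); the function name is made up for this example. Note that the 10 GB file is smaller than the 16 GB of RAM shown in the top output, and dd was run without any sync option, so this rate may partly reflect the page cache and the controller cache rather than the disks alone.

```python
# Recompute the dd throughput from the numbers printed above.
BYTES_WRITTEN = 10_485_760_000   # 10000 blocks of 1024 KiB each
ELAPSED_S = 15.2056              # printed as "15,2056 s" (decimal comma)


def throughput_mb_per_s(nbytes: int, seconds: float) -> float:
    """Return throughput in decimal megabytes (10**6 bytes) per second."""
    return nbytes / seconds / 1e6


if __name__ == "__main__":
    print(f"{throughput_mb_per_s(BYTES_WRITTEN, ELAPSED_S):.1f} MB/s")
```

The result is just under 690 MB/s, which agrees with the rounded value dd printed.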

Then I run "cp -a --sparse=never /disk1/bigfile /disk2/" and this is
what "iostat -kx 5 555" reports (just an excerpt, of course):
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0,02    0,00   10,31    5,36    0,00   84,57

Device: rrqm/s wrqm/s     r/s    w/s     rkB/s     wkB/s avgrq-sz avgqu-sz await svctm %util
sdb       0,00   0,00 1268,40   0,00 162355,20      0,00   256,00     1,12  0,87  0,52 65,68
sdd       0,00   0,80    0,00 913,80      0,00 162442,50   355,53     1,59  1,74  0,16 15,04
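
The iostat columns can be cross-checked against each other. The Python sketch below is illustrative only. It assumes 512-byte sectors for avgrq-sz and uses the approximate relations avgqu-sz ~ IOPS x await and %util ~ IOPS x svctm; these relations are assumptions made for this check, not taken from the iostat source, and all names are made up for the example.

```python
# Cross-check the iostat columns against each other (values from the sample).
SAMPLES = {
    # device: (r/s, w/s, rkB/s, wkB/s, await in ms, svctm in ms)
    "sdb": (1268.40, 0.00, 162355.20, 0.00, 0.87, 0.52),
    "sdd": (0.00, 913.80, 0.00, 162442.50, 1.74, 0.16),
}


def derived(r_s, w_s, rkb_s, wkb_s, await_ms, svctm_ms):
    """Derive request size, queue depth and utilization from the raw rates."""
    iops = r_s + w_s
    kb_per_request = (rkb_s + wkb_s) / iops
    return {
        # Assumption: avgrq-sz is counted in 512-byte sectors (2 per kB).
        "avgrq_sz_sectors": kb_per_request * 2,
        # Little's law: requests in flight = arrival rate * time in system.
        "avgqu_sz": iops * await_ms / 1000.0,
        # Fraction of each second spent servicing requests, as a percentage.
        "util_pct": iops * svctm_ms / 10.0,
    }


if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        d = derived(*sample)
        print(name, {key: round(value, 2) for key, value in d.items()})
```

For sdb this gives 256 sectors (128 kB) per read, a queue of about 1.1 requests and roughly 66% utilization, close to the reported 256,00 / 1,12 / 65,68. An average queue of about one request on the source disk would be consistent with reads being issued mostly one at a time, which is one possible reason %util stays below 100, though these numbers alone do not prove it.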

And "top" says:
top - 06:36:43 up 9:30, 6 users, load average: 1.65, 1.71, 1.84
Tasks: 209 total, 2 running, 207 sleeping, 0 stopped, 0 zombie
Cpu(s):  0.0%us,  9.3%sy,  0.0%ni, 84.7%id,  5.6%wa,  0.2%hi,  0.2%si,  0.0%st
Mem: 16504708k total, 16455832k used, 48876k free, 768k buffers
Swap: 0k total, 0k used, 0k free, 16133320k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 8534 root      20   0 14316 1284  744 R   48  0.0   3:09.19 cp
   59 root      15  -5     0    0    0 S   12  0.0  12:50.49 kswapd1
   58 root      15  -5     0    0    0 S   11  0.0  12:26.94 kswapd0
 8352 root      20   0     0    0    0 S    7  0.0   1:20.01 pdflush
 2360 root      15  -5     0    0    0 S    2  0.0   4:56.28 xfsdatad/7
   18 root      15  -5     0    0    0 S    1  0.0   0:03.68 ksoftirqd/7
 5709 root      20   0 53544 3184  556 S    1  0.0   0:37.80 archttp64
 8572 root      20   0 16940 1360  940 R    1  0.0   0:00.34 top
    1 root      20   0  1064  388  324 S    0  0.0   0:02.96 init

Now why doesn't iostat report 100 %util on at least one disk when I copy
from one disk to the other? It's one large file with no fragmentation,
so the copy should be fast as hell. Well, that depends of course, but
at least one of the disks should be at 100% utilization: if a disk is
only 80% utilized, there is 20% left to use. Can somebody explain to me
why Linux chooses to go on holiday instead of doing its work? Is it a
member of a labour union?
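
For reference, %util is the share of wall-clock time during which the device had at least one request in flight, so a value below 100 means the device spent part of the interval with nothing queued. The Python sketch below shows the calculation from two /proc/diskstats samples. It assumes the field layout described in the kernel's Documentation/iostats.txt (the 13th column being milliseconds spent doing I/O); the two sample lines are hypothetical, chosen only to reproduce the 65,68 seen above, and the function names are made up for this example.

```python
# Sketch: derive %util from two /proc/diskstats samples taken interval_s apart.
IO_TICKS_INDEX = 12  # 0-based column: milliseconds spent doing I/O


def io_ticks_ms(diskstats_line: str) -> int:
    """Return the 'time spent doing I/O' counter (ms) from one diskstats line."""
    return int(diskstats_line.split()[IO_TICKS_INDEX])


def util_percent(before: str, after: str, interval_s: float) -> float:
    """Percentage of the interval during which the device was busy."""
    busy_ms = io_ticks_ms(after) - io_ticks_ms(before)
    return 100.0 * busy_ms / (interval_s * 1000.0)


if __name__ == "__main__":
    # Hypothetical samples for sdb, 5 seconds apart (not real measurements).
    t0 = "8 16 sdb 1000 0 256000 900 0 0 0 0 0 500 900"
    t1 = "8 16 sdb 7342 0 1879552 6417 0 0 0 0 1 3784 6500"
    print(f"%util = {util_percent(t0, t1, 5.0):.2f}")
```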

I see a similar problem when copying between two Xen block devices:
xm block-attach 0 tap:aio:/disk1/test1.xen xvda w
xm block-attach 0 tap:aio:/disk1/test2.xen xvdb w
mount /dev/xvda1 /1
mount /dev/xvdb1 /2
cp -a --sparse=never /1/. /2/
(where /1 contains the root filesystem of an installed machine, about
1.5 GB of data, and /2 is freshly created; /1 is reiserfs and /2 is XFS)

And one last thing:
rsync -aPv /disk1/bigfile /disk2/xxx
That copies at 50 MB/s at most, because two rsync tasks are started,
each taking 50% of one CPU. Why doesn't the kernel move the second task
to another CPU? There are 8 CPUs in that system, 7 of them idle.
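
One way to check whether the two rsync processes really share a CPU is to read the "processor" field (field 39 in proc(5)) of /proc/<pid>/stat, which records the CPU a task last ran on. The Python sketch below is a minimal illustration, with hypothetical function names, and the PIDs have to be supplied by hand. Two tasks that each show 50% could also be running on different CPUs and taking turns waiting for each other, so this only narrows things down.

```python
import sys

PROCESSOR_FIELD = 39  # 1-based field number in /proc/<pid>/stat, per proc(5)


def last_cpu_from_stat(stat_text: str) -> int:
    """Return the CPU a task last ran on, given the text of its stat file."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split on the last ')' and count from field 3 onwards.
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[PROCESSOR_FIELD - 3])


def last_cpu(pid: int) -> int:
    """Read /proc/<pid>/stat and return the CPU the task last ran on."""
    with open(f"/proc/{pid}/stat") as f:
        return last_cpu_from_stat(f.read())


if __name__ == "__main__":
    # Usage: pass the PIDs of the rsync processes as arguments.
    for arg in sys.argv[1:]:
        print(arg, "last ran on CPU", last_cpu(int(arg)))
```

Sampling this repeatedly while the copy runs would show whether both PIDs stay on the same CPU number.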

Machine data:
2x quad core AMD Opteron 2350 (2GHz x 8)
sdb: 8 disks in RAID-50
sdd: 4 disks in RAID-5
Disks are not shared between sdb and sdd; there are 16 disks in this
system.
Areca RAID 1680 16-port SAS with 2 GB cache in writeback mode.

Best regards, zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
