Re: [SCHED] Totally WRONG prority calculation with specific test-case (since 2.6.10-bk12)

From: Con Kolivas
Date: Tue Dec 27 2005 - 18:26:10 EST


On Wednesday 28 December 2005 08:48, Paolo Ornati wrote:
> On Tue, 27 Dec 2005 19:09:18 +0100
>
> Paolo Ornati <ornati@xxxxxxxxxxxxx> wrote:
> > Hello,
> >
> > I've found an easy-to-reproduce-for-me test case that shows a totally
> > wrong priority calculation: basically a CPU-intensitive process gets
> > better priority than a disk-intensitive one (dd if=bigfile
> > of=/dev/null ...).
> >
> > Seems impossible, isn't it?
> >
> > ---- THE NUMBERS with 2.6.15-rc7 -----
> >
> > The test-case is the Xvid encoding of dvd-ripped track with transcode
> > (using "dvd::rip" interface). The copied-and-pasted command line is
> > this:
> >
> > mkdir -m 0775 -p '/home/paolo/tmp/test/tmp' &&
> > cd /home/paolo/tmp/test/tmp && dr_exec transcode -H 10 -a 2 -x vob,null
> > -i /home/paolo/tmp/test/vob/003 -w 1198,50 -b 128,0,0 -s 1.972
> > --a52_drc_off -f 25 -Y 52,8,52,8 -B 27,10,8 -R 1 -y xvid4,null
> > -o /dev/null --print_status 20 && echo DVDRIP_SUCCESS
> >
> > mkdir -m 0775 -p '/home/paolo/tmp/test/tmp' && cd /home/paolo/tmp/test/tmp && dr_exec
> > transcode -H 10 -a 2 -x vob -i /home/paolo/tmp/test/vob/003 -w 1198,50
> > -b 128,0,0 -s 1.972 --a52_drc_off -f 25 -Y 52,8,52,8 -B 27,10,8 -R 2 -y
> > xvid4 -o /home/paolo/tmp/test/avi/003/test-003.avi --print_status 20 &&
> > echo DVDRIP_SUCCESS
> >
> >
> > Here there is a TOP snapshot while running it:
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >  5721 paolo     16   0  115m  18m 2428 R 84.4  3.7   0:15.11 transcode
> >  5736 paolo     25   0 50352 4516 1912 R  8.4  0.9   0:01.53 tcdecode
> >  5725 paolo     15   0  115m  18m 2428 S  4.6  3.7   0:00.84 transcode
> >  5738 paolo     18   0  115m  18m 2428 S  0.8  3.7   0:00.15 transcode
> >  5734 paolo     25   0 20356 1140  920 S  0.6  0.2   0:00.12 tcdemux
> >  5731 paolo     25   0 47312 2540 1996 R  0.4  0.5   0:00.08 tcdecode
> >  5319 root      15   0  166m  16m 2584 S  0.2  3.2   0:25.06 X
> >  5444 paolo     16   0 87116  22m  15m R  0.2  4.6   0:04.05 konsole
> >  5716 paolo     16   0 10424 1160  876 R  0.2  0.2   0:00.06 top
> >  5735 paolo     25   0 22364 1436  932 S  0.2  0.3   0:00.01 tcextract
> >
> >
> > DD running alone:
> >
> > paolo@tux /mnt $ mount space/; time dd if=space/bigfile of=/dev/null bs=1M count=128; umount space/
> > 128+0 records in
> > 128+0 records out
> >
> > real 0m4.052s
> > user 0m0.000s
> > sys 0m0.209s
> >
> > DD while transcoding:
> >
> > paolo@tux /mnt $ mount space/; time dd if=space/bigfile of=/dev/null bs=1M count=128; umount space/
> > 128+0 records in
> > 128+0 records out
> >
> > real 0m26.121s
> > user 0m0.001s
> > sys 0m0.255s

Looking at your top output, I see that the transcode command spawns 7
processes, all likely to be using CPU, and your dd slowdown is almost 7
times... I assume it all goes away if you nice the transcode up by 3 or more.
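
As a rough check of that estimate, the figures from the quoted report can be
compared directly. This is only a sketch: the timings and process names are
copied from the quoted "time dd" output and top snapshot, and the variable
names are made up for illustration.

```python
# Wall-clock times for dd, copied from the quoted report (seconds).
dd_alone = 4.052
dd_while_transcoding = 26.121

# Processes in the quoted top snapshot that belong to the transcode job.
transcode_job = [
    "transcode", "tcdecode", "transcode", "transcode",
    "tcdemux", "tcdecode", "tcextract",
]

# How many times slower dd became while the job was running.
slowdown = dd_while_transcoding / dd_alone

print(f"competing processes: {len(transcode_job)}")
print(f"dd slowdown factor:  {slowdown:.2f}")
```

The slowdown factor comes out between 6 and 7, which is in line with dd
getting roughly an equal share of CPU alongside 7 other runnable processes.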

> Hello Con and Ingo... I've found that the above problem goes away
> by reverting this:
>
> http://linux.bkbits.net:8080/linux-2.6/cset@41e054c6pwNQXzErMxvfh4IpLPXA5A?nav=index.html|src/|src/include|src/include/linux|related/include/linux/sched.h
>
> --------------------------------------------------
>
> [PATCH] sched: remove_interactive_credit

The issue is that the scheduler's interactivity estimator is a state machine
and can be fooled to some degree: a CPU-intensive task that just happens to
sleep a little bit gets significantly better priority than one that is fully
CPU bound all the time. Reverting that change is not a solution, because the
estimator can still be fooled by the same process sleeping a lot for a few
seconds or so at startup and then switching to the mostly-CPU,
slightly-sleeping behaviour. That "fluctuating" behaviour is in my opinion
worse, which is why I removed interactive_credit.
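
To make the "can be fooled" point concrete, here is a toy model of a
sleep-credit estimator. It is not the kernel's code: the constants, the
weighting of sleep and run time, and all function names are assumptions chosen
for illustration. It only shows how a task that is CPU bound 90% of the time
can end up with a better (numerically lower) priority than a fully CPU-bound
one, and how its priority depends on what it did at startup.

```python
MAX_BONUS = 10         # assumed width of the dynamic priority range
MAX_SLEEP_AVG = 1000   # assumed cap on accumulated sleep credit (ms)


def bonus(sleep_avg):
    """Map sleep credit onto a bonus in the range 0..MAX_BONUS."""
    return sleep_avg * MAX_BONUS // MAX_SLEEP_AVG


def run(sleep_avg, ms):
    """Running drains credit, more slowly for tasks that look interactive."""
    return max(0, sleep_avg - ms // max(bonus(sleep_avg), 1))


def sleep(sleep_avg, ms):
    """Sleeping earns credit, faster for tasks that have little of it."""
    weight = (MAX_BONUS - bonus(sleep_avg)) or 1
    return min(MAX_SLEEP_AVG, sleep_avg + ms * weight)


def dynamic_prio(static_prio, sleep_avg):
    """Lower value means better priority."""
    return static_prio - bonus(sleep_avg) + MAX_BONUS // 2


def simulate(cycles, run_ms, sleep_ms, start_credit=0, static_prio=120):
    """Repeat a run/sleep pattern and return the resulting priority."""
    sleep_avg = start_credit
    for _ in range(cycles):
        sleep_avg = run(sleep_avg, run_ms)
        sleep_avg = sleep(sleep_avg, sleep_ms)
    return dynamic_prio(static_prio, sleep_avg)


# Fully CPU bound: never sleeps, never earns credit.
cpu_bound = simulate(1000, run_ms=100, sleep_ms=0)

# 90% CPU, 10% sleep, starting with no credit.
mostly_cpu = simulate(1000, run_ms=90, sleep_ms=10)

# Same 90/10 pattern, but after a long sleep at startup (full credit).
mostly_cpu_after_nap = simulate(1000, run_ms=90, sleep_ms=10,
                                start_credit=MAX_SLEEP_AVG)

print(cpu_bound, mostly_cpu, mostly_cpu_after_nap)
```

In this toy model the fully CPU-bound task ends at 125, the 90/10 task at 124,
and the same 90/10 task that slept a lot at startup stays at 115 indefinitely,
so identical steady-state behaviour yields different priorities depending on
history.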

Cheers,
Con
