2.2.13 scheduling anomoly

William Montgomery (william@opinicus.com)
Wed, 15 Dec 1999 14:04:54 -0500 (EST)


I am seeing a scheduling problem which occurs once every 1 to 2
hours and does not appear related to system load. After handling
an IRQ (arch/i386) the kernel is somehow missing the need_resched
flag which results from the IRQ (a signal to a task is generated).

I am using the RTC to signal a SCHED_FIFO task at 64Hz.
See annotated ktrace below:

c010a68d do_IRQ +<4d/54> ( 0.43 ) ( 244323.01 elapsed) pid(0)
c0113287 schedule +<13/284> ( 0.41 ) ( 244323.44 elapsed) pid(0)
c011d20d check_pgt_cache +<d/24> ( 0.13 ) ( 244323.85 elapsed) pid(0)
c011055f do_check_pgt_cache +<f/84> ( 1.10 ) ( 244323.98 elapsed) pid(0)
c010a64d do_IRQ +<d/54> ( 0.16 ) ( 244325.08 elapsed) pid(0)
c010a1cf do_8259A_IRQ +<13/b4> ( 2.61 ) ( 244325.24 elapsed) pid(0)
c010a4f4 handle_IRQ_event +<10/78> ( 0.19 ) ( 244327.85 elapsed) pid(0)
c01baf61 rtc_interrupt +<11/9c> ( 4.03 ) ( 244328.04 elapsed) pid(0)
c0113508 __wake_up +<10/58> ( 0.15 ) ( 244332.06 elapsed) pid(0)
c013204a kill_fasync +<e/54> ( 0.18 ) ( 244332.21 elapsed) pid(0)
c0131f5e send_sigio +<16/f4> ( 3.54 ) ( 244332.39 elapsed) pid(0)
^^^---sends sigio to SCHED_FIFO pid(156)

c0111d69 send_sig +<d/2c> ( 0.22 ) ( 244335.93 elapsed) pid(0)
c011180f send_sig_info +<13/28c> ( 0.23 ) ( 244336.15 elapsed) pid(0)
c0111796 ignored_signal +<e/74> ( 0.39 ) ( 244336.38 elapsed) pid(0)
c0112e09 wake_up_process +<d/54> ( 0.24 ) ( 244336.77 elapsed) pid(0)
c0112d82 reschedule_idle +<e/88> ( 6.07 ) ( 244337.02 elapsed) pid(0)
^^^--- pid(156) is put on run-queue and need_resched is set

c011300a mod_timer +<e/188> ( 0.35 ) ( 244343.09 elapsed) pid(0)
c010a109 enable_8259A_irq +<d/3c> ( 0.82 ) ( 244343.44 elapsed) pid(0)
c010a672 do_IRQ +<32/54> ( 0.11 ) ( 244344.26 elapsed) pid(0)
^^^--- look for any bottom halfs to run - none found

c010a68d do_IRQ +<4d/54> ( 7769.96 ) ( 244344.37 elapsed) pid(0)
^^^--- schedule should be run here but somehow
the idle task runs for 7.8msec instead? Did we loose
need_resched or was it not checked?

c010a64d do_IRQ +<d/54> ( 0.13 ) ( 252114.33 elapsed) pid(0)
c010a1cf do_8259A_IRQ +<13/b4> ( 2.59 ) ( 252114.46 elapsed) pid(0)
c010a4f4 handle_IRQ_event +<10/78> ( 0.15 ) ( 252117.06 elapsed) pid(0)
c01baf61 rtc_interrupt +<11/9c> ( 4.17 ) ( 252117.21 elapsed) pid(0)
c0113508 __wake_up +<10/58> ( 0.15 ) ( 252121.37 elapsed) pid(0)
c013204a kill_fasync +<e/54> ( 0.27 ) ( 252121.52 elapsed) pid(0)
c0131f5e send_sigio +<16/f4> ( 3.64 ) ( 252121.79 elapsed) pid(0)
^^^--- *another* sigio to pid(156)

c0111d69 send_sig +<d/2c> ( 0.21 ) ( 252125.43 elapsed) pid(0)
c011180f send_sig_info +<13/28c> ( 0.23 ) ( 252125.64 elapsed) pid(0)
c0111796 ignored_signal +<e/74> ( 5.43 ) ( 252125.86 elapsed) pid(0)
c011300a mod_timer +<e/188> ( 0.31 ) ( 252131.30 elapsed) pid(0)
c010a109 enable_8259A_irq +<d/3c> ( 0.81 ) ( 252131.61 elapsed) pid(0)
c010a672 do_IRQ +<32/54> ( 0.14 ) ( 252132.42 elapsed) pid(0)
c010a68d do_IRQ +<4d/54> ( 0.36 ) ( 252132.56 elapsed) pid(0)
c0113287 schedule +<13/284> ( 0.58 ) ( 252132.92 elapsed) pid(0->156)
^^^--- normal behaviour

On inspection of the IRQ code, it looks like when do_IRQ is done
a jump to ret_from_intr is executed which should then jump to
ret_with_resched (in my case) and see the need_resched flag and
schedule pid(156). Is this correct?

Any ideas on how the ktrace above occurs?

Wm

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/