timer_bh race fixed

William Montgomery (william@opinicus.com)
Wed, 15 Dec 1999 09:38:12 -0500 (EST)


Thanks for the help and patch suggestions. After much testing, the
timer_bh appears to always run after a timer_interrupt. I have
captured a new scheduling problem, the source of this problem has eluded
me for several days now. I removed the bottom half patches to make sure
they didnt induce this behavior, I then used the rtc dev as a timing
source to avoid the bh race. The problem I see occurs with _and_
without the bh patches. See ktrace below:

30389.58 0.54 do_IRQ pid(1936)
30390.12 3.28 do_8259A_IRQ pid(1936)
30393.41 0.46 handle_IRQ_event pid(1936)
30393.87 4.11 rtc_interrupt pid(1936)
30397.97 0.46 __wake_up pid(1936)
30398.43 0.82 kill_fasync pid(1936)
30399.25 6.70 send_sigio pid(1936)
^^^---a signal is sent to pid(154)
30405.95 0.61 send_sig pid(1936)
30406.56 1.14 send_sig_info pid(1936)
30407.71 1.15 ignored_signal pid(1936)
30408.86 0.40 wake_up_process pid(1936)
30409.25 9.85 reschedule_idle pid(1936)
^^^---pid(154) is put on run-queue and need_resched = 1
30419.11 0.87 mod_timer pid(1936)
30419.97 0.74 enable_8259A_irq pid(1936)
30420.71 0.19 do_IRQ pid(1936)
30420.90 0.36 do_IRQ pid(1936)
30421.26 1.48 schedule pid(1936->154)
^^^---results from ret_from_intr -> ret_with_resched
30422.73 1.93 __switch_to pid(154)

--snip-- [ktrace above is normal]

183326.36 0.37 do_IRQ pid(1936)
183326.73 3.40 do_8259A_IRQ pid(1936)
183330.13 0.34 handle_IRQ_event pid(1936)
183330.47 4.23 rtc_interrupt pid(1936)
183334.70 0.36 __wake_up pid(1936)
183335.06 0.35 kill_fasync pid(1936)
183335.40 5.89 send_sigio pid(1936)
^^^--- send sigio to pid(154)
183341.29 0.27 send_sig pid(1936)
183341.57 0.60 send_sig_info pid(1936)
183342.17 0.65 ignored_signal pid(1936)
183342.82 0.36 wake_up_process pid(1936)
183343.18 9.50 reschedule_idle pid(1936)
^^^--- put on run-queue, set need_resched
183352.68 0.80 mod_timer pid(1936)
183353.48 0.85 enable_8259A_irq pid(1936)
183354.33 0.14 do_IRQ pid(1936)
183354.46 4918.00 do_IRQ pid(1936)
^^^---pid(1936) is a SCHED_OTHER task, pid(154) should have
been scheduled
188272.46 0.45 system_call pid(1936)
188272.91 2.80 sys_write pid(1936)
188275.72 1.86 pipe_write pid(1936)
188277.58 1.58 schedule pid(1936->154)

What happened to ret_from_intr and ret_with_resched in the second
ktrace block? It appears that pid(1936) is merrily executing in user
space for 4.9msec. Any ideas?

Wm

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/