Re: [PATCH] Revert "timers: Don't wake ktimersoftd on every tick"

From: Haris Okanovic
Date: Fri Jun 02 2017 - 10:37:21 EST


On 05/26/2017 03:50 PM, Thomas Gleixner wrote:
On Fri, 26 May 2017, Haris Okanovic wrote:

Oh crap. I think I see the problem. I decrement expired_count before
processing the list. Dropping the lock permits another run of
tick_find_expired()->find_expired_timers() in the middle of __expire_timers()
since it uses expired_count==0 as a condition.

This should fix it, but I'll wait for Anna-Maria's test next week before
submitting a patch.

static void expire_timers(struct timer_base *base)
{
struct hlist_head *head;
+ int expCount = base->expired_count;

No camel case for heavens sake!

And this requires:

cnt = READ_ONCE(base->expired_count);

- while (base->expired_count--) {
- head = base->expired_lists + base->expired_count;
+ while (expCount--) {
+ head = base->expired_lists + expCount;
__expire_timers(base, head);
}

Plus a comment.

Fixed, thanks.

Are your recommending READ_ONCE() purely for documentation purposes?
All reads and writes to base->expired_count happen while base->lock is held. It just can't reach zero until expired_lists is ready to be rewritten.


base->expired_count = 0;

Anna-Maria spotted the same issue, but I voted for the revert right now
because I was worried about the consistency of base->clk under all
circumstances.

The other thing I noticed was this weird condition which does not do the
look ahead when base->clk is back for some time.

The soft interrupt fires unconditionally if base->clk hasn't advanced in some time to limit how long cpu spends in hard interrupt context.

Why don't you use the
existing optimization which uses the bitmap for fast forward?


Are you referring to forward_timer_base()/base->next_expiry? I think it's only updated in the nohz case. Can you share function name/line number(s) if you're thinking of something else.

The other issue I have is that this can race at all. If you raised the
softirq in the look ahead then you should not go into that function until
the softirq has actually completed. There is no point in wasting time in
the hrtimer interrupt if the softirq is running anyway.


Makes sense. Skipping the large `if` block in run_local_timers() when `local_softirq_pending() & TIMER_SOFTIRQ`.

Thanks,

tglx





I also ran Anna-Maria's test for 12h without failure; I.e. no "Stalled" messages. It fails withing 10-15m on my qemu VM without the fix (4-core Nehalem, 1GB RAM).

You can view a diff at
https://github.com/harisokanovic/linux/tree/dev/hokanovi/timers-race

-- Haris