Re: CONFIG_PREEMPT and server workloads

From: Takashi Iwai
Date: Thu Mar 18 2004 - 14:17:48 EST


At Thu, 18 Mar 2004 11:01:59 -0800,
Andrew Morton wrote:
>
> Takashi Iwai <tiwai@xxxxxxx> wrote:
> >
> > for ext3, these two spots are relevant.
> >
> > --- linux-2.6.4-8/fs/jbd/commit.c-dist 2004-03-16 23:00:40.000000000 +0100
> > +++ linux-2.6.4-8/fs/jbd/commit.c 2004-03-18 02:42:41.043448624 +0100
> > @@ -290,6 +290,9 @@ write_out_data_locked:
> > commit_transaction->t_sync_datalist = jh;
> > break;
> > }
> > +
> > + if (need_resched())
> > + break;
> > } while (jh != last_jh);
> >
> > if (bufs || need_resched()) {
>
> This one I need to think about. Perhaps we can remove the yield point a
> few lines above.

yes, i'm afraid that it's also overkill to check this at every time.
perhaps we can optimize it a bit better. the fact that it imporives
the latency means that there are so many locked buffers or non-dirty
buffers in the list?

> One needs to be really careful with the lock-dropping trick - there are
> weird situations in which the kernel fails to make any forward progress.
> I've been meaning to do another round of latency tuneups for ages, so I'll
> check this one out, thanks.
>
> There's also the SMP problem: this CPU could be spinning on a lock with
> need_resched() true, but the other CPU is hanging on the lock for ages
> because its need_resched() is false.

yep, i see a similar problem also in reiserfs's do_journal_end().
it's in lock_kernel().

> In the 2.4 ll patch I solved that via
> the scary hack of broadcasting a reschedule instruction to all CPUs if an
> rt-prio task just became runnable. In 2.6-preempt we use
> preempt_spin_lock().
>
> But in 2.6 non-preempt we have no solution to this, so worst-case
> scheduling latencies on 2.6 SMP CONFIG_PREEMPT=n are high.

hmm, it's bad...

> Last time I looked the worst-case latency is in fact over in the ext3
> checkpoint code. It's under spinlock and tricky to fix.

BTW, i had the worst latency in sis900's timer handler.
it takes 3ms, and hard to fix, too :-<


Takashi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/