Re: [patch] CFS scheduler, -v19

From: Ingo Molnar
Date: Thu Jul 19 2007 - 05:27:49 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > ah! It passes in a low-res time source into a high-res time
> > interface (pthread_cond_timedwait()). Could you change the
> > time(NULL) + 1 to time(NULL) + 2, or change it to:
> >
> > gettimeofday(&wait, NULL);
> > wait.tv_sec++;
>
> This is wrong. It's wrong for two reasons:
>
> - it really shouldn't be needed. I don't think "time()" has to be
> *exactly* in sync, but I don't think it can be off by a third of a
> second or whatever (as the "30% CPU load" would seem to imply)
>
> - gettimeofday works on a timeval, pthread_cond_timedwait() works on a
> timespec.

ah, i didnt notice that automount mixed up timespec with timeval! That
is nasty and the tv_nsec field (which really is ts_usec to
pthread_cond_timewait()) must stay cleared - or rather, to avoid bugs of
this type, a timespec variable should be used for all this.

> So if it actually makes a difference, it makes a difference for the
> *wrong* reason: the time is still totally nonsensical in the tv_nsec
> field (because it actually got filled in with msecs!), but now the
> tv_sec field is in sync, so it hides the bug.
>
> Anyway, hopefully the patch below might help. But we probably should make
> this whole thing a much more generic routine (ie we have our internal
> "getnstimeofday()" that still is missing the second-overflow logic, and
> that is quite possibly the one that triggers the "30% off" behaviour).

yeah, i'll generalize it, but our internal getnstimeofday() used on most
architectures is using __get_realtime_clock_ns(), and the patch you
attached already adds the second-overflow logic to it.

there are two versions of getnstimeofday(), a TIME_INTERPOLATION one and
a !TIME_INTERPOLATION one. TIME_INTERPOLATION is only used on ia64 at
the moment - and that one indeed does not have the second overflow
logic.

> Ingo, I'd suggest:
> - ger rid of "timespec_add_ns()", or at least make it return a return
> value for when it overflows.
> - make all the people who overflow into tv_sec call a "fix_up_seconds()"
> thing that does the xtime overflow handling.

ok, i'll do something clean.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/