Re: time / gtod seconds value out of sync?

From: Jan Stancek
Date: Fri Feb 20 2015 - 08:32:06 EST






----- Original Message -----
> From: "Nishanth Aravamudan" <nacc@xxxxxxxxxxxxxxxxxx>
> To: "John Stultz" <john.stultz@xxxxxxxxxx>
> Cc: "Thomas Gleixner" <tglx@xxxxxxxxxxxxx>, "lkml" <linux-kernel@xxxxxxxxxxxxxxx>, jstancek@xxxxxxxxxx, "Ingo Molnar"
> <mingo@xxxxxxxxxx>
> Sent: Thursday, 19 February, 2015 8:28:40 PM
> Subject: Re: time / gtod seconds value out of sync?
>
> Hi John!
>
> On 19.02.2015 [11:03:26 -0800], John Stultz wrote:
> > Hey Nish! Long time!
>
> yep :)
>
> > On Thu, Feb 19, 2015 at 10:35 AM, Nishanth Aravamudan
> > <nacc@xxxxxxxxxxxxxxxxxx> wrote:
> > > Hi John,
> > >
> > > We're seeing an interesting issue with the openposix testcase
> > > difftime/1-1, which basically calls gtod/time, sleeps, calls time/gtod,
> >
> > I'm not familiar with the test... Do you have a link?
>
> Sorry about that:
>
> http://sourceforge.net/projects/posixtest/files/posixtest/posixtestsuite-1.5.2/posixtestsuite-1.5.2.tar.gz/download
>
> conformance/interfaces/difftime/1-1.c
>
> It takes quite a few iterations in a loop, but it eventually fails with
> mainline on both x86-64 and ppc64le.
>
> > > then difftime and sees if they disagree. The issue occurs with either
> > > vDSO implementations or direct syscalls.
> > >
> > > We are seeing failures on ppc64le and x86_64 (probably other places too,
> > > just not tested yet), because (I'm pretty sure), the time() syscalls
> > > granularity is not accounting for the nsecs value at all. Instead, it
> > > just returns get_seconds().
> >
> >
> > Right, so there is always a problem mixing calls of different
> > granularity (similar issues crop up with gettimeofday() and filesystem
> > timestamps), so the basis of the test worries me a little bit from the
> > description, but I'd have to look at it to really get a sense.
>
> Yep, that makes sense to me too -- the only concern I had was that the
> point where time() is returning X seconds, a simultaneous (theoretical)
> call to gtod() would return the correct X+1 seconds (presuming nsecs had
> exceeded 1000000000).
>
> > > In one case, I see, in sys_time():
> > >
> > > [ 313.001823] NACC: timekeeping_get_ns = 1000121642
> > > [ 314.001889] NACC: timekeeping_get_ns = 188401
> > >
> > > gtod correctly accumulates those nsecs into the secs value:
> > >
> > > ts->tv_sec = tk->xtime_sec;
> > > nsecs = timekeeping_get_ns(&tk->tkr);
> > > ts->tv_nsec = 0;
> > > timespec64_add_ns(ts, nsecs);
> > >
> > > but time() does:
> > >
> > > return tk->xtime_sec;
> > >
> > > It seems like overkill to do the full timekeeping_get_ns() in time(),
> >
> > Right, so looking into the git history,
> > f20bf6125605acbbc7eb8c9420d7221c91aa83eb (time: introduce
> > xtime_seconds) changed this specifically for performance reasons
> > (cc'ed Ingo here, in case he remembers more context).
>
> Ah I see. I can see the performance impact of calling into gtod being
> high.
>
> > The idea that time() would be ok as being HZ granular, and its been
> > this way since 2.6.23. Thus you have a < HZ sized window where
> > gettimeofday() will return the next second before time() gets updated
> > by the tick.

Thanks John for the explanation, this made it clear for me.
I wonder if this wouldn't be helpful NOTE in time(2) man page.

>
> Right.
>
> > > but maybe it's also necessary to account for leap seconds? That is, we
> > > need to ensure that accumulate_nsecs_to_secs() has been called before
> > > return tk->xtime_sec?
> >
> > So leapseconds are also applied at tick time, so I don't think you'd
> > see any different behavior with them.
>
> Yep, ok.
>
> > There was a thread on this quite awhile back and I if I recall I think
> > the general consensus was to keep time() tick granular (so it aligns
> > with filesystem timestamps) and gettimeofday() hardware granular.
> > Though we also introduced the CLOCK_REALTIME_COARSE to match
> > sub-second filesystem timestamps as well.
> >
> > So yea... I don't think we want to make a change here, but maybe I'm
> > not understanding the underlying issue... so feel free to push back
> > here. :)
>
> Oh that's fine. I mostly wanted the subsystem experts to chime in on if
> the the testcase was valid, etc.
>
> Jan, do you have any other concerns?

No, I don't.

Regards,
Jan

>
> Thanks,
> Nish
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/