Re: [PATCH 09/10] s390/cputime: delayed accounting of system time

From: Frederic Weisbecker
Date: Tue Dec 13 2016 - 21:38:53 EST


On Tue, Dec 13, 2016 at 02:21:21PM +0100, Martin Schwidefsky wrote:
> On Tue, 13 Dec 2016 12:13:22 +0100
> Martin Schwidefsky <schwidefsky@xxxxxxxxxx> wrote:
>
> > On Mon, 12 Dec 2016 16:02:30 +0100
> > Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> >
> > > On Mon, Dec 12, 2016 at 11:27:54AM +0100, Martin Schwidefsky wrote:
> > > > 3) The call to vtime_flush in account_process_tick is done in irq context from
> > > > update_process_times. hardirq_offset==1 is also correct.
> > >
> > > Let's see this for example:
> > >
> > > + if ((tsk->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0))
> > > + S390_lowcore.guest_timer += timer;
> > >
> > > If the tick is interrupting guest, we have accounted the guest time on tick IRQ entry.
> > > Now we are in the middle of the tick interrupt and since hardirq_offset is 1, we
> > > are taking the above path by accounting half of the tick-IRQ time as guest, which is wrong,
> > > it's actually IRQ time.
> >
> > Hmm, you got me there. The system time from irq_enter until account_process_tick
> > is reached is indeed IRQ time. It is not much but it is incorrect. The best fix
> > would be to rip out the accounting of the system time from account_process_tick
> > as irq_enter / irq_exit will do system time accounting anyway. To do that
> > do_account_vtime needs to be split, because for the task switch we need to
> > account the system time of the previous task.
>
> New patch for the delayed cputime account. I can not just rip out system time
> accounting from account_process_tick after all, I need a sync point for the
> steal time calculation. It basically is the same patch as before but with a new
> helper update_tsk_timer, the removal of hardirq_offset and a simplification
> for do_account_vtime: the last accounting delta is either hardirq time for
> the tick or system time for the task switch.
>
> Keeping my finger crossed..

The patch looks good. But you might want to remove the hardirq_offset in a
separate patch to queue for this merge window (I'm not sure if it needs a
stable tag, the argument may be there since the beginning).

Because the rest depends on the series that is unlikely to be queued in this
merge window at this stage.

Thanks!