Re: [PATCH 0/3] minor cleanups to EFLAGS initialisation inret_from_fork

From: Ian Campbell
Date: Wed Aug 10 2011 - 11:27:36 EST


On Tue, 2011-07-26 at 15:46 +0100, Peter Zijlstra wrote:
> On Tue, 2011-07-26 at 01:47 +0400, Cyrill Gorcunov wrote:
> > > > schedule (sched.c)
> > > > ...
> > > > raw_spin_lock_irq
> > > > ...
> > > > context_switch
> > > > switch_to
> > > > "jnz ret_from_fork\n\t"
> > > > pushq_cfi kernel_eflags(%rip)
> > > > popfq_cfi # reset kernel eflags
> > > >
> > > > ---> irqs are still disabled
> > > >
> > > > call schedule_tail # rdi: 'prev' task parameter
> > > > finish_lock_switch
> > > > raw_spin_unlock_irq
> > > >
> > > > I bet raw_spin_lock_irq at the beginning of the schedule() is set
> > > > for a reason and such change is not safe. Though I may be missing
> > > > something again...
> > > >
> > >
> > > This definitely doesn't look "obviously safe" to me. However, does
> > > anyone see a problem with unconditionally leaving IF disabled even on 32
> > > bits (I haven't traced all the paths yet), i.e. doing the *opposite* of
> > > Ian's patch #2?
>
> Right, enabling IRQs there isn't cool, currently there's still
> __ARCH_WANT_INTERRUPTS_ON_CTXSW but we're working hard on getting rid of
> that nightmare.
>
> There's a number of very subtle things that can go wrong when you enable
> interrupts over the context switch.
>
> Leaving IRQs disabled should be the right thing, on x86 we should
> _never_ have interrupts enabled there.

Just getting back to this after my vac, sorry for the delay.

raw_spin_unlock_irq unconditionally re-enables interrupts so I don't
really see what I've changed since interrupts are enabled by
schedule_tail and I've moved (on 64 bit) the EFLAGS reset to after
schedule_tail, so it should have interrupts enabled at that point
already and so they should remain enabled. Or are you suggesting that
things were already wrong?

However I've switched the order of my second patch anyway, so EFLAGS is
reset to 0x0002 (interrupts disabled) on both 32- and 64-bit before the
call to schedule_tail, since it does seem like the simpler option. I'll
repost shortly.

Ian.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/