Re: [PATCH 0/3] minor cleanups to EFLAGS initialisation in ret_from_fork

From: Cyrill Gorcunov
Date: Mon Jul 25 2011 - 17:47:46 EST


On Mon, Jul 25, 2011 at 02:10:02PM -0700, H. Peter Anvin wrote:
> On 07/25/2011 11:20 AM, Cyrill Gorcunov wrote:
> > On Mon, Jul 25, 2011 at 02:19:02PM +0400, Cyrill Gorcunov wrote:
> >> On Mon, Jul 25, 2011 at 10:58:03AM +0100, Ian Campbell wrote:
> >>> The following series removes the use of a global kernel_eflags variable
> >>> from the x86_64 ret_from_fork path and (very slightly) merges the 32 and
> >>> 64 bit version of that code path.
> >>>
> >>> kernel_eflags could be made a __read_mostly but actually there is no
> >>> reason to prefer the value at cpu_init() time to a compile time constant
> >>> value for the initial eflags after a fork.
> >>>
> >>> Ian.
> >>>
> >>
> >> Thanks, Ian! I think no one is against this simplification. Peter, Andi?
> >>
> >> Cyrill
> >
> > Ian, I missed at first that you open an IRQ window _before_ the
> > schedule_tail() call, i.e. it's not a 1:1 mapping of the old code.
> >
> > Note that kernel_eflags has IF clear, and here is what we have:
> > ret_from_fork on x86-64 is reached _only_ from inside the
> > context_switch() call, i.e.
> >
> > schedule (sched.c)
> >   ...
> >   raw_spin_lock_irq
> >   ...
> >   context_switch
> >     switch_to
> >       "jnz ret_from_fork\n\t"
> >         pushq_cfi kernel_eflags(%rip)
> >         popfq_cfi # reset kernel eflags
> >
> >         ---> irqs are still disabled
> >
> >         call schedule_tail # rdi: 'prev' task parameter
> >           finish_lock_switch
> >             raw_spin_unlock_irq
> >
> > I bet the raw_spin_lock_irq at the beginning of schedule() is there
> > for a reason, so such a change is not safe. Though I may be missing
> > something again...
> >
>
> This definitely doesn't look "obviously safe" to me. However, does
> anyone see a problem with unconditionally leaving IF disabled even on 32
> bits (I haven't traced all the paths yet), i.e. doing the *opposite* of
> Ian's patch #2?
>
> -hpa
>

On x86-32 it seems to be similar (though not identical in the calls):

	copy_thread()
		p->thread.ip = (unsigned long)ret_from_fork;

and the task gets queued into the task queue, but later, when switch_to
happens, irqs are still blocked at the point ret_from_fork is entered.
I'd better poke PeterZ here /CC'ed/ ;)
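
To make the concern concrete, here is a toy userspace model of the
sequence above. It is only a sketch, not kernel code: struct cpu_model
and the model_*() helpers are hypothetical names made up for the
illustration, and the only facts it relies on are that IF is bit 9 of
EFLAGS (0x200) and that bit 1 is always set. It tracks whether IF ever
becomes set while the lock taken by raw_spin_lock_irq is still held.

```c
#include <stdbool.h>

#define EFLAGS_FIXED 0x00000002UL /* bit 1: always reads as set */
#define EFLAGS_IF    0x00000200UL /* bit 9: interrupt enable flag */

/* Hypothetical model state; none of this exists in the kernel. */
struct cpu_model {
	unsigned long eflags;
	bool lock_held;            /* stands in for the lock taken in schedule() */
	bool irq_window_with_lock; /* latched if IF is set while the lock is held */
};

static void model_check(struct cpu_model *c)
{
	if (c->lock_held && (c->eflags & EFLAGS_IF))
		c->irq_window_with_lock = true;
}

/* Models raw_spin_lock_irq: disable irqs, then take the lock. */
static void model_lock_irq(struct cpu_model *c)
{
	c->eflags &= ~EFLAGS_IF;
	c->lock_held = true;
	model_check(c);
}

/* Models raw_spin_unlock_irq: drop the lock, then enable irqs. */
static void model_unlock_irq(struct cpu_model *c)
{
	c->lock_held = false;
	c->eflags |= EFLAGS_IF;
	model_check(c);
}

/*
 * Walks schedule() -> ret_from_fork -> schedule_tail() with the given
 * initial EFLAGS value loaded in ret_from_fork (the pushq/popfq pair).
 * Returns true if irqs stayed off for as long as the lock was held.
 */
static bool model_fork_return(unsigned long initial_eflags)
{
	struct cpu_model c = { .eflags = EFLAGS_FIXED | EFLAGS_IF };

	model_lock_irq(&c);        /* schedule(): raw_spin_lock_irq */
	c.eflags = initial_eflags; /* ret_from_fork: reset eflags */
	model_check(&c);
	model_unlock_irq(&c);      /* schedule_tail -> finish_lock_switch */

	return !c.irq_window_with_lock;
}
```

In this model, model_fork_return(EFLAGS_FIXED) reports no window, while
model_fork_return(EFLAGS_FIXED | EFLAGS_IF) reports that irqs were
enabled with the lock still held, which is the case I'm worried about.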

Cyrill