Re: switch_mm() can fail to load ldt on SMP

From: Linus Torvalds (torvalds@transmeta.com)
Date: Tue Jul 24 2001 - 19:37:34 EST


In article <200107242241.PAA05423@crg8.sequent.com> you write:
>
>We've run into a small bug in switch_mm() which results in a process
>running with a 'stale' ldt.

Good job.

>The first fix would be to patch switch_mm() so that, when the next and
>prev mm pointers are equal, it checks whether mm->context.segments
>is non-null and, if so, calls load_LDT(). This will unfortunately lead
>to many unnecessary calls to load_LDT(). An enhanced version of this
>fix would involve introducing a bit array into the mm_struct, one
>bit per cpu. When write_ldt() first allocates the ldt for this mm_struct,
>it would set all bits. Subsequently, in switch_mm(), we could
>introduce a test such as
>	if (next->context.segments &&
>	    test_and_clear_bit(cpu, &next->ldtupdate))
>		load_LDT(next);

This is actually how the "mmu_context" struct is meant to be used: it's
there exactly for per-architecture context bits, and when I did the x86
part I incorrectly thought that the x86 doesn't have any MMU context.
You're obviously right that it has context, and part of the context is
just the list of CPUs that have seen the new LDT.

>Which fix is better depends on the system and application. On a system
>with hundreds of processes sharing the same mm_struct, the first fix
>will result in quite a few calls to load_LDT(). On a system with a large
>number of cpus and short-lived programs using segments, the IPI will
>be wasteful.

Done right, you should have
 - processes with a NULL segment have mm->context.ldtinvalid = 0
 - write_ldt() sets all bits in "mm->context.ldtinvalid"
 - switch_mm() does (in the CONFIG_SMP "else {" part)
        if (test_and_clear_bit(cpu, &next->context.ldtinvalid))
                load_LDT(next);

which has _no_ extra LDT loads except when required, and only adds one
bit test-and-clear for the one case that needs it (SMP with same mm's).
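
To make the bookkeeping concrete, here is a rough user-space sketch of
the scheme above. It is an illustration only, not kernel code, and it
rests on several assumptions: NR_CPUS is a small made-up constant,
load_LDT() is a stub that just counts calls (and takes an extra cpu
argument that the real one does not), test_and_clear_bit() is a plain
non-atomic stand-in for the atomic kernel primitive, and set_ldt() and
switch_mm_same() are hypothetical names for "the part of write_ldt()
that marks every CPU stale" and "the prev == next SMP branch of
switch_mm()".

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumption: small fixed CPU count for the demo */

struct mm_context {
	void *segments;			/* non-NULL once an LDT exists */
	unsigned long ldtinvalid;	/* bit n set: CPU n must reload */
};

struct mm {
	struct mm_context context;
};

/* Counts simulated LDT loads per CPU so the effect can be observed. */
static int ldt_loads[NR_CPUS];

/* Stub for load_LDT(); the cpu argument exists only for counting. */
static void load_LDT(struct mm *mm, int cpu)
{
	(void)mm;
	ldt_loads[cpu]++;
}

/* Non-atomic stand-in; the kernel's test_and_clear_bit() is atomic. */
static int test_and_clear_bit(int nr, unsigned long *addr)
{
	unsigned long mask = 1UL << nr;
	int old = (*addr & mask) != 0;

	*addr &= ~mask;
	return old;
}

/* Hypothetical helper: install an LDT and mark every CPU as stale. */
static void set_ldt(struct mm *mm, void *segments)
{
	assert(segments != NULL);
	mm->context.segments = segments;
	mm->context.ldtinvalid = (1UL << NR_CPUS) - 1;
}

/* Hypothetical helper: only the "same mm" SMP branch of switch_mm(). */
static void switch_mm_same(struct mm *next, int cpu)
{
	if (test_and_clear_bit(cpu, &next->context.ldtinvalid))
		load_LDT(next, cpu);
}
```

With an mm whose ldtinvalid starts at 0, switch_mm_same() never loads
anything. After set_ldt(), the first switch on each CPU loads the LDT
once and clears that CPU's bit, and later switches on that CPU are free
until the LDT changes again.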

Would you like to code this up, test it and send it to me?

Btw, good debugging!

                Linus "lazy is my middle name" Torvalds