Re: cpusets: BUG: cpuset_excl_nodes_overlap() may sleep under tasklist_lock

From: Darrick J. Wong
Date: Mon Jan 09 2006 - 17:41:47 EST


Hi Mr. Jackson,

I've been stress testing 2.6.15 on an IntelliStation Z20 in our lab, and
have seen the same complaints that were described earlier in this thread
(sleeping in the cpuset code under OOM conditions):

Debug: sleeping function called from invalid context at
include/asm/semaphore.h:105
in_atomic():1, irqs_disabled():0

Call Trace:<ffffffff80130640>{__might_sleep+179}
<ffffffff8015bb68>{cpuset_excl_nodes_overlap+31}
<ffffffff80164d9f>{out_of_memory+123}
<ffffffff801676c4>{__alloc_pages+564}
<ffffffff8017ec41>{alloc_pages_current+160}
<ffffffff8016873f>{__do_page_cache_readahead+197}
<ffffffff8010e5c3>{__switch_to+50}
<ffffffff8031c547>{_spin_unlock_irqrestore+52}
<ffffffff80166604>{free_pages_bulk+674}
<ffffffff80168a00>{do_page_cache_readahead+82}
<ffffffff8016355a>{filemap_nopage+332}
<ffffffff8017378f>{__handle_mm_fault+1162}
<ffffffff8031c547>{_spin_unlock_irqrestore+52}
<ffffffff8031df02>{do_page_fault+1096}
<ffffffff801991ee>{sys_select+839} <ffffffff8011078d>{error_exit+0}

Since we seem to be able to reproduce this with some regularity, let me
know if you'd like us to test out any cpuset patch(es).

--Darrick

> Kirill Korotaev wrote:
>> FYI, there is an obvious bug in cpusets in 2.6.15-rcX:
>> cpuset_excl_nodes_overlap() may sleep (as it takes semaphore), but is
>> called from atomic context - select_bad_process() under tasklist_lock.
>> BUG. Found by Denis Lunev.
>
> Sorry for not responding sooner - I was off the air for a week.
>
> Thanks for finding and reporting this.
>
> Apparently, from KUROSAWA Takahiro's report, this bug was also in
> 2.6.14. My initial reading of the code in 2.6.14 and 2.6.15-* agrees,
> and finds that this bug was present since the cpuset_excl_nodes_overlap
> call was added, Sept 8, 2005 (in Linus's tree.)
>
>
>> the same actually applies to cpuset_zone_allowed() which is called e.g.
>> from __alloc_pages()->get_page_from_freelist() and doesn't check for
>> GPF_NOATOMIC anyhow...
>
> I don't think so. Please read the comments in kernel/cpuset.c above
> the routine cpuset_zone_allowed(). Either that routine is called with
> the __GFP_HARDWALL flag set, so returns before it gets to the semaphore
> call, or it is not called at all, due to the check for ATOMIC (!wait)
> in mm/page_alloc.c.
>
> I don't see any bugs like this, in the cpuset_zone_allowed code path.
>
>
> ==> My initial analysis - I have one bug, in the oom_kill path,
> where the code takes callback_sem while holding tasklist_ lock,
> that has been in the main line kernel since 2.6.14.
>
> My first guess is that it will take me about a week, with testing and
> other priorities (including a few more days vacation), to respond with a
> patch. Speak up if that doesn't meet your needs.

Attachment: signature.asc
Description: OpenPGP digital signature