Re: [BUG] kernel freezes with latest tree

From: Linus Torvalds
Date: Tue Jan 10 2012 - 11:17:14 EST


[ Added Ingo & co to the cc, so I'm leaving things quoted. ]

On Tue, Jan 10, 2012 at 12:16 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Tuesday, 10 January 2012 at 06:03 +0100, Eric Dumazet wrote:
>> On Tuesday, 10 January 2012 at 05:57 +0100, Eric Dumazet wrote:
>> > Hi Linus
>> >
>> > I got some freezes on two different machines, using latest kernel.
>> >
>> > while :; do hackbench 10 thread 4000; done
>> >
>> > Not sure I'll have time today to find the problem.
>> >
>> > It might be related to "perf top" also being run at least once.
>> >
>>
>> Hmm, I can trigger the bug without ever using "perf".
>
> OK I managed to bisect it, but I have to run now.

Good. The bisection certainly helps.

> $ git bisect log
[ snipped ]
> Could it be a merge error ?
>
> commit 0db49b72bce26341274b74fd968501489a361ae3
> Merge: 35b740e 1ac9bc6
> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Date:   Fri Jan 6 08:33:28 2012 -0800
>
>    Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

That's one of the merges this window that I sent out a query message
for, because I wasn't happy with it, and stopped at that point waiting
for people to validate the end result.

Nobody complained about it, but the first thing to try would be one of
my questions about that merge: can you move the call to

set_cpu_sd_state_idle();

(along with the comment above it) in tick_nohz_idle_enter()
(kernel/time/tick-sched.c) to just *below* the "local_irq_disable()"?
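
To make the requested reordering concrete, here is a minimal user-space
sketch. It is not the kernel source: the function bodies are hypothetical
stand-ins that only record whether interrupts were (notionally) disabled
when set_cpu_sd_state_idle() ran, and the two *_before/*_after function
names are invented for illustration. The rest of tick_nohz_idle_enter()
is elided.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel primitives (assumptions, not real code). */
static bool irqs_off;                 /* models the local interrupt-disable state */
static bool sd_idle_called_irqs_off;  /* state observed by the last call below */

static void local_irq_disable(void) { irqs_off = true; }
static void local_irq_enable(void)  { irqs_off = false; }

static void set_cpu_sd_state_idle(void)
{
	/* Record whether interrupts were disabled at the time of the call. */
	sd_idle_called_irqs_off = irqs_off;
}

/* Ordering implied by the mail as the current one: call sits above the disable. */
static void tick_nohz_idle_enter_before(void)
{
	set_cpu_sd_state_idle();
	local_irq_disable();
	/* ... remainder of the real function elided ... */
	local_irq_enable();
}

/* Suggested experiment: call moved to just below local_irq_disable(). */
static void tick_nohz_idle_enter_after(void)
{
	local_irq_disable();
	set_cpu_sd_state_idle();
	/* ... remainder of the real function elided ... */
	local_irq_enable();
}
```

In the first variant the idle-state update runs with interrupts still
enabled; in the second it runs inside the interrupts-off region. Whether
that difference explains the freeze is exactly what the test is meant to
find out.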

I also questioned the interaction between the sparse cleanups and
usecs_to_cputime64(), but people said they were fine. But your report
will probably make people double-check.

Guys?

Linus