Re: 2.6.32.21 - uptime related crashes?

From: Nicolas Carlier
Date: Sat May 14 2011 - 19:13:53 EST


Hi,

On Sat, May 14, 2011 at 10:45 PM, Willy Tarreau <w@xxxxxx> wrote:
> Hi,
>
> On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote:
>> Hello gentlemen,
>> Nicolas, thanks for the further report; it contradicts my theory that the problem appeared somewhere around 2.6.32.16.
>
> Well, I'd like to be sure what kernel we're talking about. Nicolas said
> "2.6.32.8 Debian Kernel", but I suspect it's "2.6.32-8something" instead.
> Nicolas, could you please report the exact version as indicated by "uname -a" ?


Sorry, I can't provide more information on this version because I
don't use it anymore. I can only correct myself: it was not a
2.6.32.8 kernel but a 2.6.32.7 Debian backport kernel, which had been
recompiled.

Because of this problem I took the opportunity to move to a 2.6.32.26
kernel. However, as there was nothing in the changelog or in bugzilla
about a resolution of this issue, we applied the patch found in the
bugzilla entry that reported this problem:

https://bugzilla.kernel.org/show_bug.cgi?id=16991#c17

>
>> Now I think I know why several of my other machines, which have been running 2.6.32.x for a long time, didn't crash:
>>
>> I checked bugzilla entry for (I believe the same) problem here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=16991
>> and Peter Zijlstra asked there whether the reporters' systems were running any RT tasks. Then I realised that all four of my crashed boxes were pacemaker/corosync clusters, and pacemaker uses lots of RT priority tasks. So I believe this is important, and it might be the reason why the other machines seem to be running rock solid: they are not running any RT tasks.
>> It might also help with hunting this bug. Are any of you also running RT priority tasks on the affected systems, or did the problem also occur without them?
>
> No, our customer who had two of these boxes crash at the same time was
> not running any RT task to the best of my knowledge.
>
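
For anyone who wants to check whether an affected box is running RT
priority tasks, here is a minimal sketch. It is not from this thread or
from the bugzilla entry; the helper names parse_stat and list_rt_tasks
are made up for illustration. It assumes the /proc/<pid>/stat layout
described in proc(5), where field 40 is rt_priority and field 41 is the
scheduling policy (1 = SCHED_FIFO, 2 = SCHED_RR).

```python
import glob

# Scheduling policy numbers from <sched.h>: SCHED_FIFO = 1, SCHED_RR = 2.
RT_POLICIES = {1: "SCHED_FIFO", 2: "SCHED_RR"}

def parse_stat(text):
    """Return (pid, comm, rt_priority, policy) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition(" (")
    rest = tail.split()  # rest[0] is field 3 (state), so rest[i] is field i + 3
    # Field 40 is rt_priority, field 41 is policy (per proc(5)).
    return int(pid_str), comm, int(rest[37]), int(rest[38])

def list_rt_tasks():
    """Collect every thread currently scheduled under SCHED_FIFO or SCHED_RR."""
    found = []
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                pid, comm, prio, policy = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # thread exited during the scan, or unexpected format
        if policy in RT_POLICIES:
            found.append((pid, comm, prio, RT_POLICIES[policy]))
    return found

if __name__ == "__main__":
    for pid, comm, prio, policy in list_rt_tasks():
        print(f"{pid:>7} {policy:<10} prio={prio:<3} {comm}")
```

An empty result only means no RT tasks exist at the moment the script
runs; it says nothing about tasks that were running earlier in the
machine's uptime.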

Regards,

--
Nicolas Carlier