Re: 2.6.25 crash: EIP: [<c02e2f14>] xfrm_output_resume+0x64/0x100 ss:esp 0068:c03a1e5c

From: Marco Berizzi
Date: Tue May 06 2008 - 06:47:23 EST


Marco Berizzi wrote:


> Herbert Xu wrote:
>
> > Marco Berizzi <pupilla@xxxxxxxxxxx> wrote:
> >> Just a few minutes ago, another 2.6.25 crash
> >> with this message:
> >>
> >> EIP: [<c028000a>] netif_rx+0x6a/0xd0 SS:ESP 0068:c039f868
> >>
> >> http://80.204.235.230/4.jpg
> >
> > OK, the xfrm_output_resume trail revealed nothing. Combined
> > with this crash however, it would appear that you've got a problem
> > with live skbs being freed. Unfortunately such problems are
> > difficult to track down.
> >
> > Perhaps you could enable SLAB debugging to see if we can get
> > closer to the culprit?
>
> Yes indeed, I will try next Monday.
>
> > Alternatively you could try a git bisection.
>
> Yes, but it could take forever... :-((
> As I said, I have removed the sfq and htb modules,
> and I would like to wait one week to see if these
> boxes will crash. I have 8 linux boxes running 2.6.25
> with the same .config and only the two with htb/sfq
> qdisc are crashing. Just a few minutes ago I saw
> this message forwarded to netdev by David Miller:
>
> Re: PROBLEM: kernel lockup while changing TC rules
>
> it is talking about htb/sfq. Maybe it is related.
>
> > Unless we have a way of reproducing this there isn't a lot more
> > that we can do I'm afraid.
>
> thanks anyway Herbert.

OK, I can confirm after one week of uptime: the problem
no longer occurs after removing the sch_htb/sch_sfq and
cls_fw modules.
What should I do to help track down this problem?

