Re: oops in __release_sock() [2.0.35]

Andrea Arcangeli (andrea@e-mind.com)
Sat, 3 Oct 1998 19:33:53 +0200 (CEST)


On Fri, 2 Oct 1998, Michael L. Galbraith wrote:

>The test had been running for 6 hrs. at Oops time. I've restarted webtest,
>and it's running fine again.. will see if it repeats.

I spent one hour trying to understand what is causing the Oops.

The problem is that the sock has just been kfree()d and reused by other
code by the time release_sock() runs.

It would be nice if you could reproduce the Oops with this additional
debugging code:

--- linux-2.0.35/net/ipv4/tcp.c~ Thu Jun 18 23:48:22 1998
+++ linux-2.0.35/net/ipv4/tcp.c Sat Oct 3 18:12:38 1998
@@ -1967,6 +1967,9 @@
 {
 	struct sk_buff *skb;

+	struct sock saved_sk = *sk;
+	barrier();
+
 	/*
 	 * We need to grab some memory, and put together a FIN,
 	 * and then put it into the queue to be sent.
@@ -2046,6 +2049,11 @@
 			tcp_reset_msl_timer(sk, TIME_CLOSE, TCP_FIN_TIMEOUT);
 	}

+	if (sk->prot != &tcp_prot)
+		printk("catched __release_sock() bug: state %d, users: %d, "
+		       "dead %d, destroy %d, retransmits %ld\n",
+		       saved_sk.state, saved_sk.users, saved_sk.dead,
+		       saved_sk.destroy, saved_sk.retransmits);
 	sk->dead = 1;
 	release_sock(sk);

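With this patch applied you should see the debugging printk just before
the Oops. It fires when sk->prot no longer points to tcp_prot, i.e. when
the sock memory has been reused, and it prints the fields as they were
on entry to the function (taken from the saved_sk copy).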
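
To make the idea easier to follow outside the kernel, here is a minimal
userspace sketch of the same snapshot-and-check trick. It is only an
illustration under assumptions: the struct sock and tcp_prot below are
cut-down stand-ins, not the kernel definitions, and check_sock() and
run_demo() are hypothetical helper names that do not exist in the
2.0.35 tree. The "reuse" is simulated by overwriting the fields by hand.

```c
#include <stdio.h>

/* Cut-down stand-ins for the kernel types (assumption, not the real layout). */
struct proto {
	const char *name;
};

struct sock {
	struct proto *prot;
	int state;
	int users;
	int dead;
};

static struct proto tcp_prot = { "TCP" };

/*
 * Hypothetical helper: returns 1 if sk no longer looks like a TCP sock.
 * The report uses the snapshot taken earlier, because the live fields
 * are garbage once the memory has been reused.
 */
static int check_sock(const struct sock *sk, const struct sock *saved)
{
	if (sk->prot != &tcp_prot) {
		printf("caught reuse: state %d, users %d, dead %d\n",
		       saved->state, saved->users, saved->dead);
		return 1;
	}
	return 0;
}

/* Hypothetical driver: returns 1 if only the clobbered sock is flagged. */
static int run_demo(void)
{
	struct sock sk = { &tcp_prot, 7, 1, 0 };
	struct sock saved_sk = sk;	/* snapshot on entry, like the patch */
	int before, after;

	before = check_sock(&sk, &saved_sk);	/* still intact */

	/* Simulate other code reusing the memory. */
	sk.prot = NULL;
	sk.state = -1;
	sk.users = -1;

	after = check_sock(&sk, &saved_sk);	/* now detected */

	return before == 0 && after == 1;
}
```

Calling run_demo() from a main() prints one report, and the values in it
come from the snapshot, not from the clobbered struct. That is the
reason for copying *sk at the top of the function in the patch: by the
time the check trips, the original contents are already gone.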

Andrea[s] Arcangeli
