bridging fix? Which (fwd)

Peter T. Breuer (ptb@it.uc3m.es)
Thu, 22 Oct 1998 05:44:55 +0200 (MET DST)


[Cc: to linux-kernel, in case some other network guru knows and Alan
is busy]

Hello Alan

I know you finally found and fixed the bridging code leak somewhere in
the recent 2.0.36pre series or just before. But I haven't been able to
figure out what the fix was, by inspection. I would be deeply grateful
if you could tell me what the line or lines were. I think some
intrepid soul managed to find the line that did the damage using the
memleak patches.

I saw the leak first on a server I had at 2.0.33. P100 with 3c905
and buslogic fast and wide. It went down in a week serving NFS. You
steered me to the cause and workaround.

I disabled bridging on the kernel (it only had one card) and it became
stable as a rock.

At the same time - several months ago now - I took the kernel and put it
in the server next door to it, P200 with a 3c900 and adaptec fast and
narrow. That has been stable as anything. Same binary kernel. Not
much NFS load.

Now I (by mistake) took the same binary kernel and put it in a PP200
serving heavy NFS through a single 3c905 on a 100BT net with adaptec
fast and wide scsi. That went down in 24 hours with all its 128M
memory used up - no user space usage to speak of. It was running
mrouted and an mbone tunnel when it died.

It was clearly "network buffer" leakage. But I tried _enabling_ its
dormant bridging code near the end, and it went down in 20 minutes.
When it came back up fresh I tried the enable again, and it went down
again in about 10 mins while I watched - with lots of network pauses.
It's now looking stable on a recompiled kernel without bridging, same
configuration otherwise.

I would love to have just the fix for this as a patch. I know that's
too much to ask, so I'm asking just to be clued in on the eureka that
solved this.

Thank you if you can manage that ...

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/