Re: P-MTU discovery

From: jamal (hadi@cyberus.ca)
Date: Fri Apr 21 2000 - 15:23:53 EST


On Fri, 21 Apr 2000, Theodore Y. Ts'o wrote:

> Date: Fri, 21 Apr 2000 11:55:04 -0400 (EDT)
> From: jamal <hadi@cyberus.ca>
>
> I don't know whether telcos are already doing this, but we certainly are in
> Linux. I point the finger to Marc Boucher. He did it!
>
> Ah, I hadn't realized someone had done it already. Is it in ipchains?
>

Both ipchains and netfilter:
http://www.davin.ottawa.on.ca/pppoe/
I know it is also available as a separate package somewhere on netwinder.org;
look for something with "mssfwclamp" in the pppoed packaging.
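
For readers who have not seen the hack: going by its name, and assuming
it works like ordinary TCP MSS clamping, the gateway lowers the MSS option
in forwarded SYN segments so that neither end ever sends a segment too big
for the tunnel. Below is a minimal Python sketch of that idea, not the
actual mssfwclamp code. The function names are hypothetical, and the
20-byte IPv4 and TCP header sizes assume no IP or TCP options on data
segments.

```python
import struct

PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20     # assumption: no IP options
TCP_HEADER = 20      # assumption: no TCP options on data segments


def clamped_mss(link_mtu, tunnel_overhead=PPPOE_OVERHEAD):
    """Largest MSS that still fits through the tunnel (hypothetical helper)."""
    return link_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


def clamp_mss_option(options, max_mss):
    """Return a copy of the TCP option bytes with MSS lowered to max_mss.

    Only the option bytes are rewritten here; a real gateway would also
    have to fix up the TCP checksum after changing them.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(out):
            break                # truncated option, leave the rest alone
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break                # malformed option, leave the rest alone
        if kind == 2 and length == 4:          # Maximum Segment Size
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > max_mss:
                struct.pack_into("!H", out, i + 2, max_mss)
        i += length
    return bytes(out)


# Options from a typical Ethernet SYN: MSS 1460, NOP, NOP, SACK-permitted.
syn_options = bytes([2, 4, 0x05, 0xB4, 1, 1, 4, 2])
print(clamped_mss(1500))                                   # 1452
print(clamp_mss_option(syn_options, clamped_mss(1500)).hex())
```

Since only the SYN is touched, none of the masqueraded boxes needs a
per-box MTU change, which is the point made further down in this thread.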
 

> The reason is very simple: NAT, that good old friend of IPSEC.
> When you have lotsa boxes that you are masquerading for, it is hell to go
> around and start changing their MTU values or doing any sort of per-box
> changes.
>
> Actually, the hack is useful even if you're not doing NAT; any time you
> have a configuration where a gateway box is doing some kind of
> tunnelling (either PPPOE or IP-IP or something else), and you have lots
> of client machines behind the tunnel endpoint, making lots of per-box
> changes is a pain.
>
> Here's the problem. End2end is a great design principle, but it
> fundamentally assumes that the intelligence is at the endpoints, and the
> middle of the network isn't supposed to do anything special/magical.
> But as the internet gets bigger and bigger, trying to change all of the
> endpoints to add security, or to handle paths with long latencies
> efficiently, gets harder and harder.

It gets easier when some big end system boy does it (as in the re-birth
of RSVP). Just a std disclaimer: these are my own personal comments and
have nothing whatsoever to do with my employer.

> And so, it gets easier to make
> changes in the middle of the network. And most of the (to use Rusty's
> phrase) "packet fucking" techniques come from this dilemma: NAT's
> (easier than IPV6), firewalls (easier than doing real end-point
> security), tcp ack spoofing (easier than upgrading Windows TCP stacks to
> make them work correctly over satellite links), etc.

I think protocol layering violations will continue for a long time because
of these kludges, which are the result of fixes for "immediate problems". The
solutions can be deployed faster: put a box in front of all these end
systems and they don't have to know anything about it. And these boxes
stay forever, and then it becomes quite a simple rule: if it ain't broken,
don't fix it. NAT might really delay IPV6, for example. So I think maybe
instead of preaching end2end principles it's best to preach to protocol
authors to be on the lookout for these kinds of hacks. You are not gonna
stop all those "content switching/application routing" startups by
preaching religion. They are out there to make a lot of money.

>
> Having said that, there could be an alternative solution in Linux. The
> PPPOE code could be made, after dropping the packet, to generate ICMP "too
> big" messages back to the masqueraded boxes instead (when
> packet-size > PMTU - shim_header). Hopefully, the win* boxes know what to
> do with these messages. And this will work also for UDP. Marc?
>
> That doesn't help. We're doing this today already; it's required by the
> RFC's, after all. The problem is that the sender of the big packet has
> to receive the ICMP, and if there's something filtering the ICMP
> message, you're stuck.

True. I stand corrected. Forget what I said, Marc.
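
For the record, the check under discussion (packet-size > PMTU -
shim_header) looks roughly like the sketch below. It is an illustration
in Python, not the PPPOE code itself: the function names are hypothetical,
and it assumes a raw IPv4 packet and the RFC 1191 layout for the ICMP
"fragmentation needed" message (type 3, code 4, next-hop MTU in the low
16 bits of the second header word). As pointed out above, this only helps
if the ICMP message actually reaches the sender.

```python
import struct


def inet_checksum(data):
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def frag_needed(ip_packet, path_mtu, shim_header):
    """Return the ICMP message to send back, or None if none is needed.

    None means the packet either fits or has DF clear, in which case the
    gateway can forward or fragment it instead of dropping it.
    """
    limit = path_mtu - shim_header
    ihl = (ip_packet[0] & 0x0F) * 4            # IPv4 header length in bytes
    dont_fragment = bool(ip_packet[6] & 0x40)  # DF bit in the flags field
    if len(ip_packet) <= limit or not dont_fragment:
        return None
    # ICMP quotes the offending IP header plus the first 8 payload bytes.
    quoted = ip_packet[:ihl + 8]
    body = struct.pack("!BBHHH", 3, 4, 0, 0, limit) + quoted
    checksum = inet_checksum(body)
    return body[:2] + struct.pack("!H", checksum) + body[4:]


# A 1500-byte packet with DF set, hitting a PPPoE link (8-byte shim).
header = bytes([0x45, 0, 0x05, 0xDC, 0, 0, 0x40, 0,
                64, 6, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2])
packet = header + bytes(1480)
message = frag_needed(packet, path_mtu=1500, shim_header=8)
print(struct.unpack("!H", message[6:8])[0])    # 1492
```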

cheers,
jamal



