Re: arp, kernel 2.2.15 and 2.3.99-pre6

From: Julian Anastasov (uli@linux.tu-varna.acad.bg)
Date: Mon May 08 2000 - 17:19:36 EST


        Hello,

On Mon, 8 May 2000, Andrey Savochkin wrote:

> Hello,
>
> On Sun, May 07, 2000 at 02:54:24PM +0300, Julian Anastasov wrote:
> > If someone explain me how the arpfilter will solve
> > the following problem I will be happy:
> >
> >        ROUTER
> >      192.168.0.1
> >           |LAN
> > +---------+-------------+
> > |192.168.0.2            |192.168.0.3
> > Director                Real server
> >
> > VIP=192.168.0.4
> >
> > ROUTER
> > eth0: 192.168.0.1/24
> > eth1: link to the external clients
> >
> > Director
> > eth0: 192.168.0.2/24
> > eth0:0 192.168.0.4/32
> > default gw: 192.168.0.1 through eth0 (ROUTER)
> >
> > Real server:
> > eth0: 192.168.0.3/24
> > dummy0: 192.168.0.4/32 (must not be advertised, hidden=1)
> > default gw: 192.168.0.1 through eth0 (ROUTER)
> >
> > The traffic:
> >
> > Client: 10.0.0.1:1024 -> 192.168.0.4:80, received in the router
> >
> > Router: who-has 192.168.0.4 tell 192.168.0.1
> >
> > Director: 192.168.0.4 is at my eth0 MAC
> > #Real server must not reply to the router's broadcast
> > #because 192.168.0.4 is a shared address. This can't
> > #be achieved from arpfilter because the route to
> > #192.168.0.1 (the ROUTER) is never changed. The only
> > #difference is the device where the VIP is configured.
> > #If the real server replies here we have a mess -
> > #the Router starts talking to one of the real server!
>
> Andi has explained how to do it.
> You may implement it as something like

        Actually, the transparent proxy behavior is
preferred: the (hidden) addresses should not be local
addresses, but TCP/UDP/ICMP packets sent to them should be
accepted locally. This way they are not announced in ARP
requests, but we can still use them to accept connections.

        I must say that my example setups work fully when
the real servers use the transparent proxy support, without
the hidden feature. But it is slower and ugly - we have many
connections. We need something similar but faster.

> ip rule add pref 100 from 192.168.0.4 table 100
> ip route add table 100 blackhole 192.168.0.1/32
>
> You are blocking all packets with source 192.168.0.4 and destination
> 192.168.0.1, but that's not a big deal.

        Wonderful! But the working variant is:

# Block ARP probes (any traffic) from the local net
ip rule add prio 100 from 192.168.0/24 iif eth0 table 100
ip route add table 100 blackhole 192.168.0.4

# Now deliver locally the traffic from non-LAN clients
ip rule add prio 101 from 0/0 iif eth0 table 101
ip route add table 101 local 192.168.0.4 dev eth0

        We can play with the fwmark too.

        The drawback is that we can't support clients on the
LAN. And I have to block every possible logical network with
its own rule pointing to table 100 :( But this is not a
problem.
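
A minimal sketch of how the two rules above are meant to be walked, in
case it helps to see the drawback concretely. This is a simplified model
and not kernel code: the names (struct rule, classify, IP4) are
hypothetical, only rules 100 and 101 are modelled, each table holds a
single /32 route, and the local and main tables are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of the two policy rules; not kernel code. */
enum verdict { V_NO_MATCH, V_BLACKHOLE, V_LOCAL };

/* Build an IPv4 address in host byte order. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct rule {
    int prio;                 /* lower value is consulted first */
    uint32_t src_net;         /* "from" selector: network */
    uint32_t src_mask;        /* "from" selector: mask */
    const char *iif;          /* incoming interface selector */
    uint32_t dst;             /* the single /32 route in the rule's table */
    enum verdict action;      /* type of that route */
};

/* Rules listed in priority order, mirroring the ip commands above. */
static const struct rule rules[] = {
    { 100, IP4(192, 168, 0, 0), 0xFFFFFF00u, "eth0",
      IP4(192, 168, 0, 4), V_BLACKHOLE },
    { 101, 0, 0, "eth0",
      IP4(192, 168, 0, 4), V_LOCAL },
};

/*
 * Walk the rules in order. A rule whose selector matches sends the
 * lookup to its table; if that table has no route for dst, the walk
 * continues with the next rule.
 */
enum verdict classify(uint32_t src, uint32_t dst, const char *iif)
{
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        const struct rule *r = &rules[i];

        if ((src & r->src_mask) != r->src_net)
            continue;
        if (strcmp(iif, r->iif) != 0)
            continue;
        if (dst == r->dst)
            return r->action;
    }
    return V_NO_MATCH;
}
```

In this model the router's probe (192.168.0.1 -> 192.168.0.4) and any
other LAN host both end in the blackhole, while 10.0.0.1 -> 192.168.0.4
is delivered locally. That is exactly why LAN clients are lost.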

        There is no problem in arp_solicit with the above ip
commands. We just fall back to inet_select_addr for our
probes because 192.168.0.1->192.168.0.4 points to the
blackhole and fib_lookup returns -EINVAL.

        But your change in arp_solicit breaks everything
again! If you return 0 after a failed fib_lookup, the
ARP requests announce the saddr from the skb, and in our
case this is 192.168.0.4.
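
To make the two outcomes concrete, here is a small sketch of the source
address choice. It is not the arp_solicit code: struct probe,
choose_arp_saddr and the failed_lookup_means_usable switch are
hypothetical names, and I am assuming only what is described above, that
a 0 status after a failed lookup lets the skb source be announced.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical inputs to the decision; not kernel structures. */
struct probe {
    bool have_skb;            /* a packet triggered this ARP request */
    uint32_t skb_saddr;       /* source address of that packet */
    int lookup_err;           /* 0, or a negative errno from the lookup */
    bool lookup_is_local;     /* lookup succeeded with a local result */
    uint32_t fallback_addr;   /* what the fallback selection would pick */
};

/*
 * Pick the address announced in the ARP request.
 * failed_lookup_means_usable models the disputed behaviour: whether a
 * failed lookup still lets the skb source address be used.
 */
uint32_t choose_arp_saddr(const struct probe *p,
                          bool failed_lookup_means_usable)
{
    bool usable = false;

    if (p->have_skb) {
        if (p->lookup_err != 0)
            usable = failed_lookup_means_usable;
        else
            usable = p->lookup_is_local;
    }
    return usable ? p->skb_saddr : p->fallback_addr;
}
```

With skb_saddr = 192.168.0.4, lookup_err = -EINVAL and fallback_addr =
192.168.0.3, the function returns 192.168.0.3 when the failed lookup is
treated as "not usable" and 192.168.0.4 otherwise. The second case is
the one that announces the shared address.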

        Please explain what the correct return value of
fib_local_source is when fib_lookup fails.

> And with such routing table arpfilter patch will block all ARP responses
> about 192.168.0.4 address for requests from 192.168.0.1.
>
> As I have said, the only clear alternative solution is to introduce special
> flags for IP addresses (something like "do not announce me in ARP"),

        If fib_local_source returns 0 after the failed
fib_lookup, we have to introduce this feature
(default_arp_src). If fib_local_source is wrong, we don't
need such a feature. The rule 192.168.0.1->192.168.0.4 is not
RTN_LOCAL because it points to the blackhole.

> or to create special ARP response table to select what to answer for
> particular requests and whether answer at all.
>
> > On Sun, 7 May 2000, Andrey Savochkin wrote:
> > > Sorry, I don't see the point of any modifications for inet_select_addr.
> > > IN_DEV_HIDDEN is an absolute alien here!
> >
> > You can ask Alexey for this change. But OK, I will
> > answer you. It is there just to avoid ARP requests "who-has
> > THIS_HIDDEN_IP tell ME" in the future. If you autoselect
> > shared IP address to communicate with other host on the LAN
> > after your first outgoing packet you will see broadcast ARP
> > request "who-has THIS_HIDDEN_IP tell remote_end". In my
> > example when the ARP replies are filtered this request will
> > be answered from the Director only (where they are not
> > filtered). So, as a result the real server (where the IP
> > address is hidden because it is shared) _MUST_ not
> > autoselect such shared addresses to communicate with the
> > other hosts on the LAN. The other end will not talk with
>
> Fine. So do not CONFIGURE your system to do it.
> It's all up to you to do right things!
> Your explanations are not an excuse for a bad code in the kernel.

        OK
 
> [snip]
> > I agree. This is one big hack. But I don't know
> > easier way to support shared addresses. May be other people
> > have more ideas? If this feature raises problems the user
> > can decide not use it. I don't see problem here.
>
> The problem is that it adds ugly checks in random places of the kernel,
> messes the clear notions of address, device, route, and violates layering
> structure. And it's not a pure theoretic criticism. Your solution cannot
> coexist with shared media (multiple IP network over a single link).

        OK
 
> [snip]
> > Something near. What about:
> >
> > -       if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
> > +       if (skb && (in_dev != NULL && !IN_DEV_DEFAULT_ARP_SRC(in_dev)) &&
> > +           inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
> >                 saddr = skb->nh.iph->saddr;
> >         else
> >                 saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
> > +       if (in_dev) in_dev_put(in_dev);
>
> Fine for me.
>
> >
> > The code can be optimized. Can we replace
> > IN_DEV_DEFAULT_ARP_SRC with IN_DEV_ARPFILTER?
>
> No, we cannot. IN_DEV_DEFAULT_ARP_SRC and IN_DEV_ARPFILTER are two different
> features. Why are you always trying to mix things up? :-)

        Well, I'm trying to introduce some ideas by giving
examples.
 
> > But don't kill the alien in inet_select_addr because
> > if some protocol selects address with inet_select_addr it is
> > possible blocked local IP addresses to be selected. This is
>
> It may happen as a result of a broken configuration.

        But it can happen when the configuration is changed.
OK, maybe it is not fatal (for me).

> We don't really need a bandaid in the kernel for such situations.
> That's my routing table:
>
> 203.120.9.96/27 dev eth0 scope link src 203.120.9.98
> default via 203.120.9.97 dev eth0 src 203.120.9.98 advmss 1326
> local 203.120.9.98 dev eth0 table local scope host src 203.120.9.98
> local 127.0.0.0/8 dev lo table local scope host src 127.0.0.1
>
> You may see that I don't have any route without pref_src set.
> So inet_select_addr() isn't used for source IP selection at all.

        OK, I can't give examples here. Maybe Alexey can.

        Anyway, if the above ip rules are correct I have to
investigate the possible problems with this configuration.
For now I don't see big problems except some restrictions
and difficulties with the configuration. This way we lose
the ability to have clients on the LAN. Now an LVS
cluster can't be built from hosts on the LAN only. For
example, we can't balance .cgi clients on the LAN across a
cluster of database servers. Very bad!!! It seems that such
a good protocol as ARP can't be used and we have to use
static ARP entries. Very bad!

        But Alexey will be very happy without the hidden
feature in 2.3+ :)

Regards

--
Julian Anastasov <uli@linux.tu-varna.acad.bg>



