Re: [PATCH v4 bpf 1/2] bpf: fix skb_do_redirect return values

From: Dan Carpenter
Date: Wed Jul 26 2023 - 12:53:43 EST


On Wed, Jul 26, 2023 at 09:10:20AM -0700, Yan Zhai wrote:
> On Wed, Jul 26, 2023 at 06:01:00PM +0300, Dan Carpenter wrote:
> > On Wed, Jul 26, 2023 at 07:14:56AM -0700, Yan Zhai wrote:
> > > On Wed, Jul 26, 2023 at 04:39:08PM +0300, Dan Carpenter wrote:
> > > > I'm not positive I understand the code in ip_finish_output2(). I think
> > > > instead of looking for LWTUNNEL_XMIT_DONE it should instead look for
> > > > != LWTUNNEL_XMIT_CONTINUE. It's unfortunate that NET_XMIT_DROP and
> > > > LWTUNNEL_XMIT_CONTINUE are the both 0x1. Why don't we just change that
> > > > instead?
> > > >
> > > I considered about changing lwt side logic. But it would bring larger
> > > impact since there are multiple types of encaps on this hook, not just
> > > bpf redirect. Changing bpf return values is a minimum change on the
> > > other hand. In addition, returning value of NET_RX_DROP and
> > > NET_XMIT_CN are the same, so if we don't do something in bpf redirect,
> > > there is no way to distinguish them later: the former is considered as
> > > an error, while "CN" is considered as non-error.
> >
> > Uh, NET_RX/XMIT_DROP values are 1. NET_XMIT_CN is 2.
> >
> > I'm not an expert but I think what happens is that we treat NET_XMIT_CN
> > as success so that it takes a while for the resend to happen.
> > Eventually the TCP layer will detect it as a dropped packet.
> >
> My eyes slipped lines. CN is 2. But the fact RX return value can be
> returned on a TX path still makes me feel unclean. Odds are low that
> we will have new statuses in future, it is a risk. I'd hope to contain
> these values only inside BPF redirect code as they are the reason why
> such rx values can show up there. Meanwhile, your argument do make
> good sense to me that the same problem may occur for other stuff. It
> is true. In fact, I just re-examined BPF-REROUTE path, it has the
> exact same issue by directly sending dst_output value back.
>
> So I would propose to do two things:
> 1. still convert BPF redirect ingress code to contain the propagation
> of mixed return. Return only TX side value instead, which is also what
> majority of those local senders are expecting. (I was wrong about
> positive values returned to sendmsg below btw, they are not).
>
> 2. change LWTUNNEL_XMIT_CONTINUE and check for this at xmit hook.
>

Sounds good!

regards,
dan carpenter