Re: [PATCH bpf-next v2] bpf: Upgrade bpf_{g,s}etsockopt return values

From: Martin KaFai Lau
Date: Wed Dec 07 2022 - 16:08:51 EST


On 12/7/22 3:19 AM, Ji Rongfeng wrote:
On 2022/12/7 2:36, Martin KaFai Lau wrote:
On 12/2/22 9:39 AM, Ji Rongfeng wrote:
Returning -EINVAL almost all the time when error occurs is not very
helpful for the bpf prog to figure out what is wrong. This patch
upgrades some return values so that they will be much more helpful.

* return -ENOPROTOOPT when optname is unsupported

   The same as {g,s}etsockopt() syscall does. Before this patch,
   bpf_setsockopt(TCP_SAVED_SYN) already returns -ENOPROTOOPT, which
   may confuse the user, as -EINVAL is returned on other unsupported
   optnames. This patch also rejects TCP_SAVED_SYN right in
   sol_tcp_sockopt() when getopt is false, since do_tcp_setsockopt()
   is just the executor and it's not its duty to discover such error
   in bpf. We should maintain a precise allowlist to control whether
   an optname is supported and allowed to enter the executor or not.
   Functions like do_tcp_setsockopt(), their behaviour are not fully
   controllable by bpf. Imagine we let an optname pass, expecting
   -ENOPROTOOPT will be returned, but someday that optname is
   actually processed and unfortunately causes deadlock when calling
   from bpf. Thus, precise access control is essential.

Please leave the current -EINVAL to distinguish between optnames rejected by bpf and optnames rejected by the do_*_{get,set}sockopt().

To reach that goal, it would be better for us to pick a value other than -ENOPROTOOPT or -EINVAL. This patch actually makes sk-related errors, level-reletad errors, optname-related errors and opt{val,len}-related errors distinguishable, as they should be, by leaving -EINVAL to opt{val,len}-related errors only. man setsockopt:

> EINVAL optlen invalid in setsockopt().  In some cases this error
>        can also occur for an invalid value in optval (e.g., for
>        the IP_ADD_MEMBERSHIP option described in ip(7)).

With an unique return value, the bpf prog developer will be able to know that the error is "unsupported or unknown optname" for sure, saving time on figuring the actual cause of the error. In production environment, the bpf prog will be able to test whether an optname is available in current bpf env and decide what to do next also, which is very useful.

It should be a non-goal for bpf_{set,get}sockopt to provide this level of granularity (optlevel vs optname vs optlen) at this point when it requires this kind of churns on the existing return values. It is not like there is a lot of optname that supports non integer optlen. This should not be a limit factor if the userspace wants to probe what optname is supported. Hence, not convinced it is needed. I would like to hear how others think.

If optname is not something that bpf_{set,get}sockopt rejects, it should then rely on the do_*_{get,set}sockopt for handling which includes the error value.

For TCP_SAVED_SYN, tbh, it is too hypothetical that the do_tcp_setsockopt will have a reasonable use case to ever support it. If it did, we would make it available to the bpf also because the bpf_getsockopt(TCP_SAVED_SYN) has already been supported instead of now worrying about blacklisting it for the future.

-EBADFD will be useful because the same level, optname, optval, and optlen will fail because of different sk.