Re: [syzbot] upstream boot error: WARNING in netlink_ack

From: Kees Cook
Date: Tue Oct 04 2022 - 19:41:37 EST


On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> On Tue, 04 Oct 2022 07:36:55 -0700 Kees Cook wrote:
> > This is fixed in the pending netdev tree coming for the merge window.
>
> This has been weighing on my conscience a little, I don't like how we
> still depend on putting one length in the skb and then using a
> different one for the actual memcpy(). How would you feel about this
> patch on top (untested):

tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Rambling below...

>
> diff --git a/include/net/netlink.h b/include/net/netlink.h
> index 4418b1981e31..6ad671441dff 100644
> --- a/include/net/netlink.h
> +++ b/include/net/netlink.h
> @@ -931,6 +931,29 @@ static inline struct nlmsghdr *nlmsg_put(struct sk_buff *skb, u32 portid, u32 se
> return __nlmsg_put(skb, portid, seq, type, payload, flags);
> }
>
> +/**
> + * nlmsg_append - Add more data to a nlmsg in a skb
> + * @skb: socket buffer to store message in
> + * @nlh: message header
> + * @payload: length of message payload
> + *
> + * Append data to an existing nlmsg, used when constructing a message
> + * with multiple fixed-format headers (which is rare).
> + * Returns NULL if the tailroom of the skb is insufficient to store
> + * the extra payload.
> + */
> +static inline void *nlmsg_append(struct sk_buff *skb, struct nlmsghdr *nlh,

nlh not needed here?

> + u32 size)
> +{
> + if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> + return NULL;
> +
> + if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)

why does a fixed size mean no memset?

> + memset(skb_tail_pointer(skb) + size, 0,
> + NLMSG_ALIGN(size) - size);
> + return __skb_put(NLMSG_ALIGN(size));
> +}
> +
> /**
> * nlmsg_put_answer - Add a new callback based netlink message to an skb
> * @skb: socket buffer to store message in
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index a662e8a5ff84..bb3d855d1f57 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2488,19 +2488,28 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
> flags |= NLM_F_ACK_TLVS;
>
> skb = nlmsg_new(payload + tlvlen, GFP_KERNEL);
> - if (!skb) {
> - NETLINK_CB(in_skb).sk->sk_err = ENOBUFS;
> - sk_error_report(NETLINK_CB(in_skb).sk);
> - return;
> - }
> + if (!skb)
> + goto err_bad_put;
>
> rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
> - NLMSG_ERROR, payload, flags);
> + NLMSG_ERROR, sizeof(*errmsg), flags);
> + if (!rep)
> + goto err_bad_put;
> errmsg = nlmsg_data(rep);
> errmsg->error = err;
> - unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
> - ? nlh->nlmsg_len : sizeof(*nlh),
> - /* Bounds checked by the skb layer. */);
> + memcpy(&errmsg->msg, nlh, sizeof(*nlh));
> +
> + if (!(flags & NLM_F_CAPPED)) {

Should it test this flag, or test if the sizes show the need for "extra"
payload length?

I always found the progression of sizes here to be confusing. "payload"
starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added, but only when
"(err && !(nlk->flags & NETLINK_F_CAP_ACK))" is true. Why is
nlmsg_len(nlh) _wrong_ if the rest of its contents are correct? If the
extra length were simply "0" in the other case, the logic would just be:

	nlh_bytes = nlmsg_len(nlh);
	total = sizeof(*errmsg);
	total += nlh_bytes;
	total += tlvlen;

and:

	nlmsg_new(total, ...);
	... nlmsg_put(..., sizeof(*errmsg), ...);
	...
	errmsg->error = err;
	errmsg->msg = *nlh;
	if (nlh_bytes) {
		data = nlmsg_append(..., nlh_bytes);
		...
		memcpy(data, nlh->nlmsg_contents, nlh_bytes);
	}
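
To make the size bookkeeping concrete, here is a rough userspace model of
that flow. It is only a sketch under stated assumptions, not kernel code:
struct model_nlmsghdr, struct model_buf, model_align(), model_append(),
model_msg_new() and model_ack() are all made-up names, nlmsg_contents is
the proposed (not yet existing) flex-array member, the fixed 256-byte
buffer stands in for the skb, and the outer reply header that nlmsg_put()
would write is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for struct nlmsghdr plus the proposed member. */
struct model_nlmsghdr {
	uint32_t nlmsg_len;	/* total length, fixed header included */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
	unsigned char nlmsg_contents[];	/* hypothetical flex-array member */
};

/* Stand-in for the skb: a flat buffer with a fill level. */
struct model_buf {
	unsigned char data[256];
	size_t len;
};

/* Round up to 4 bytes, mirroring what NLMSG_ALIGN() does. */
static size_t model_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Reserve an aligned chunk, zero the padding, fail if it does not fit. */
static void *model_append(struct model_buf *b, size_t size)
{
	size_t aligned = model_align(size);
	unsigned char *p;

	if (sizeof(b->data) - b->len < aligned)
		return NULL;
	p = b->data + b->len;
	memset(p + size, 0, aligned - size);
	b->len += aligned;
	return p;
}

/* Allocate a request message carrying len payload bytes. */
static struct model_nlmsghdr *model_msg_new(const void *payload, size_t len)
{
	struct model_nlmsghdr *m = malloc(sizeof(*m) + len);

	if (!m)
		return NULL;
	memset(m, 0, sizeof(*m));
	m->nlmsg_len = (uint32_t)(sizeof(*m) + len);
	memcpy(m->nlmsg_contents, payload, len);
	return m;
}

/* Build the ack body: error code, fixed header, then optional contents. */
static int model_ack(struct model_buf *b, int err,
		     const struct model_nlmsghdr *nlh, int capped)
{
	size_t nlh_bytes;
	void *p;

	assert(nlh->nlmsg_len >= sizeof(*nlh));
	/* "0" in the capped or success case, so one test decides the append. */
	nlh_bytes = (err && !capped) ? nlh->nlmsg_len - sizeof(*nlh) : 0;

	p = model_append(b, sizeof(err));
	if (!p)
		return -1;
	memcpy(p, &err, sizeof(err));

	p = model_append(b, sizeof(*nlh));
	if (!p)
		return -1;
	memcpy(p, nlh, sizeof(*nlh));	/* fixed-size header only */

	if (nlh_bytes) {
		p = model_append(b, nlh_bytes);
		if (!p)
			return -1;
		memcpy(p, nlh->nlmsg_contents, nlh_bytes);
	}
	return 0;
}
```

Each memcpy() here has a length that matches the object it reads from,
and the only bounds check lives in the append helper, which is the
property the patch is after.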

> + size_t data_len = nlh->nlmsg_len - sizeof(*nlh);

I think data_len here is also "payload - sizeof(*errmsg)"? So if it's >0,
we need to append the nlh contents.

> + void *data;
> +
> + data = nlmsg_append(skb, rep, data_len);
> + if (!data)
> + goto err_bad_put;
> +
> + /* the nlh + 1 is probably going to make you unhappy? */

Right, the compiler may think it is an object no larger than sizeof(*nlh).
My earliest attempt at changes here introduced a flex-array for the
contents, and split the memcpy:
https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154bb00@xxxxxxxxxxxxxxxxxx/
which is basically the solution you have here, except it wasn't having
the nlmsg_*-helpers do the bounds checking.

> + memcpy(data, nlh + 1, data_len);

So with the struct nlmsghdr::nlmsg_contents member, this becomes:

	memcpy(data, nlh->nlmsg_contents, data_len);
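
For illustration, a small userspace sketch of why the member helps. The
names struct sketch_nlmsghdr and sketch_copy_contents() are hypothetical,
and nlmsg_contents is the proposed member rather than anything in the
current uapi header. The flex array adds no size to the struct and sits
at the same address as "nlh + 1", but it names the trailing bytes as
their own object instead of pointing one past a fixed-size header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as struct nlmsghdr, plus the proposed trailing member. */
struct sketch_nlmsghdr {
	uint32_t nlmsg_len;	/* total length, fixed header included */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
	unsigned char nlmsg_contents[];	/* hypothetical flex-array member */
};

/*
 * Copy only the bytes that follow the fixed header.
 * Returns the number of bytes copied, or 0 if dst is too small.
 */
static size_t sketch_copy_contents(void *dst, size_t dst_len,
				   const struct sketch_nlmsghdr *nlh)
{
	size_t data_len;

	assert(nlh->nlmsg_len >= sizeof(*nlh));
	data_len = nlh->nlmsg_len - sizeof(*nlh);
	if (data_len > dst_len)
		return 0;

	/* Source is the flex array, not "one past" the header object. */
	memcpy(dst, nlh->nlmsg_contents, data_len);
	return data_len;
}
```

A caller would allocate sizeof(struct sketch_nlmsghdr) plus the payload
size (with malloc() in this model), fill in nlmsg_len, and pass a
destination buffer together with its capacity.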

--
Kees Cook