Re: [RFC PATCH] audit: Send netlink ACK before setting connection in auditd_set

From: Chris Riches
Date: Tue Oct 17 2023 - 09:50:02 EST


On 16/10/2023 21:16, Paul Moore wrote:

> Thanks for trimming the email in your reply, however, it is helpful to
> preserve those "On Mon, Oct ..." headers for those emails which you
> include in your reply, it helps keep things straight when reading the
> email.  Not a big deal, just something to keep in mind for next time.

Thanks for the pointer - I'm new to these mailing lists so appreciate
the advice.


> I should have been more clear, that's what just a quick hack that I
> cut-n-pasted into the email body, whitespace damage was a given.
> Typically if I include a patch with the qualification that it is
> untested, you can expect problems :) but I'll try to make the pitfalls
> more explicit in the future.

Gotcha.


>> While typing it out manually, I noticed that
>> the condition for sending the ACK isn't correct - if NLM_F_ACK is 0 to
>> begin with, then ack will be false to begin with, and so no ACK will be
>> sent even if there is an error.
>
> Good point.  I'll just casually remind you that I did say "untested" ;)
>
> I believe the following should work as intended (untested, cut-n-paste, etc.):
> .....

I think ack must be set to NLM_F_ACK initially - otherwise auditd_set
will always send the fast-tracked ACK even if the caller did not
request one. The following is a concrete version of what I roughly
suggested in the last email - is there a specific problem you see with
the (ack || err) condition?

@@ -1538,9 +1551,10 @@ static int audit_receive_msg(struct sk_buff *skb, struct
nlmsghdr *nlh)
 * Parse the provided skb and deal with any messages that may be present,
 * malformed skbs are discarded.
 */
-static void audit_receive(struct sk_buff  *skb)
+static void audit_receive(struct sk_buff *skb)
 {
     struct nlmsghdr *nlh;
+    bool ack;
     /*
      * len MUST be signed for nlmsg_next to be able to dec it below 0
      * if the nlmsg_len was not aligned
@@ -1553,9 +1567,13 @@ static void audit_receive(struct sk_buff *skb)

     audit_ctl_lock();
     while (nlmsg_ok(nlh, len)) {
-        err = audit_receive_msg(skb, nlh);
-        /* if err or if this message says it wants a response */
-        if (err || (nlh->nlmsg_flags & NLM_F_ACK))
+        ack = nlh->nlmsg_flags & NLM_F_ACK;
+        err = audit_receive_msg(skb, nlh, &ack);
+
+        /* Send an ack if @ack is still true after audit_receive_msg
+         * potentially cleared it, or if there was an error. */
+        if (ack || err)
             netlink_ack(skb, nlh, err, NULL);


> I'm not sure I can recall everything from when I was thinking about
> this previously (that was about a week ago), but my quick thoughts
> right now are that you would need a lot more information and/or
> handshakes between the kernel and the daemon.
>
> Unfortunately, both the current audit design and implementation is
> seriously flawed in a number of areas.  One of these areas is the fact
> that data and control messages are sent using the same data flow.

Makes sense. The question of why there isn't a separate control socket
was one of the first we asked while looking into this.


> The issue isn't so much about the queues overflowing inside the
> kernel, it's about being able to schedule the audit daemon and/or
> kernel thread to service the flood of connection
> disconnects/reconnects coming from the reproducer.

Right, makes sense.


> The old audit mailing list, where the userspace development is still
> discussed, can be found here:
> ...

Thanks. I'll post there too.


- Chris