Re: [PATCH V1] audit: add warning that an old auditd may be starved out by a new auditd

From: Paul Moore
Date: Fri Sep 11 2015 - 14:56:44 EST


On Fri, Sep 11, 2015 at 6:21 AM, Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> On 15/09/09, Paul Moore wrote:
>> On Monday, September 07, 2015 12:58:18 PM Richard Guy Briggs wrote:
>> > On 15/09/07, Richard Guy Briggs wrote:
>> > > Nothing prevents a new auditd starting up and replacing a valid
>> > > audit_pid when an old auditd is still running, effectively starving out
>> > > the old auditd since audit_pid no longer points to the old valid auditd.
>> > >
>> > > There isn't an easy way to detect if an old auditd is still running on
>> > > the existing audit_pid other than attempting to send a message to see if
>> > > it fails. If no message to auditd has been attempted since auditd died
>> > > unnaturally or got killed, audit_pid will still indicate it is alive.
>> > >
>> > > Signed-off-by: Richard Guy Briggs <rgb@xxxxxxxxxx>
>> >
>> > Ok, self-nack on this one for a couple of problems...
>> > netlink_getsockbyportid() is static to af_netlink.c and "pid" should be
>> > task_tgid_vnr(current). Otherwise, any opinions on this approach?
>> >
>> > > ---
>> > > Note: Would it be too bold to actually block the registration of a new
>> > > auditd if the netlink_getsockbyportid() call succeeded? Would other
>> > > checks be appropriate?
>>
>> Hmm. It seems like we should prevent the registration of a new auditd if we
>> already have an auditd instance connected, although as you say, that isn't the
>> easiest thing to do.
>
> I wanted to do that, but I feared it would carry some risk that an
> intentional act would be blocked. In that case, an intentional act
> could include explicitly killing the old auditd (or process registered
> as such).

Well, if you are running another instance of auditd you should have
the right level of access to kill an older, existing auditd instance
so I'm not too worried about preventing an intentional act. Also, if
we check an existing connection with some sort of heartbeat/ping
message we wouldn't be preventing it for more than a few times I
believe.

>> How painful would it be to return -EAGAIN to the new auditd while sending some
>> sort of keep-alive/ping/etc. message to the old daemon to check its status?
>
> Well, if it turns out that the only reason it ever fails is
> -ECONNREFUSED, then we just need to check with netlink_getsockbyportid()
> to see if it fails before accepting the new auditd.

I'm not sure if the netdev crowd would be interested in exporting
_getsockbyportid(); for the sake of discussion let's assume no changes
to the netlink layer, if we get stuck we can revisit this idea.

> If it is one of the others, can we put the new auditd task on a wait
> queue until we hear back one way or the other or just timeout on
> contacting the old auditd?

Well, what if we don't have anything queued for the old auditd?
Although I suppose if nothing else we could send a record indicating
that another auditd attempted to replace it ... if we can send it
great, drop the new request and be glad we audited it, if we can't
send it, reset the auditd tracking.

> I'll let Steve speak to dealing with -EAGAIN. auditlib already deals
> with -EAGAIN and -EINTR for some cases. I have a patch that added
> -ENOBUFS to those cases since I had seen some reports that -ENOBUFS had
> been returned in some cases (don't remember the circumstances).

--
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/