[BUG] watch_queue resource accounting seems broken

From: Ondrej Mosnacek
Date: Thu Aug 04 2022 - 04:41:09 EST


Hi,

There seems to be something wrong with resource accounting for
watch_queues. When a watch_queue is created, its size is set, and both
ends of the pipe are then closed, the resource usage increment does not
seem to be released as it should be. Repeatedly creating watch_queues
therefore exhausts the per-user pipe limit eventually (and quite fast!).
I tested this only on kernels 5.19 and 5.17.5, but I suspect the bug has
been there since watch_queue was introduced.

The issue can be reproduced with the attached C program. When it is run
by an unprivileged user (or by root with CAP_SYS_ADMIN and
CAP_SYS_RESOURCE dropped), pipe allocation or the size setting starts to
fail after a few iterations.
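
To see which limits are in play while running the reproducer, a small
helper along the following lines may be useful. This is only a sketch: it
assumes the per-user pipe limit mentioned above is the one controlled by
the fs.pipe-user-pages-soft and fs.pipe-user-pages-hard sysctls described
in pipe(7). The function names (parse_limit, read_limit,
print_pipe_limits) are made up for illustration and are not part of any
kernel or libc API.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one decimal integer from a sysctl-style string such as "16384\n".
 * Returns 0 on success, -1 if the text is not a single number.
 */
static int parse_limit(const char *text, long *out)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;
	/* Allow trailing whitespace, reject anything else. */
	while (*end == '\n' || *end == ' ' || *end == '\t')
		end++;
	if (*end != '\0')
		return -1;
	*out = value;
	return 0;
}

/* Read a single integer from a procfs file. Returns 0 on success. */
static int read_limit(const char *path, long *out)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_limit(buf, out);
}

/* Print the pipe-related sysctls (paths assumed from pipe(7)). */
static void print_pipe_limits(void)
{
	static const char *const paths[] = {
		"/proc/sys/fs/pipe-user-pages-soft",
		"/proc/sys/fs/pipe-user-pages-hard",
		"/proc/sys/fs/pipe-max-size",
	};
	size_t i;
	long value;

	for (i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
		if (read_limit(paths[i], &value) == 0)
			printf("%s = %ld\n", paths[i], value);
		else
			printf("%s: unavailable\n", paths[i]);
	}
}
```

Calling print_pipe_limits() from main() before the reproducer loop shows
the configured limits, which can then be compared with the iteration at
which the reproducer starts to fail.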

I found this bug thanks to selinux-testsuite's [1] watchkey test, which
started failing reproducibly after I ran it a couple of times in a row.

I'm not very familiar with this code area, so I'm hoping that someone
who understands the inner workings of watch_queue will be able and
willing to look into it and fix it.

Thanks,

[1] https://github.com/SELinuxProject/selinux-testsuite/

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/watch_queue.h>

#define BUF_SIZE 256

int main(int argc, char **argv)
{
	int i, pipefd[2], result;

	/* Repeatedly create, size, and close a notification pipe. */
	for (i = 0; i < 1000; i++) {
		fprintf(stderr, "%d\n", i);

		/* Create the watch_queue (a pipe in notification mode). */
		result = pipe2(pipefd, O_NOTIFICATION_PIPE);
		if (result < 0) {
			fprintf(stderr, "Failed to create pipe2(2): %s\n",
				strerror(errno));
			return errno;
		}

		/* Set the queue size; this is charged to the user. */
		result = ioctl(pipefd[0], IOC_WATCH_QUEUE_SET_SIZE, BUF_SIZE);
		if (result < 0) {
			fprintf(stderr, "Failed to set watch_queue size: %s\n",
				strerror(errno));
			return errno;
		}

		/* Closing both ends should release the accounted usage. */
		close(pipefd[0]);
		close(pipefd[1]);
	}
	return 0;
}