PROBLEM: Old content of /proc/net after switching network namespace

From: Mateusz Stępień
Date: Fri Jan 04 2019 - 03:21:35 EST


Hello everyone,

After changing the network namespace with setns(), the content of /proc/net still reflects the original namespace.
It looks like procfs dentries are not properly invalidated in the dcache after the namespace switch.
This happens only when the content of /proc/net is read before changing the namespace.
The problem is reproducible in 4.19.13 but not in 4.14.X.
Bisecting the stable kernel tree shows that commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1da4d377f943fe4194ffb9fb9c26cc58fad4dd24 introduced the problem.
Reverting that commit resolves it.

MCVE (slightly modified example from [man 2 setns]):

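#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

/* Print /proc/net/dev as seen by the calling process. */
static void print_dev(void)
{
    char buf[2048] = {0};
    ssize_t n;
    int fd2;

    fd2 = open("/proc/net/dev", O_RDONLY);
    if (fd2 == -1)
        errExit("open /proc/net/dev");

    n = read(fd2, buf, sizeof(buf) - 1); /* Keep the last byte as NUL */
    if (n == -1)
        errExit("read");

    printf("%s", buf);
    close(fd2);
}

int
main(int argc, char *argv[])
{
    int fd;

    if (argc < 2) {
        fprintf(stderr, "Usage: %s /proc/PID/ns/net\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    printf("before namespace switch =========\n");
    print_dev(); /* Reading before setns() is what triggers the problem */

    fd = open(argv[1], O_RDONLY); /* Get file descriptor for namespace */
    if (fd == -1)
        errExit("open");

    if (setns(fd, 0) == -1) /* Join that namespace */
        errExit("setns");

    printf("after namespace switch ++++++++++\n");
    print_dev();
    close(fd);
    return 0;
}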
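
To rule out that setns() itself silently failed, the reproducer could additionally confirm that the process really ended up in the target namespace. The following is only a sketch under a few assumptions: the helper names same_ns() and in_netns() are made up for illustration, the process is single-threaded, and /proc/self/ns/net is assumed not to be affected by the stale /proc/net content. It relies on the rule documented in namespaces(7) that two namespace files refer to the same namespace when their device and inode numbers match.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if both descriptors refer to the same
 * object (same device and inode number), 0 if they differ, and -1 if
 * fstat() fails on either of them.
 */
int same_ns(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1)
        return -1;

    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/*
 * Hypothetical helper: return 1 if the calling (single-threaded) process
 * is in the network namespace that target_fd refers to, 0 if it is not,
 * and -1 on error.
 */
int in_netns(int target_fd)
{
    int self_fd = open("/proc/self/ns/net", O_RDONLY);
    int ret;

    if (self_fd == -1)
        return -1;

    ret = same_ns(self_fd, target_fd);
    close(self_fd);
    return ret;
}
```

Calling in_netns(fd) right after the setns(fd, 0) call in the MCVE and getting 1, while print_dev() still lists the old interfaces, would suggest that the switch itself worked and only the /proc/net view is stale.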


Steps to reproduce (assuming we have an interface named enp0s9):

ip netns add test
ip link set dev enp0s9 netns test
ip netns exec test sleep 30 &
gcc -o mcve mcve.c # mcve.c contains above C code
./mcve /proc/$(pidof sleep)/ns/net
# before namespace switch =========
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 149625 1117 0 0 0 0 0 1 61664 485 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
# after namespace switch ++++++++++
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 150813 1135 0 0 0 0 0 1 64348 503 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
ip netns exec test cat /proc/net/dev
## Should display
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s9: 10438 33 0 0 0 0 0 17 936 12 0 0 0 0 0 0
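
As a cross-check that does not read /proc/net at all, the interfaces visible to the process could be listed with if_nameindex(3) after the setns() call. This is a minimal sketch, not part of the original reproducer: the function names list_interfaces() and has_interface() are invented for illustration, and it assumes the libc implementation of if_nameindex() and if_nametoindex() does not depend on /proc/net.

```c
#include <net/if.h>
#include <stdio.h>

/*
 * Hypothetical helper: print every interface visible in the caller's
 * current network namespace to `out`. Returns the number of interfaces
 * printed, or -1 if if_nameindex() fails.
 */
int list_interfaces(FILE *out)
{
    struct if_nameindex *list = if_nameindex();
    struct if_nameindex *p;
    int count = 0;

    if (list == NULL)
        return -1;

    /* The array ends with an entry whose if_index is 0. */
    for (p = list; p->if_index != 0; p++) {
        fprintf(out, "%u: %s\n", p->if_index, p->if_name);
        count++;
    }

    if_freenameindex(list);
    return count;
}

/*
 * Hypothetical helper: return 1 if an interface called `name` is visible
 * to the caller, 0 otherwise (if_nametoindex() returns 0 on failure).
 */
int has_interface(const char *name)
{
    return if_nametoindex(name) != 0;
}
```

With the setup above, has_interface("enp0s9") would be expected to return 1 after joining the "test" namespace, even while /proc/net/dev still shows the interfaces of the original namespace.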

Output from awk -f scripts/ver_linux:

Linux test-agent 4.19.13 #1 SMP PREEMPT Thu Jan 3 12:03:20 UTC 2019 x86_64 GNU/Linux

Util-linux 2.29.2
Mount 2.29.2
Module-init-tools 23
E2fsprogs 1.43.4
Linux C Library 2.24
Dynamic linker (ldd) 2.24
Linux C++ Library 6.0.22
Procps 3.3.12
Net-tools 2.10
Sh-utils 8.26
Udev 232
Modules Loaded ahci ata_generic ata_piix crc32c_intel e1000 ehci_hcd ehci_pci i2c_core i2c_piix4 libahci serio_raw usb_common usbcore