[RFC PATCH 0/4] namespacefs: Proof-of-Concept

From: Yordan Karadzhov (VMware)
Date: Thu Nov 18 2021 - 13:13:13 EST


We introduce a simple read-only virtual filesystem that provides a
direct mechanism for examining the existing hierarchy of namespaces
on the system. For the purposes of this PoC, we tried to keep the
implementation of the pseudo filesystem as simple as possible. Only
two namespace types (PID and UTS) are coupled to it for the moment.
Nevertheless, we do not expect significant problems when adding all
other namespace types.

When fully functional, 'namespacefs' will allow the user to see all
namespaces that are active on the system and to easily retrieve the
specific data managed by each namespace, for example the PIDs of all
tasks enclosed in the individual PID namespaces. Every existing
namespace on the system will be represented by its corresponding
directory in 'namespacefs'. When a namespace is created, a directory
will be added. When a namespace is destroyed, its corresponding
directory will be removed. The hierarchy of the directories will
follow the hierarchy of the namespaces.
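
To illustrate the directory-per-namespace idea, here is a minimal
sketch that walks such a tree and prints it with indentation. It
assumes that nested namespaces appear as nested directories named by
inode number under a per-type root such as /sys/fs/namespaces/pid; the
helper names ns_tree() and format_tree() are hypothetical and not part
of the patches.

```python
import os

def ns_tree(root):
    """Return a nested dict mirroring the namespace directories under root.

    Assumption: every sub-directory is one namespace and regular files
    (e.g. 'tasks') hold per-namespace data, so they are skipped here.
    """
    tree = {}
    for entry in sorted(os.listdir(root)):
        child = os.path.join(root, entry)
        if os.path.isdir(child):
            tree[entry] = ns_tree(child)
    return tree

def format_tree(tree, depth=0):
    """Flatten the nested dict into indented lines, parents before children."""
    lines = []
    for name, children in tree.items():
        lines.append('  ' * depth + name)
        lines.extend(format_tree(children, depth + 1))
    return lines

if __name__ == "__main__":
    root = '/sys/fs/namespaces/pid'  # assumed mount point and layout
    if os.path.isdir(root):
        print('\n'.join(format_tree(ns_tree(root))))
```

Each level of indentation in the output would then correspond to one
level of namespace nesting.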

One may argue that most of the information exposed by this new
filesystem is already provided by 'procfs' in /proc/*/ns/. In fact,
'namespacefs' aims to be complementary to 'procfs', showing not only
the individual connections between a process and its namespaces, but
also the global hierarchy of these connections. As a usage example,
before playing with 'namespacefs', I had no idea that the Chrome web
browser creates a number of nested PID namespaces. I can only guess
that each tab or each site is isolated in a nested namespace.
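
For comparison, the sketch below shows roughly what can be recovered
from 'procfs' alone: it reads the /proc/<pid>/ns/pid link of every
process and groups the PIDs by namespace inode number. The function
names are hypothetical. The result is a flat grouping; the links by
themselves do not say which namespace is nested in which.

```python
import os
from collections import defaultdict

def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inum)."""
    ns_type, _, rest = target.partition(':[')
    return ns_type, int(rest.rstrip(']'))

def group_by_pid_ns(proc='/proc'):
    """Map PID namespace inode number -> list of PIDs seen in 'proc'."""
    groups = defaultdict(list)
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            target = os.readlink(os.path.join(proc, entry, 'ns', 'pid'))
        except OSError:
            continue  # process exited, or we lack permission
        groups[parse_ns_link(target)[1]].append(int(entry))
    return dict(groups)

if __name__ == "__main__":
    for inum, pids in sorted(group_by_pid_ns().items()):
        print("{0} -> {1} task(s)".format(inum, len(pids)))
```

Note that reading the links of other users' processes usually requires
elevated privileges, so an unprivileged run may see only a subset.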

Being able to see the structure of the namespaces can be very useful
in the context of containerized workloads. It provides a universal
method for detecting, examining and monitoring all sorts of containers
running on the system, without relying on any specific user-space
software. For example, with the help of 'namespacefs', the simple
Python script below can discover all containers created by 'Docker'
and 'Podman' (by all users) that are currently running on the system.


import os
import pwd

path = '/sys/fs/namespaces'

def pid_ns_tasks(inum):
    # PIDs of all tasks enclosed in the PID namespace with this inode.
    tasks_file = '{0}/pid/{1}/tasks'.format(path, inum)
    with open(tasks_file) as f:
        return [int(pid) for pid in f]

def uts_ns_inum(pid):
    # The link target looks like 'uts:[4026531838]'; extract the number.
    uts_ns_file = '/proc/{0}/ns/uts'.format(pid)
    uts_ns = os.readlink(uts_ns_file)
    return uts_ns.split('[')[1].split(']')[0]

def container_info(pid_inum):
    pids = pid_ns_tasks(pid_inum)
    name = ''
    uid = -1

    if len(pids):
        # Use the first task to find the UTS namespace of the container.
        uts_inum = uts_ns_inum(pids[0])
        uname_file = '{0}/uts/{1}/uname'.format(path, uts_inum)
        if os.path.exists(uname_file):
            stat_info = os.stat(uname_file)
            uid = stat_info.st_uid
            with open(uname_file) as f:
                # The second field of the uname string is used as the name.
                name = f.read().split()[1]

    return name, pids, uid

if __name__ == "__main__":
    pid_ns_list = os.listdir('{0}/pid'.format(path))
    for inum in pid_ns_list:
        name, pids, uid = container_info(inum)
        if name:
            user = pwd.getpwuid(uid).pw_name
            print("{0} -> pids: {1} user: {2}".format(name, pids, user))



The idea for 'namespacefs' is inspired by the discussion of the
'Container tracing' topic [1] during the 'Tracing micro-conference' [2]
at LPC 2021.

1. https://www.youtube.com/watch?v=09bVK3f0MPg&t=5455s
2. https://www.linuxplumbersconf.org/event/11/page/104-accepted-microconferences


Yordan Karadzhov (VMware) (4):
  namespacefs: Introduce 'namespacefs'
  namespacefs: Add methods to create/remove PID namespace directories
  namespacefs: Couple namespacefs to the PID namespace
  namespacefs: Couple namespacefs to the UTS namespace

 fs/Kconfig                  |   1 +
 fs/Makefile                 |   1 +
 fs/namespacefs/Kconfig      |   6 +
 fs/namespacefs/Makefile     |   4 +
 fs/namespacefs/inode.c      | 410 ++++++++++++++++++++++++++++++++++++
 include/linux/namespacefs.h |  73 +++++++
 include/linux/ns_common.h   |   4 +
 include/uapi/linux/magic.h  |   2 +
 kernel/pid_namespace.c      |   9 +
 kernel/utsname.c            |   9 +
 10 files changed, 519 insertions(+)
 create mode 100644 fs/namespacefs/Kconfig
 create mode 100644 fs/namespacefs/Makefile
 create mode 100644 fs/namespacefs/inode.c
 create mode 100644 include/linux/namespacefs.h

--
2.33.1