Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

From: Alexei Starovoitov
Date: Mon Oct 19 2015 - 21:09:19 EST


On 10/19/15 4:02 PM, Hannes Frederic Sowa wrote:
> I bet commercial software will make use of this ebpf framework, too. And
> the kernel always helped me and gave me a way to see what is going on,
> debug which part of my operating system universe interacts with which
> other part. Merely dropping file descriptors with data attached to them
> in a filesystem seems not to fulfill my need at all. I would love to
> see where resources are referenced and why, like I can nowadays.

agree. common fs with hierarchy will give this visibility in
one place.

>> It feels you're pushing for cdev only because of that potential
>> debugging need. Did you actually face that need? I didn't and
>> don't like to add 'nice to have' feature until real need comes.
> Given that we want to monitor the load of a hashmap for graphing
> purposes. Or liberate some hashmaps from their restriction on number of
> keys and make upper bounds configurable by admins who know the
> dimensions of their systems and not some software deep down buried in
> the bpf syscall where I might not have access to source code. In tc,
> e.g., force hashmaps to do garbage collection because we cannot be sure
> that under DoS attacks user space clean up gets scheduled early enough
> if ebpf adds flows to hashtables. I do see need to expand and implement
> some kind of policy in the future.

disagree here. admin should not interfere with map parameters.
What you are proposing above sounds very very dangerous.
Admins to configure GC of maps? What do you think the programs will do
with such sophisticated maps? What kind of networking app do you have
in mind? Anyway that's a bit off-topic. I'm very curious though.

>> single task in seccomp can have a chain of bpf progs, so hierarchy
>> is already there.
> And it would be great to inspect them.

again let's not mix criu and lsof-like requirements with 'pin fd'.
For visibility of normal maps we can add fdinfo and lsof
can pick it up without any fs or any cdevs.

> I am fine with creating maps only by bpf syscall. But to hide
> configuration details or at least not be really able to query them
> easily seems odd to me. If we go with the ebpffs how could those
> attributes be added?

I'm not advocating to hide details. Most of the time maps will not be
pinned, so fdinfo seems the easiest way to show things like key_size,
value_size, max_entries, type.
Even if we decide to do it some other way, it's not related to 'pin fd'
discussion, since debugging/visibility is nice to have for all bpf
objects. Note that walking of key/value without pretty-printers
provided by the app is meaningless for admin, so only things
like 'how much memory this map is using' are useful.
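
To make the fdinfo idea a bit more concrete, here is a rough userspace
sketch of how an lsof-like tool could pick the attributes up. It assumes
the kernel prints them as "name:<tab>value" lines and that the fields are
called key_size, value_size and max_entries; that layout and the helper
names fdinfo_read() and fdinfo_get_u64() are assumptions made for
illustration, not something this patch set defines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read /proc/self/fdinfo/<fd> into buf as a NUL-terminated string. */
int fdinfo_read(int fd, char *buf, size_t size)
{
	char path[64];
	FILE *f;
	size_t n;

	if (size == 0)
		return -1;
	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, size - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}

/*
 * Find a line starting with "name:" in fdinfo-style text and parse the
 * decimal value that follows. Returns 0 on success, -1 if the field is
 * missing or not a number. The field names are hypothetical.
 */
int fdinfo_get_u64(const char *text, const char *name,
		   unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = text;

	assert(out != NULL);

	while (line && *line) {
		if (!strncmp(line, name, len) && line[len] == ':') {
			const char *val = line + len + 1;

			/* skip the separator, but never cross a newline */
			while (*val == ' ' || *val == '\t')
				val++;
			if (*val < '0' || *val > '9')
				return -1;
			*out = strtoull(val, NULL, 10);
			return 0;
		}
		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A tool would call fdinfo_read() for the fd it is inspecting and then
fdinfo_get_u64(buf, "max_entries", &v) and friends; matching only at the
start of a line keeps "key_size" from being confused with a longer or
shorter field name.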

Maybe we should try to draft the hierarchy of this common fs.
How about:
/sys/kernel/bpf/username/optional_dirs_mkdir_by_user/progX
and 'cat' of it will print the same as fdinfo for normal maps,
so admin can see what maps were pinned by user and their cost.

Inside 'fdinfo' output we can provide pointers to which progs
are using which maps as
# cat /sys/kernel/bpf/.../mapX
key_size: 4
used_by: /proc/xxx/fd/5
# cat /sys/kernel/bpf/.../progY
type: socket
using: /proc/xxx/fd/6
using: /sys/kernel/bpf/.../mapZ
and similar for cat /proc/xxx/fdinfo/6
but showing hierarchy as directories is a non-starter, since
it's not a tree.
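
A small sketch of why directories don't fit, assuming each "using:"
line is treated as one (user, map) edge. The struct bpf_ref and
refs_form_tree() names, and the second program progW in the usage note,
are hypothetical and only there for illustration: as soon as two
different users reference the same map, the map would need two parent
directories.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "using:" edge: a holder (prog or process fd) references a map. */
struct bpf_ref {
	const char *user;
	const char *map;
};

/*
 * Return 1 if every map has at most one distinct user, i.e. the
 * references could be mirrored as nested directories; 0 otherwise.
 */
int refs_form_tree(const struct bpf_ref *refs, size_t n)
{
	size_t i, j;

	assert(refs != NULL || n == 0);

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (!strcmp(refs[i].map, refs[j].map) &&
			    strcmp(refs[i].user, refs[j].user))
				return 0;
	return 1;
}
```

With progY and a second program progW both using mapZ the function
returns 0, which is the case a flat 'using:'/'used_by:' listing handles
and a directory layout does not.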

All of these would be nice, but don't have to be implemented
along with 'pin fd' feature.
