Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

From: Hannes Frederic Sowa
Date: Mon Oct 19 2015 - 03:36:18 EST


Hi,

On Sun, Oct 18, 2015, at 22:59, Alexei Starovoitov wrote:
> On 10/18/15 9:49 AM, Daniel Borkmann wrote:
> > Okay, I have pushed some rough working proof of concept here:
> >
> > https://git.breakpoint.cc/cgit/dborkman/net-next.git/log/?h=ebpf-fds-final5
> >
> > So the idea eventually had to be slightly modified after giving this
> > further thoughts and is the following:
> >
> > We have 3 commands (BPF_DEV_CREATE, BPF_DEV_DESTROY, BPF_DEV_CONNECT), and
> > related to that a bpf_attr extension with only a single __u32 fd member
> > in it.
> ...
> > The nice thing about it is that you can create/unlink as many as you
> > want, but when you remove the real device from an application via
> > bpf_dev_destroy(fd), then all links disappear with it. Just like in the
> > case of a normal device driver.
>
> interesting idea!
> What happens if user app creates a dev via bpf_dev_create(), exits and
> then admin does rm of that dev ?
> Looks like map/prog will leak ?
> So the only proper way to delete such cdevs is via bpf_dev_destroy ?

The mknod'ed node is not the holder; the kobject, which should be
represented in sysfs, is. So you can still get the map's major:minor
by looking up the /dev file in the corresponding sysfs directory. Or, I
think, we should provide an 'unbind' file, which will drop the kobject if
the user writes a '1' to it.
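
To illustrate the lookup part, here is a rough userspace sketch. It
assumes the bpf device would expose the usual sysfs `dev` attribute,
which holds "major:minor" as text; where that attribute would live for
a bpf map is not settled, so the path is left to the caller. The helper
names parse_sysfs_dev() and read_sysfs_dev() are made up for this
example.

```c
#include <stdio.h>

/* Parse the "major:minor" text found in a sysfs `dev` attribute.
 * An optional trailing newline is accepted.
 * Returns 0 on success, -1 if the text is malformed. */
static int parse_sysfs_dev(const char *text, unsigned int *major,
                           unsigned int *minor)
{
    unsigned int ma, mi;
    char trailing;
    int n = sscanf(text, "%u:%u%c", &ma, &mi, &trailing);

    if (n == 2 || (n == 3 && trailing == '\n')) {
        *major = ma;
        *minor = mi;
        return 0;
    }
    return -1;
}

/* Read and parse a `dev` attribute file. attr_path is supplied by the
 * caller; its location for bpf devices is an assumption, not something
 * the proof of concept defines. Returns 0 on success, -1 on any error. */
static int read_sysfs_dev(const char *attr_path, unsigned int *major,
                          unsigned int *minor)
{
    char line[32];
    int ret = -1;
    FILE *f = fopen(attr_path, "r");

    if (!f)
        return -1;
    if (fgets(line, sizeof(line), f))
        ret = parse_sysfs_dev(line, major, minor);
    fclose(f);
    return ret;
}
```

With the major:minor in hand, an admin could recreate a node that was
removed by accident with mknod(2), since the kobject still holds the
map.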

>
> > On device creation, the kernel will return the minor number via bpf(2),
> > so you can access the file easily, f.e. /dev/bpf/bpf_map<minor> resp.
> > /dev/bpf/bpf_prog<minor>, and then move on with mknod(2) or symlink(2)
> > from there if wished.
>
> what if admin mknod in that dir with some arbitrary minor ?

Basically, -EIO. :)

> mknod will succeed, but it won't hold anything?

That is already true today for basically all device nodes that udev
creates via mknod.

> looks like bpf_dev_connect will handle it gracefully.
> So these cdevs should only be created and destroyed via bpf syscall
> and only sensible operations on them is open() to get fd and pass
> to bpf_dev_connect and symlink. Anything else admin should be
> careful not to do. Right?

Besides maybe some statistics and other stuff in the sysfs directory, no,
that is all.
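
For reference, the naming scheme Daniel described above could be used
from userspace roughly like this. It is only a sketch: the
/dev/bpf/bpf_map<minor> and /dev/bpf/bpf_prog<minor> names are taken
from his description of the proof of concept, while the enum and the
bpf_dev_path() helper are invented for illustration.

```c
#include <stdio.h>

/* Hypothetical tag for the two kinds of bpf device nodes. */
enum bpf_dev_kind { BPF_DEV_MAP, BPF_DEV_PROG };

/* Build the device node path for a minor number returned by the kernel,
 * following the /dev/bpf/bpf_map<minor> and /dev/bpf/bpf_prog<minor>
 * naming from the proof of concept.
 * Returns 0 on success, -1 if buf is too small. */
static int bpf_dev_path(char *buf, size_t len, enum bpf_dev_kind kind,
                        unsigned int minor)
{
    const char *stem = (kind == BPF_DEV_MAP) ? "bpf_map" : "bpf_prog";
    int n = snprintf(buf, len, "/dev/bpf/%s%u", stem, minor);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting path is what one would open() to get an fd for
bpf_dev_connect, or use as the target of a symlink(2).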

Bye,
Hannes
