Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well

From: Daniel Borkmann
Date: Fri Sep 11 2015 - 14:28:43 EST


On 09/11/2015 07:33 PM, Tycho Andersen wrote:
On Fri, Sep 11, 2015 at 06:03:59PM +0200, Daniel Borkmann wrote:
On 09/11/2015 04:44 PM, Tycho Andersen wrote:
On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
On 09/11/2015 02:20 AM, Tycho Andersen wrote:
In the next patch, we're going to add a way to access the underlying
filters via bpf fds. This means that we need to ref-count both the
struct seccomp_filter objects and the struct bpf_prog objects separately,
in case a process dies but a filter is still referred to by another
process.

Additionally, we mark classic converted seccomp filters as seccomp eBPF
programs, since they are a subset of what is supported in seccomp eBPF.

Signed-off-by: Tycho Andersen <tycho.andersen@xxxxxxxxxxxxx>
CC: Kees Cook <keescook@xxxxxxxxxxxx>
CC: Will Drewry <wad@xxxxxxxxxxxx>
CC: Oleg Nesterov <oleg@xxxxxxxxxx>
CC: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
CC: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
CC: Serge E. Hallyn <serge.hallyn@xxxxxxxxxx>
CC: Alexei Starovoitov <ast@xxxxxxxxxx>
CC: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
---
kernel/seccomp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 245df6b..afaeddf 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
}

atomic_set(&sfilter->usage, 1);
+ atomic_set(&sfilter->prog->aux->refcnt, 1);
+ sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;

So, if you do this, then this breaks the assumption of eBPF JITs
that, currently, all classic converted BPF programs always have a
prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).

Currently, JITs make use of this information to determine whether
A and X mappings for such programs should or should not be cleared
in the prologue (s390 currently).

In the seccomp_prepare_filter() stage, we're already past that, so
it will not cause an issue, but we certainly would need to be very
careful in future, if bpf_prog_was_classic() is then used at a later
stage when we already have a generated bpf_prog somewhere, as then
this assumption will break.

The only reason we need to do this is to allow BPF_DUMP_PROG to work,
since we were restricting it to only allow dumping of seccomp
programs, since those don't have maps. Instead, perhaps we could allow
dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
already today, at least in networking case, not seccomp. So, since
you want to export [classic -> eBPF] only for seccomp, put fds on them
and dump these via bpf(2), you could allow that (with a big comment
stating why it's safe), but mid-term we really need to sanitize all
this stuff properly as this is needed for other types, too.

Sorry, just to be clear, you're suggesting that the patch is ok modulo
a comment describing the jit issues?

I think due to the given insns restrictions on classic seccomp, this
could work for "most cases" (see below) for the time being until pointer
sanitation is resolved and that seccomp-only restriction from the dump
could be removed, BUT there's one more stone in the road which you still
need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
transforms an fd, dumping and restoring that via bpf(2)' approach:

If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
and dump that via your bpf(2) interface based on the current patches, what
you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
add JIT support for loads from struct seccomp_data.").

So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
as there's simply no need for it, because the classic code could already
be JITed there. I guess other archs where JIT support for eBPF in not yet
within near sight might sooner or later support this insn for their classic
JITs, too ...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/