Re: [PATCH 07/32] mm: Bring back vmalloc_exec

From: Kent Overstreet
Date: Wed May 17 2023 - 01:28:58 EST


On Tue, May 16, 2023 at 10:47:13PM +0100, Matthew Wilcox wrote:
> On Tue, May 16, 2023 at 05:20:33PM -0400, Kent Overstreet wrote:
> > On Tue, May 16, 2023 at 02:02:11PM -0700, Kees Cook wrote:
> > > For something that small, why not use the text_poke API?
> >
> > This looks like it's meant for patching existing kernel text, which
> > isn't what I want - I'm generating new functions on the fly, one per
> > btree node.
> >
> > I'm working up a new allocator - a (very simple) slab allocator where
> > you pass a buffer, and it gives you a copy of that buffer mapped
> > executable, but not writeable.
> >
> > It looks like we'll be able to convert bpf, kprobes, and ftrace
> > trampolines to it; it'll consolidate a fair amount of code (particularly
> > in bpf), and they won't have to burn a full page per allocation anymore.
> >
> > bpf has a neat trick where it maps the same page in two different
> > locations, one is the executable location and the other is the writeable
> > location - I'm stealing that.
>
> How does that avoid the problem of being able to construct an arbitrary
> gadget that somebody else will then execute? IOW, what bpf has done
> seems like it's working around & undoing the security improvements.
>
> I suppose it's an improvement that only the executable address is
> passed back to the caller, and not the writable address.

Ok, here's what I came up with. I haven't tested all the corner cases
and still need to write docs - but I think this gives us a nicer
interface than what bpf/kprobes/etc. have been doing, and it does the
sub-page sized allocations I need.
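
For anyone not familiar with the dual mapping trick mentioned above,
here is a rough userspace analogue. It is only a sketch and not the
kernel code in this patch: struct dual_map and the dual_map_*() helpers
are hypothetical names made up for illustration, and it assumes Linux
with glibc's memfd_create(). The same pages get two mappings, one
writeable and one executable, and the caller only ever gets the
executable address back:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type: two views of the same physical pages. */
struct dual_map {
	void	*rw;	/* writeable alias, never executable */
	void	*rx;	/* executable alias, never writeable */
	size_t	len;
};

/* Set up both aliases; returns 0 on success, -1 on failure. */
static int dual_map_create(struct dual_map *m, size_t len)
{
	int fd;

	assert(m != NULL && len > 0);

	/* Anonymous in-memory file that backs both mappings. */
	fd = memfd_create("dual_map", MFD_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t) len) < 0)
		goto err_close;

	m->rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (m->rw == MAP_FAILED)
		goto err_close;

	m->rx = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	if (m->rx == MAP_FAILED) {
		munmap(m->rw, len);
		goto err_close;
	}

	m->len = len;
	close(fd);	/* the mappings keep the pages alive */
	return 0;

err_close:
	close(fd);
	return -1;
}

/*
 * Copy buf in through the writeable alias and return only the
 * executable address. Before actually jumping to it, architectures
 * without coherent instruction caches would also need a cache flush.
 */
static const void *dual_map_copy_in(struct dual_map *m, const void *buf,
				    size_t len)
{
	if (len > m->len)
		return NULL;
	memcpy(m->rw, buf, len);
	return m->rx;
}

static void dual_map_destroy(struct dual_map *m)
{
	munmap(m->rw, m->len);
	munmap(m->rx, m->len);
}
```

A caller would do dual_map_create(&m, 4096), then
dual_map_copy_in(&m, code, size) to get the executable pointer, and
dual_map_destroy(&m) when done. No single mapping is ever both
writeable and executable.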

With an additional tweak to module_alloc() (not done in this patch yet)
we avoid ever mapping pages as both writeable and executable:

-->--