Re: [PATCH 07/32] mm: Bring back vmalloc_exec

From: Andy Lutomirski
Date: Tue Jun 20 2023 - 13:42:32 EST


On Mon, Jun 19, 2023, at 12:17 PM, Kent Overstreet wrote:
> On Mon, Jun 19, 2023 at 01:47:18PM +0100, Mark Rutland wrote:
>> Sorry, but I do have an engineering rationale here: I want to make sure that
>> this actually works, on architectures that I care about, and will be
>> maintainable long-term.
>>
>> We've had a bunch of problems with other JITs ranging from JIT-local "we got
>> the encoding wrong" to major kernel infrastructure changes like tasks RCU rude
>> synchronization. I'm trying to figure out whether any of those are likely to
>> apply and/or whether we should be refactoring other infrastructure for use here
>> (e.g. factoring the actual instruction generation out of arch code, or
>> perhaps reusing eBPF so this can be arch-neutral).
>>
>> I appreciate that's not clear from my initial mail, but please don't jump
>> straight to assuming I'm adversarial here.
>
> I know you're not trying to be adversarial, but vague negative feedback
> _is_ hostile, because productive technical discussions can't happen
> without specifics and you're putting all the onus on the other person to
> make that happen.

I'm sorry, but this isn't how correct code gets written, and this isn't how at least x86 maintenance operates.

Code is either correct, and comes with an explanation as to how it is correct, or it doesn't go in. Saying that something is like BPF is not an explanation as to how it's correct. Saying that nobody has come up with the chain of events that turns a mere violation of architecture rules into actual incorrect execution is not an explanation as to how something is correct.

So, without intending any particular hostility:

<puts on maintainer hat>

bcachefs's x86 JIT is:
Nacked-by: Andy Lutomirski <luto@xxxxxxxxxx> # for x86

<takes off maintainer hat>

This makes me sad, because I like bcachefs. But you can get it merged without worrying about my NAK by removing the x86 part.

>
> When you're raising an issue, try to be specific - don't make people dig.
> If you're unable to be specific, perhaps you're not the right person to
> be raising the issue.
>
> I'm of course happy to answer questions that haven't already been asked.
>
> This code is pretty simple as JITs go. With the existing
> vmalloc_exec()-based code, there aren't any fancy secondary mappings
> going on, so no crazy cache coherency games, and no crazy synchronization
> issues to worry about: the jit functions are protected by the
> per-btree-node locks.
>
> vmalloc_exec() isn't being upstreamed however, since people don't want
> WX mappings.
>
> The infrastructure changes we need (and not just for bcachefs) are
> - better executable memory allocation API, with support for sub-page
> allocations: this is already being worked on, the prototype slab
> allocator I posted is probably going to be the basis for part of this
>
> - an arch-independent version of text_poke(): we don't want user code
> to be flipping page permissions to update text, text_poke() is the
> proper API but it's x86 only. No one has volunteered for this yet.
>

text_poke() by itself is *not* the proper API, as discussed. It doesn't serialize adequately, even on x86. We have text_poke_sync() for that.
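
For illustration only, a minimal sketch of that pairing as it appears in 2023-era x86 kernel code. The function name example_patch_text() and its parameters are hypothetical; text_poke(), text_poke_sync() and text_mutex are the existing x86 interfaces. This is a kernel-only fragment, not a standalone program, and it shows only the write-then-serialize ordering. It is not a claim that this alone makes a JIT correct.

```c
#include <linux/memory.h>       /* text_mutex */
#include <linux/mutex.h>
#include <linux/types.h>
#include <asm/text-patching.h>  /* text_poke(), text_poke_sync() on x86 */

/*
 * Hypothetical helper: write len bytes of new instructions at addr,
 * then make every CPU execute a serializing instruction before the
 * new text is relied upon.
 */
static void example_patch_text(void *addr, const void *insns, size_t len)
{
	/* text_poke() expects text_mutex to be held by the caller. */
	mutex_lock(&text_mutex);

	/* Writes through a temporary mapping; does not serialize other CPUs. */
	text_poke(addr, insns, len);

	/* Runs a core-serializing operation on each CPU via IPI. */
	text_poke_sync();

	mutex_unlock(&text_mutex);
}
```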

--Andy