Re: [PATCH bpf-next v3 3/3] bpf, arm64: use bpf_jit_binary_pack_alloc

From: Mark Rutland
Date: Thu Jun 22 2023 - 04:23:56 EST


On Wed, Jun 21, 2023 at 10:57:20PM +0200, Puranjay Mohan wrote:
> On Wed, Jun 21, 2023 at 5:31 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > On Mon, Jun 19, 2023 at 10:01:21AM +0000, Puranjay Mohan wrote:
> > > @@ -1562,34 +1610,39 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> > >
> > > /* 3. Extra pass to validate JITed code. */
> > > if (validate_ctx(&ctx)) {
> > > - bpf_jit_binary_free(header);
> > > prog = orig_prog;
> > > - goto out_off;
> > > + goto out_free_hdr;
> > > }
> > >
> > > /* And we're done. */
> > > if (bpf_jit_enable > 1)
> > > bpf_jit_dump(prog->len, prog_size, 2, ctx.image);
> > >
> > > - bpf_flush_icache(header, ctx.image + ctx.idx);
> > > + bpf_flush_icache(ro_header, ctx.ro_image + ctx.idx);
> >
> > I think this is too early; we haven't copied the instructions into the
> > ro_header yet, so that still contains stale instructions.
> >
> > IIUC the whole point of this is to pack multiple programs into shared ROX
> > pages, and so there can be an executable mapping of the RO page at this point,
> > and the CPU can fetch stale instructions through that.
> >
> > Note that *regardless* of whether there is an executable mapping at this point
> > (and even if no executable mapping exists until after the copy), we at least
> > need a data cache clean to the PoU *after* the copy (so fetches don't get a
> > stale value from the PoU), and the I-cache maintenance has to happen on the VA
> > the instructions will be executed from (or VIPT I-caches can still contain stale
> > instructions).
>
> Thanks for catching this; it is a big miss on my side.
>
> I was able to reproduce the boot issue from the other thread on my
> Raspberry Pi. I think it is connected to my incorrect I-cache handling.
>
> As you rightly pointed out: We need to do bpf_flush_icache() after
> copying the instructions to the ro_header or the CPU can run
> incorrect instructions.
>
> When I move the call to bpf_flush_icache() after
> bpf_jit_binary_pack_finalize() (which does the copy to ro_header), the
> boot issue is fixed. Would this change be enough to make this work, or
> would I need to do more with the data cache as well to catch other
> edge cases?

AFAICT, bpf_flush_icache() calls flush_icache_range(). Despite its name,
flush_icache_range() has d-cache maintenance, i-cache maintenance, and context
synchronization (i.e. it does everything necessary).

As long as you call that with the VAs the code will be executed from, that
should be sufficient, and you don't need to do any other work.
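
To illustrate the ordering only, here is a minimal userspace analogue. It is
not the kernel code from this patch. It assumes Linux with GCC or Clang;
publish_code() is a hypothetical helper name, and __builtin___clear_cache()
stands in for the role flush_icache_range() plays in the kernel. On arm64 the
builtin is expected to do the D-cache clean and I-cache invalidation for the
range; on x86 it is effectively a no-op.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Userspace analogue (hypothetical helper, not kernel code): copy freshly
 * generated instructions into place, then do cache maintenance on the
 * address range they will be executed from.
 *
 * Returns the executable address, or NULL on failure.
 */
void *publish_code(const void *insns, size_t len)
{
	unsigned char *buf;

	assert(insns != NULL && len > 0);

	/* Writable mapping that receives the instructions. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	/* 1. Copy first, so the maintenance below sees the final bytes. */
	memcpy(buf, insns, len);

	/* 2. Switch the mapping to read + execute. */
	if (mprotect(buf, len, PROT_READ | PROT_EXEC)) {
		munmap(buf, len);
		return NULL;
	}

	/*
	 * 3. Only now do the cache maintenance, and do it on the range the
	 *    code will be fetched from.
	 */
	__builtin___clear_cache((char *)buf, (char *)buf + len);

	return buf;
}
```

In this sketch the writable and executable views share one VA, so step 3 is
trivially on the right address. With a separate RW buffer and RO image, as in
the patch, the range passed to the maintenance call would need to be the
executable one.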

Thanks,
Mark.