Re: [PATCH bpf-next v3 1/3] bpf, x86: allow function arguments up to 12 for TRACING

From: Yonghong Song
Date: Fri Jun 09 2023 - 00:30:17 EST




On 6/8/23 7:12 PM, Menglong Dong wrote:
On Fri, Jun 9, 2023 at 5:07 AM Yonghong Song <yhs@xxxxxxxx> wrote:



On 6/7/23 5:59 AM, menglong8.dong@xxxxxxxxx wrote:
From: Menglong Dong <imagedong@xxxxxxxxxxx>

For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
on kernel functions that take at most 6 arguments. This is not
friendly at all, as many functions take more than 6 arguments.

Since you already have some statistics, maybe list them in the commit message.


Therefore, let's enhance it by increasing the number of function arguments
allowed in arch_prepare_bpf_trampoline(), for now only on x86_64.

For the case where we don't need to call the origin function, which means
without BPF_TRAMP_F_CALL_ORIG, we only need to copy the function arguments
that are stored in the frame of the caller to the current frame. The
arguments arg6-argN are stored starting at "$rbp + 0x18", and we need to
copy them to "$rbp - regs_off + (6 * 8)".

Maybe I missed something, could you explain why it is '$rbp + 0x18'?

In the current upstream code, we have

/* Generated trampoline stack layout:
 *
 * RBP + 8           [ return address  ]
 * RBP + 0           [ RBP             ]
 *
 * RBP - 8           [ return value    ]  BPF_TRAMP_F_CALL_ORIG or
 *                                        BPF_TRAMP_F_RET_FENTRY_RET flags
 *
 *                   [ reg_argN        ]  always
 *                   [ ...             ]
 * RBP - regs_off    [ reg_arg1        ]  program's ctx pointer
 *
 * RBP - nregs_off   [ regs count      ]  always
 *
 * RBP - ip_off      [ traced function ]  BPF_TRAMP_F_IP_ARG flag
 *
 * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
 */

Next on-stack argument will be RBP + 16, right?


Sorry for the confusion; it seems there should be
some comments here.

It's not the next on-stack argument, but the next next on-stack
argument. The call chain is:

caller -> origin call -> trampoline

So, we have to skip the "RIP" in the stack frame of "origin call",
which means RBP + 16 + 8. To be clear, there are only 8 bytes
in the stack frame of "origin call".

Thanks. It does make sense now. So we have
    caller -> origin call -> (5 nops changed to a call) -> trampoline
        8 bytes                                      8 bytes
and inside the trampoline we have 8 bytes on the stack from 'push rbp'.
Yes, it would be great if there were an explanation in the code.
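
To spell out the arithmetic, here is a minimal sketch. stack_arg_off() is a
hypothetical helper of mine, not something in the patch, and it assumes every
stack-passed argument occupies exactly one 8-byte slot:

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): offset from the trampoline's
 * RBP to the n-th stack-passed argument, where n = 0 is the first argument
 * that does not fit in the six x86_64 argument registers.
 */
static int stack_arg_off(int n)
{
	int saved_rbp = 8;	/* slot written by 'push rbp' in the trampoline */
	int ret_to_origin = 8;	/* pushed by the call patched over the 5 nops */
	int ret_to_caller = 8;	/* pushed by the caller's call to the origin function */

	assert(n >= 0);
	return saved_rbp + ret_to_origin + ret_to_caller + 8 * n;
}
```

With this, stack_arg_off(0) is 0x18, matching the "$rbp + 0x18" in the commit
message, and each further stack argument sits 8 bytes higher.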


Thanks!
Menglong Dong



For the case with BPF_TRAMP_F_CALL_ORIG, we need to prepare the arguments
on the stack before calling the origin function, which means we need to
allocate an extra "8 * (arg_count - 6)" bytes at the top of the stack. Note
that nothing may be pushed onto the stack before calling the origin function.
Therefore, we have to store rbx with 'mov' instead of 'push'.
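
As a rough illustration of the size computation only: stack_args_size() and
NR_REG_ARGS are names made up for this sketch, and every argument is assumed
to be 8 bytes wide, so struct arguments are ignored.

```c
#include <assert.h>

#define NR_REG_ARGS 6	/* arguments passed in registers on x86_64 */

/*
 * Hypothetical helper (not from the patch): extra bytes to reserve at the
 * top of the trampoline stack for arguments that are passed on the stack.
 */
static int stack_args_size(int arg_count)
{
	assert(arg_count >= 0);
	return arg_count > NR_REG_ARGS ? 8 * (arg_count - NR_REG_ARGS) : 0;
}
```

For the 12-argument limit in the subject line this gives 8 * (12 - 6) = 48
bytes; with 6 or fewer arguments nothing extra is reserved.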

We use EMIT3_off32() or EMIT4() for "lea" and "sub". The range of the
imm in "lea" and "sub" is [-128, 127] if EMIT4() is used. Therefore,
we use EMIT3_off32() instead if the imm is out of that range.
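
A standalone sketch of that choice for "sub rsp, imm", in case it helps.
fits_imm8() and emit_sub_rsp() are hypothetical names for this example and
are not the kernel's EMIT macros; the short form is the 4-byte case the text
attributes to EMIT4(), the long form is 3 bytes plus a 32-bit immediate as
with EMIT3_off32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Does the value fit in a sign-extended 8-bit immediate? */
static int fits_imm8(int32_t v)
{
	return v >= -128 && v <= 127;
}

/*
 * Hypothetical emitter (not the kernel's macros): write the encoding of
 * 'sub rsp, imm' into buf and return the number of bytes written.
 * buf must have room for at least 7 bytes.
 */
static size_t emit_sub_rsp(uint8_t *buf, int32_t imm)
{
	size_t n = 0;

	assert(buf != NULL);
	buf[n++] = 0x48;			/* REX.W prefix */
	if (fits_imm8(imm)) {
		buf[n++] = 0x83;		/* SUB r/m64, imm8 */
		buf[n++] = 0xEC;		/* ModRM: mod=11, /5, rm=rsp */
		buf[n++] = (uint8_t)imm;
	} else {
		buf[n++] = 0x81;		/* SUB r/m64, imm32 */
		buf[n++] = 0xEC;		/* ModRM: mod=11, /5, rm=rsp */
		for (int i = 0; i < 4; i++)	/* little-endian imm32 */
			buf[n++] = (uint8_t)((uint32_t)imm >> (8 * i));
	}
	return n;
}
```

emit_sub_rsp(buf, 0x20) produces the 4-byte form 48 83 ec 20, while 0x100
falls outside [-128, 127] and needs the 7-byte form 48 81 ec 00 01 00 00.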

It works well for FENTRY and FEXIT; I'm not sure if there are other
complicated cases.

MODIFY_RETURN is also impacted by this patch.


Reviewed-by: Jiang Biao <benbjiang@xxxxxxxxxxx>
Signed-off-by: Menglong Dong <imagedong@xxxxxxxxxxx>
[...]