Re: [PATCH v2] cgroup: align the comm length with TASK_COMM_LEN

From: Kassey Li
Date: Sun Sep 25 2022 - 22:24:17 EST




On 9/23/2022 11:00 PM, Steven Rostedt wrote:
On Fri, 23 Sep 2022 15:51:05 +0800
Kassey Li <quic_yingangl@xxxxxxxxxxx> wrote:

__string() can end up with a dst string shorter than
TASK_COMM_LEN.

task->comm may change, which can cause an out-of-bounds access
on the dst string buffer, e.g. in the call trace below:

Call trace:

dump_backtrace.cfi_jt+0x0/0x4
show_stack+0x14/0x1c
dump_stack+0xa0/0xd8
die_callback+0x248/0x24c
notify_die+0x7c/0xf8
die+0xac/0x290
die_kernel_fault+0x88/0x98
die_kernel_fault+0x0/0x98
do_page_fault+0xa0/0x544
do_mem_abort+0x60/0x10c
el1_da+0x1c/0xc4
trace_event_raw_event_cgroup_migrate+0x124/0x170
cgroup_attach_task+0x2e8/0x41c
__cgroup1_procs_write+0x114/0x1ec
cgroup1_tasks_write+0x10/0x18
cgroup_file_write+0xa4/0x208
kernfs_fop_write+0x1f0/0x2f4
__vfs_write+0x5c/0x200
vfs_write+0xe0/0x1a0
ksys_write+0x74/0xdc
__arm64_sys_write+0x18/0x20
el0_svc_common+0xc0/0x1a4
el0_svc_compat_handler+0x18/0x20
el0_svc_compat+0x8/0x2c

Change it to an array of length TASK_COMM_LEN.
This idea is from commit d1eb650ff413 ("tracepoint: Move signal sending
tracepoint to events/signal.h").

This does not make sense. What exactly is the bug here?
Hi Steven,
I hope the info below gives you an idea of the problem; let me know if you need more.

kernel log:
Unable to handle kernel write to read-only memory at virtual address ffffffbcf7450000

"SharedPreferenc" is task name/comm.

memory/ddr dump:

FFFFFFBCF744FFE0| 00090020 000B0029 706F742F 7070612D 61685300 50646572 65666572 636E6572 ...).../top-app.SharedPreferenc
FFFFFFBCF7450000|>52800101 97FD3A05 140000B3 AA1303E0 9400193C B0000F88 90000D89 9137FD08 ...R.:..........<.............7.
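
For readers following the dump, here is a small user-space sketch of how the last four words of the first row can be decoded. It assumes the dump shows 32-bit little-endian words (which matches the ASCII column); the helper names words_to_bytes() and row_end() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack 32-bit little-endian words into bytes, lowest byte first. */
static void words_to_bytes(const uint32_t *words, size_t nwords,
			   unsigned char *out)
{
	assert(words != NULL && out != NULL);
	for (size_t i = 0; i < nwords; i++)
		for (size_t b = 0; b < 4; b++)
			out[4 * i + b] = (unsigned char)(words[i] >> (8 * b));
}

/* Address one past the last byte of a dump row starting at row_base. */
static uint64_t row_end(uint64_t row_base, size_t nwords)
{
	return row_base + 4 * (uint64_t)nwords;
}
```

Under that reading, the four words decode to a NUL byte followed by the 15 characters of "SharedPreferenc", and the name occupies the final bytes of the row. The next byte after it, where a terminating '\0' would be written, is at FFFFFFBCF7450000, the same address reported in the kernel log.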

trace stack:

-000|strcpy(inline)
-000|trace_event_raw_event_cgroup_migrate
-001|trace_cgroup_attach_task(inline)
-001|cgroup_attach_task()
-002|__read_once_size(inline)
-002|atomic_read(inline)
-002|static_key_count(inline)
-002|static_key_false(inline)
-002|trace_android_vh_cgroup_set_task(inline)
-002|__cgroup1_procs_write()
-003|cgroup1_tasks_write
-004|cgroup_file_write
-005|kernfs_fop_write$
-006|__vfs_write()
-007|vfs_write()
-008|ksys_write()
-009|__se_sys_write(inline)
-009|__arm64_sys_write()
-010|__invoke_syscall(inline)
-010|invoke_syscall(inline)
-010|el0_svc_common()
-011|el0_svc_compat_handler()
-012|el0_svc_compat(asm)





__string() will do a strlen(task->comm) + 1 to allocate on the ring buffer.
It should not be less than the length of task->comm. The above stack dump
does not show what happened.
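
As a rough illustration of the scenario the patch description proposes, the user-space sketch below measures a name once with strlen() + 1 and then checks whether a later, unbounded copy of the same buffer would still fit. It is not kernel code: reserved_bytes() and copy_overruns() are hypothetical helpers, and whether task->comm really changes between the two steps in the tracepoint is the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* same value as the kernel constant */

/* Bytes a strlen()-based reservation sets aside for the name. */
static size_t reserved_bytes(const char *comm)
{
	return strlen(comm) + 1;
}

/*
 * Nonzero when an unbounded strcpy() of the name as it looks now
 * would write more bytes than were reserved earlier.
 */
static int copy_overruns(size_t reserved, const char *comm_now)
{
	assert(reserved > 0);
	return strlen(comm_now) + 1 > reserved;
}
```

If the name is "sh" when the size is taken and "SharedPreferenc" when the copy runs, the reservation is 3 bytes while the copy needs 16, so copy_overruns() reports a problem. If the name does not change in between, the two lengths agree and nothing overruns.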

This looks like another bug and I do not see how this patch addresses
the issue.

-- Steve


Signed-off-by: Kassey Li <quic_yingangl@xxxxxxxxxxx>
---
include/trace/events/cgroup.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/cgroup.h b/include/trace/events/cgroup.h
index dd7d7c9efecd..b4ef0ffa38a4 100644
--- a/include/trace/events/cgroup.h
+++ b/include/trace/events/cgroup.h
@@ -130,7 +130,7 @@ DECLARE_EVENT_CLASS(cgroup_migrate,
__field( u64, dst_id )
__field( int, pid )
__string( dst_path, path )
- __string( comm, task->comm )
+ __array(char, comm, TASK_COMM_LEN)
),
TP_fast_assign(
@@ -139,12 +139,12 @@ DECLARE_EVENT_CLASS(cgroup_migrate,
__entry->dst_level = dst_cgrp->level;
__assign_str(dst_path, path);
__entry->pid = task->pid;
- __assign_str(comm, task->comm);
+ memcpy(__entry->comm, task->comm, TASK_COMM_LEN);
I think the problem is here: __assign_str() uses strcpy().
If task->comm does not end with '\0' at that point,
the copy runs out of bounds.

Do you want this version, or should I only change the copy to a memcpy() or strncpy() with a known length? Please advise so I can update the patch.
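
To make the "known length" option concrete, here is a minimal user-space sketch of a fixed-size copy. The struct and function names are hypothetical stand-ins for the trace entry, and the forced terminator on the last byte is an extra assumption of this sketch; the diff in this patch only does the memcpy().

```c
#include <assert.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* same value as the kernel constant */

/* Hypothetical stand-in for the trace entry; not the real layout. */
struct migrate_entry {
	int pid;
	char comm[TASK_COMM_LEN];
};

/*
 * Copy exactly TASK_COMM_LEN bytes, so the size never depends on
 * where (or whether) the source has a '\0', then terminate the
 * destination so it can be printed with %s.
 */
static void record_comm(struct migrate_entry *e,
			const char task_comm[TASK_COMM_LEN])
{
	assert(e != NULL && task_comm != NULL);
	memcpy(e->comm, task_comm, TASK_COMM_LEN);
	e->comm[TASK_COMM_LEN - 1] = '\0';
}
```

With this shape the write is bounded by the array size even if the source is caught without a terminator, at the cost of always using TASK_COMM_LEN bytes per event.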

),
TP_printk("dst_root=%d dst_id=%llu dst_level=%d dst_path=%s pid=%d comm=%s",
__entry->dst_root, __entry->dst_id, __entry->dst_level,
- __get_str(dst_path), __entry->pid, __get_str(comm))
+ __get_str(dst_path), __entry->pid, __entry->comm)
);
DEFINE_EVENT(cgroup_migrate, cgroup_attach_task,