Re: [PATCH] ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead

From: Steven Rostedt
Date: Wed Aug 17 2022 - 22:14:21 EST


On Thu, 18 Aug 2022 09:50:40 +0800
Yang Jihong <yangjihong1@xxxxxxxxxx> wrote:

> Thanks for the detailed explanation.
> If panic_on_warn is not set, FTRACE_WARN_ON{_ONCE} only sets
> ftrace_disabled, but will not reboot.

Correct. But whenever there's a WARN_ON() the administrator of the machine
should think about rebooting it ASAP. That's because all WARN_ON()s are
suppose to only happen when the system does something that was not
expected, putting it into an inconsistent state. And could be a dangerous
one. This is why all WARN_ON()s that are triggered are considered bugs and
must be fixed.


> I think this is to limit the problem to ftrace itself and not spread to
> other subsystems(I don't know if that's right. If it's not right, please
> correct it).

Yes, the ftrace_disable means that ftrace just found itself in a situation
that it does not understand, and nothing can be trusted. As ftrace modifies
kernel code, it basically stops everything and WARNs about it. Because
anything else it does can make things worse.

> Because is_ftrace_trampoline is a common and public interface (This
> interface is called in many places in the kernel).
> If is_ftrace_trampoline interface is not restricted (for example, just
> return true if ftrace_disabled is set), the preceding Syzkaller scenario
> may be triggered when this interface is called.

If ftrace_disabled is set, then any operations should fail, and any tests
should fail with it.

>
> Therefore, my idea is to restrict the is_ftrace_trampoline or roll back
> _unregister_ftrace_function when ftrace_disabled is set, so that the
> interface can be invoked normally. Or keep the current code and do not
> modify.

Once ftrace_disabled is set, none of its interfaces should perform
normally.

But you reported that you could hit a NULL pointer from the
is_ftrace_trampoline() which was caused by the failure adding the dynamic
trampoline, and then the ops is on the list but later freed.

My suggestion above is to just call _unregister_ftrace_function(ops) to
take it off the list and prevent the NULL pointer.

Doesn't that fix the bug?

I don't want to totally roll it back and free the trampoline, because those
actions could cause further damage, depending on the failed state ftrace is
in.

-- Steve