Re: [PATCH V2 2/3] arch_topology: Avoid use-after-free for scale_freq_data

From: Viresh Kumar
Date: Wed Jun 16 2021 - 07:36:10 EST


Hi Ionela,

On 16-06-21, 12:25, Ionela Voinescu wrote:
> Please correct me if I'm wrong, but my understanding is that this is
> only a problem for the cppc cpufreq invariance functionality. Let's
> consider a scenario where CPUs are either hotplugged out or the cpufreq
> CPPC driver module is removed; topology_clear_scale_freq_source() would
> get called and the sfd_data will be set to NULL. But if at the same
> time topology_scale_freq_tick() got an old reference of sfd_data,
> set_freq_scale() will be called. This is only a problem for CPPC cpufreq
> as cppc_scale_freq_tick() will end up using driver internal data that
> might have been freed during the hotplug callbacks or the exit path.

For now, yes, CPPC is the only one affected.

> If this is the case, wouldn't the synchronisation issue be better
> resolved in the CPPC cpufreq driver, rather than here?

Hmm, the way I see it is that topology_clear_scale_freq_source() is an API
provided by topology core and the topology core needs to guarantee that it
doesn't use the data any longer after topology_clear_scale_freq_source() is
called.

The same is true for other APIs, like:

irq_work_sync();
kthread_cancel_work_sync();

It isn't the user which needs to take this into account, but the API provider.

There may be more users of this in the future, lets say another cpufreq driver,
and so keeping this synchronization at the API provider is the right thing to do
IMHO.

And from the user's perspective, like cppc, it doesn't have any control over who
is using its callback and how and when. It is very very difficult to provide
something like this at the users, redundant anyway. For example cppc won't ever
know when topology_scale_freq_tick() has stopped calling its callback.

For example this is what cppc driver needs to do now:

+static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy,
+ unsigned int cpu)
+{
+ struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, cpu);
+
+ topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpumask_of(cpu));
+ irq_work_sync(&cppc_fi->irq_work);
+ kthread_cancel_work_sync(&cppc_fi->work);
+}

The driver uses APIs provided by 3 layers, topology, irq-work, kthread and all
must provide these guarantees.

A very similar thing is implemented in kernel/sched/cpufreq.c for example.

--
viresh