Re: [PATCH next 4/5] locking/osq_lock: Optimise per-cpu data accesses.

From: Ingo Molnar
Date: Sat Dec 30 2023 - 15:37:46 EST



* David Laight <David.Laight@xxxxxxxxxx> wrote:

>  bool osq_lock(struct optimistic_spin_queue *lock)
>  {
> -	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
> +	struct optimistic_spin_node *node = raw_cpu_read(osq_node.self);
>  	struct optimistic_spin_node *prev, *next;
>  	int old;
>
> -	if (unlikely(node->cpu == OSQ_UNLOCKED_VAL))
> -		node->cpu = encode_cpu(smp_processor_id());
> +	if (unlikely(!node)) {
> +		int cpu = encode_cpu(smp_processor_id());
> +		node = decode_cpu(cpu);
> +		node->self = node;
> +		node->cpu = cpu;

This whole initialization sequence is suboptimal and needs to be
cleaned up first: the node->cpu field is constant once initialized, so
it should be set by an appropriate init method, not at runtime in
osq_lock(), right?

Eliminating that initialization branch is a useful micro-optimization
in itself for the hot path.
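
A minimal userspace sketch of that idea, not the kernel implementation:
the per-CPU variable is modelled as a plain array, and NR_CPUS_MODEL,
osq_nodes_init() and osq_lock_prepare() are hypothetical names used only
for illustration. In the kernel the one-time setup would presumably be
an init-time loop over the possible CPUs, but that is an assumption and
not something this patch contains.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL		4	/* hypothetical CPU count for this model */
#define OSQ_UNLOCKED_VAL	0

struct optimistic_spin_node {
	struct optimistic_spin_node *next, *prev;
	int locked;
	int cpu;	/* encoded CPU number; constant after init */
};

/* Userspace stand-in for the per-CPU osq_node variable. */
static struct optimistic_spin_node osq_nodes[NR_CPUS_MODEL];

/* 0 means "no CPU" (OSQ_UNLOCKED_VAL), so the encoded value is cpu + 1. */
static inline int encode_cpu(int cpu_nr)
{
	return cpu_nr + 1;
}

static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
{
	/* Sanity check for the model only. */
	assert(encoded_cpu_val > 0 && encoded_cpu_val <= NR_CPUS_MODEL);
	return &osq_nodes[encoded_cpu_val - 1];
}

/* One-time setup: fill in the constant ->cpu field for every node. */
static void osq_nodes_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		osq_nodes[cpu].cpu = encode_cpu(cpu);
}

/*
 * Start of the lock path: only per-acquisition state is reset here.
 * There is no "has ->cpu been set yet?" branch, because
 * osq_nodes_init() has already run.
 */
static struct optimistic_spin_node *osq_lock_prepare(int this_cpu)
{
	struct optimistic_spin_node *node = &osq_nodes[this_cpu];

	node->locked = 0;
	node->next = NULL;
	return node;
}
```

With the setup done once up front, every later entry to the lock path
can rely on node->cpu being valid without testing for it.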

Thanks,

Ingo