Re: [PATCH 2/2] jump_label: refine placement of static_keys

From: Ard Biesheuvel
Date: Wed Nov 10 2021 - 12:06:45 EST


On Wed, 10 Nov 2021 at 16:22, Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Wed, Nov 10, 2021 at 2:24 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Wed, 10 Nov 2021 at 09:36, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Nov 09, 2021 at 05:09:06PM -0800, Eric Dumazet wrote:
> > > > From: Eric Dumazet <edumazet@xxxxxxxxxx>
> > > >
> > > > With CONFIG_JUMP_LABEL=y, "struct static_key" content is only
> > > > used for the control path.
> > > >
> > > > Marking them __read_mostly is only needed when CONFIG_JUMP_LABEL=n.
> > > > Otherwise we place them out of the way to increase data locality.
> > > >
> > > > This patch adds __static_key to centralize this new policy.
> > > >
> > > > Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> > > > ---
> > > > arch/x86/kvm/lapic.c | 4 ++--
> > > > arch/x86/kvm/x86.c | 2 +-
> > > > include/linux/jump_label.h | 25 +++++++++++++++++--------
> > > > kernel/events/core.c | 2 +-
> > > > kernel/sched/fair.c | 2 +-
> > > > net/core/dev.c | 8 ++++----
> > > > net/netfilter/core.c | 2 +-
> > > > net/netfilter/x_tables.c | 2 +-
> > > > 8 files changed, 28 insertions(+), 19 deletions(-)
> > > >
> > >
> > > Hurmph, it's a bit cumbersome to always have to add this __static_key
> > > attribute to every definition, and in fact you seem to have missed some.
> > >
> > > Would something like:
> > >
> > > typedef struct static_key __static_key static_key_t;
> > >
> > > work? I forever seem to forget the exact things you can make a typedef
> > > do :/
> >
> > No, that doesn't work. Section placement is an attribute of the symbol
> > not of its type. So we'll need to macro'ify this.
>
> Yes, this is also why I chose a short __static_key (initially I was
> using something more descriptive but longer)
>
> >
> > But I'm not sure I understand why we need different policies here.
> > Static keys are inherently __read_mostly (unless they are not writable
> > to begin with), so keeping them all together in one place in the
> > binary should be sufficient, no?
>
> It is not optimal for CONFIG_JUMP_LABEL=n cases.
>
> For instance, networking will prefer having rps_needed / rfs_needed in
> the same cache lines as other hot read_mostly stuff,
> instead of being far away in other locations.
>
> ffffffff830e0f80 D dev_weight_tx_bias
> ffffffff830e0f84 D dev_rx_weight
> ffffffff830e0f88 D dev_tx_weight
> ffffffff830e0f8c D gro_normal_batch
> ffffffff830e0f90 D rps_sock_flow_table
> ffffffff830e0f98 D rps_cpu_mask
> ffffffff830e0f9c D rps_needed
> ffffffff830e0fa0 D rfs_needed
> ffffffff830e0fa4 D netdev_flow_limit_table_len
> ffffffff830e0fa8 d netif_napi_add.__print_once
> ffffffff830e0fac D netdev_unregister_timeout_secs
> ffffffff830e0fb0 D ptype_base
>
>
> When CONFIG_JUMP_LABEL=y, rps_needed/xps_needed being in a remote
> location is a win because it 'saves' 32 bytes that can be used better

I understand that you want the key out of the way for
CONFIG_JUMP_LABEL=y, but the question was why we shouldn't do that
unconditionally. If we put all the keys together in a section, they
will only share cachelines with each other.
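
To illustrate what keeping all the keys together in one section could
look like, here is a minimal userspace sketch, not kernel code. It
assumes GCC or Clang on an ELF target linked with GNU ld or lld, which
provide __start_<name>/__stop_<name> symbols for sections whose names
are valid C identifiers. struct demo_key, DEFINE_DEMO_KEY(), the
"demo_keys" section and the *_demo variables are all hypothetical names
made up for this example. Since the section is an attribute of the
symbol, the macro is what carries it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct static_key. */
struct demo_key {
	int enabled;
};

/*
 * The section attribute applies to the variable being defined, not to
 * its type, so it has to appear in every definition. A macro hides it.
 * "used" keeps the compiler from discarding an unreferenced key.
 */
#define DEFINE_DEMO_KEY(name) \
	struct demo_key name __attribute__((used, section("demo_keys"))) = { 0 }

DEFINE_DEMO_KEY(rps_needed_demo);
DEFINE_DEMO_KEY(rfs_needed_demo);

/* An ordinary hot variable, left in the default data section. */
int hot_counter_demo = 1;

/* Section bounds, assumed to be provided by the linker (GNU ld / lld). */
extern struct demo_key __start_demo_keys[];
extern struct demo_key __stop_demo_keys[];

/* Return nonzero if p points into the demo_keys section. */
int in_demo_keys(const void *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)__start_demo_keys &&
	       a < (uintptr_t)__stop_demo_keys;
}
```

With this, in_demo_keys(&rps_needed_demo) should hold while
in_demo_keys(&hot_counter_demo) should not, i.e. the keys only share
their section (and hence their cachelines) with each other.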

Also, what is the performance impact of this change on a real-world
use case?