Re: [PATCH v4 1/3] cacheinfo: Add arch specific early level initializer

From: Radu Rendec
Date: Fri May 19 2023 - 18:02:53 EST


Hi Ricardo,

On Fri, 2023-05-19 at 14:44 -0700, Ricardo Neri wrote:
> On Thu, May 11, 2023 at 03:55:18PM -0400, Radu Rendec wrote:
> > On Wed, 2023-05-10 at 17:00 -0700, Ricardo Neri wrote:
> > > On Wed, May 10, 2023 at 04:44:49PM -0400, Radu Rendec wrote:
> > > > On Wed, 2023-05-10 at 12:12 -0700, Ricardo Neri wrote:
> > > > > On Wed, Apr 12, 2023 at 02:57:57PM -0400, Radu Rendec wrote:
> > > > > > This patch gives architecture specific code the ability to initialize
> > > > > > the cache level and allocate cacheinfo memory early, when cache level
> > > > > > initialization runs on the primary CPU for all possible CPUs.
> > > > [cut]
> > > > > > -int detect_cache_attributes(unsigned int cpu)
> > > > > > +static inline int init_level_allocate_ci(unsigned int cpu)
> > > > > >  {
> > > > > > -       int ret;
> > > > > > +       unsigned int early_leaves = cache_leaves(cpu);
> > > > > >  
> > > > > >         /* Since early initialization/allocation of the cacheinfo is allowed
> > > > > >          * via fetch_cache_info() and this also gets called as CPU hotplug
> > > > > >          * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
> > > > > >          * as it will happen only once (the cacheinfo memory is never freed).
> > > > > > -        * Just populate the cacheinfo.
> > > > > > +        * Just populate the cacheinfo. However, if the cacheinfo has been
> > > > > > +        * allocated early through the arch-specific early_cache_level() call,
> > > > > > +        * there is a chance the info is wrong (this can happen on arm64). In
> > > > > > +        * that case, call init_cache_level() anyway to give the arch-specific
> > > > > > +        * code a chance to make things right.
> > > > > >          */
> > > > > > -       if (per_cpu_cacheinfo(cpu))
> > > > > > -               goto populate_leaves;
> > > > > > +       if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_ci_levels)
> > > > > > +               return 0;
> > > > > >  
> > > > > >         if (init_cache_level(cpu) || !cache_leaves(cpu))
> > > > > >                 return -ENOENT;
> > > > > >  
> > > > > > -       ret = allocate_cache_info(cpu);
> > > > > > +       /*
> > > > > > +        * Now that we have properly initialized the cache level info, make
> > > > > > +        * sure we don't try to do that again the next time we are called
> > > > > > +        * (e.g. as CPU hotplug callbacks).
> > > > > > +        */
> > > > > > +       ci_cacheinfo(cpu)->early_ci_levels = false;
> > > > > > +
> > > > > > +       if (cache_leaves(cpu) <= early_leaves)
> > > > > > +               return 0;
> > > > > > +
> > > > >
> > > > > I had posted a patchset[1] for x86 that initializes
> > > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > >
> > > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > > fetch_cache_info().
> > > > >
> > > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > > take care of the case in which early and late cache leaves remain the same:
> > > > >
> > > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > >
> > > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > > last_level_cache_is_valid().
> > > > >
> > > > > I can post a patch with this fix if it makes sense.
> > > > >
> > > > > [1]. https://lore.kernel.org/all/20230424001956.21434-3-ricardo.neri-calderon@xxxxxxxxxxxxxxx/
> > > >
> > > > Thanks for bringing this to my attention. I need to run some tests on
> > > > x86 (I did all that work/testing on arm64) and wrap my head around it.
> > > >
> > > > While I don't see any problem with the fix you're proposing, I'm afraid
> > > > it may circle back to the other problem I tried to fix initially. Have
> > > > you tested this on an RT kernel by any chance?
> > >
> > > That is a good point. I did not test on an RT kernel. I'll try that.
> >
> > It looks like the flow is much simpler on x86: detect_cache_attributes()
> > is called only once for each CPU, and it's called in kthread context.
> >
> > I haven't tested on an RT kernel but I think it should be fine. I put a
> > msleep() there and saw no issues, which means kmalloc() on RT should be
> > fine as well.
>
> I booted the realtime kernel [3] with CONFIG_PREEMPT_RT and did not observe
> the BUG splat. I tried before your patchset. Were you able to reproduce on
> x86? Also, I was not able to reproduce the BUG splat after your changes +
> [1] + my earlier suggested patch in this thread.

Thanks for trying this out. I think the BUG splat cannot be reproduced
on x86, either with or without my fix because detect_cache_attributes()
is always called in kthread context, and that makes kmalloc() happy on
RT kernels.

At the time when I first asked you about the RT kernel, I hadn't looked
closely at the x86 code yet, and I was unaware of the (much simpler)
flow and the kthread context. Sorry for the confusion!

In any case, your test on the RT kernel validates the conclusion that
we already came to, and eliminates any trace of doubt. FWIW, I was
testing on arm64 when I found the bug and created the fix. Things are
much more complicated there, and detect_cache_attributes() is called
twice for each CPU (the second time with preemption disabled).

Best regards,
Radu