Re: [PATCH v3 1/2] cacheinfo: Add arch specific early level initializer

From: Radu Rendec
Date: Wed Apr 12 2023 - 09:31:52 EST


On Wed, 2023-04-12 at 12:36 +0100, Sudeep Holla wrote:
> On Thu, Apr 06, 2023 at 07:39:25PM -0400, Radu Rendec wrote:
> > This patch gives of architecture specific code the ability to initialize
> > the cache level and allocate cacheinfo memory early, when cache level
> > initialization runs on the primary CPU for all possible CPUs.
> >
> > This is part of a patch series that attempts to further the work in
> > commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
> > Previously, in the absence of any DT/ACPI cache info, architecture
> > specific cache detection and info allocation for secondary CPUs would
> > happen in non-preemptible context during early CPU initialization and
> > trigger a "BUG: sleeping function called from invalid context" splat on
> > an RT kernel.
> >
> > More specifically, this patch adds the early_cache_level() function,
> > which is called by fetch_cache_info() as a fallback when the number of
> > cache leaves cannot be extracted from DT/ACPI. In the default generic
> > (weak) implementation, this new function returns -ENOENT, which
> > preserves the original behavior for architectures that do not implement
> > the function.
> >
> > Since early detection can get the number of cache leaves wrong in some
> > cases*, additional logic is added to still call init_cache_level() later
> > on the secondary CPU, therefore giving the architecture specific code an
> > opportunity to go back and fix the initial guess. Again, the original
> > behavior is preserved for architectures that do not implement the new
> > function.
> >
> > * For example, on arm64, CLIDR_EL1 detection works only when it runs on
> >   the current CPU. In other words, a CPU cannot detect the cache depth
> >   for any other CPU than itself.
> >
>
> Thanks for the detailed description and putting this together.

No problem. Happy to help!

> > Signed-off-by: Radu Rendec <rrendec@xxxxxxxxxx>
> > ---
> >  drivers/base/cacheinfo.c  | 75 +++++++++++++++++++++++++++------------
> >  include/linux/cacheinfo.h |  2 ++
> >  2 files changed, 55 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> > index f6573c335f4c..30f5553d3ebb 100644
> > --- a/drivers/base/cacheinfo.c
> > +++ b/drivers/base/cacheinfo.c
> > @@ -398,6 +398,11 @@ static void free_cache_attributes(unsigned int cpu)
> >         cache_shared_cpu_map_remove(cpu);
> >  }
> >  
> > +int __weak early_cache_level(unsigned int cpu)
> > +{
> > +       return -ENOENT;
> > +}
> > +
> >  int __weak init_cache_level(unsigned int cpu)
> >  {
> >         return -ENOENT;
> > @@ -423,56 +428,82 @@ int allocate_cache_info(int cpu)
> >  
> >  int fetch_cache_info(unsigned int cpu)
> >  {
> > -       struct cpu_cacheinfo *this_cpu_ci;
> > +       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> >         unsigned int levels = 0, split_levels = 0;
> >         int ret;
> >  
> >         if (acpi_disabled) {
> >                 ret = init_of_cache_level(cpu);
> > -               if (ret < 0)
> > -                       return ret;
> >         } else {
> >                 ret = acpi_get_cache_info(cpu, &levels, &split_levels);
> > -               if (ret < 0)
> > +               if (!ret) {
> > +                       this_cpu_ci->num_levels = levels;
> > +                       /*
> > +                        * This assumes that:
> > +                        * - there cannot be any split caches (data/instruction)
> > +                        *   above a unified cache
> > +                        * - data/instruction caches come by pair
> > +                        */
> > +                       this_cpu_ci->num_leaves = levels + split_levels;
> > +               }
> > +       }
> > +
> > +       if (ret || !cache_leaves(cpu)) {
> > +               ret = early_cache_level(cpu);
> > +               if (ret)
> >                         return ret;
> >  
> > -               this_cpu_ci = get_cpu_cacheinfo(cpu);
> > -               this_cpu_ci->num_levels = levels;
> > -               /*
> > -                * This assumes that:
> > -                * - there cannot be any split caches (data/instruction)
> > -                *   above a unified cache
> > -                * - data/instruction caches come by pair
> > -                */
> > -               this_cpu_ci->num_leaves = levels + split_levels;
> > +               if (!cache_leaves(cpu))
> > +                       return -ENOENT;
> > +
> > +               this_cpu_ci->early_arch_info = true;
> >         }
> > -       if (!cache_leaves(cpu))
> > -               return -ENOENT;
> >  
> >         return allocate_cache_info(cpu);
> >  }
> >  
> > -int detect_cache_attributes(unsigned int cpu)
> > +static inline int init_level_allocate_ci(unsigned int cpu)
> >  {
> > -       int ret;
> > +       unsigned int early_leaves = cache_leaves(cpu);
> >  
> >         /* Since early initialization/allocation of the cacheinfo is allowed
> >          * via fetch_cache_info() and this also gets called as CPU hotplug
> >          * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
> >          * as it will happen only once (the cacheinfo memory is never freed).
> > -        * Just populate the cacheinfo.
> > +        * Just populate the cacheinfo. However, if the cacheinfo has been
> > +        * allocated early through the arch-specific early_cache_level() call,
> > +        * there is a chance the info is wrong (this can happen on arm64). In
> > +        * that case, call init_cache_level() anyway to give the arch-specific
> > +        * code a chance to make things right.
> >          */
> > -       if (per_cpu_cacheinfo(cpu))
> > -               goto populate_leaves;
> > +       if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_arch_info)
> > +               return 0;
> >  
> >         if (init_cache_level(cpu) || !cache_leaves(cpu))
> >                 return -ENOENT;
> >  
> > -       ret = allocate_cache_info(cpu);
> > +       /*
> > +        * Now that we have properly initialized the cache level info, make
> > +        * sure we don't try to do that again the next time we are called
> > +        * (e.g. as CPU hotplug callbacks).
> > +        */
> > +       ci_cacheinfo(cpu)->early_arch_info = false;
>
> I am wondering if it makes sense to rename this as early_ci_levels or
> something similar to indicate it is to do with just level information only ?
> If not, it needs to be documented if the variable is not more specific.
> I am sure I will forget it and will be wondering to understand in few
> months time 😄.

Now that you mentioned it, I think it make perfect sense to rename it.
I like early_ci_levels, I will use that in v4.

> Other than that, it looks good. I will try to push this for v6.4 but it
> may be bit late as it is good to have it in -next for sometime to get more
> testing. Anyways send v4, will put it into -next ASAP and see what is the
> best course of action after that.

Sounds great. Thanks for reviewing the patches and for your input!

Best regards,
Radu