Re: [next] arm64: boot failed - next-20220606

From: Naresh Kamboju
Date: Fri Jun 10 2022 - 06:59:20 EST


Hi Roman,

On Thu, 9 Jun 2022 at 22:57, Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
>
> On Tue, Jun 07, 2022 at 11:00:39AM +0530, Naresh Kamboju wrote:
> > On Mon, 6 Jun 2022 at 17:16, Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
> > >
> > > Linux next-20220606 arm64 boot failed. The kernel boot log is empty.
> > > I am bisecting this problem.
> > >
> > > Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> > >
> > > The initial investigation shows that:
> > >
> > > GOOD: next-20220603
> > > BAD: next-20220606
> > >
> > > Boot log:
> > > Starting kernel ...
> >
> > Linux next-20220606 and next-20220607 arm64 boot failed.
> > The kernel panic log shows up after earlycon.
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
>
> Naresh, can you, please, check if the following patch resolves the issue?
> (completely untested except for building)

I have tested this patch on top of next-20220606 and it boots successfully [1].

Tested-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>

> --
>
> From 6a454876c9a1886e3cf8e9b66dae19b326f8901a Mon Sep 17 00:00:00 2001
> From: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> Date: Thu, 9 Jun 2022 10:03:20 -0700
> Subject: [PATCH] mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe
>
> Currently mem_cgroup_from_obj() is not working properly with objects
> allocated using vmalloc(). It creates problems in some cases, when
> it's called for static objects belonging to modules or generally
> allocated using vmalloc().
>
> This patch makes mem_cgroup_from_obj() safe to be called on objects
> allocated using vmalloc().
>
> It also introduces mem_cgroup_from_slab_obj(), which is a faster
> version to use in places when we know the object is either a slab
> object or a generic slab page (e.g. when adding an object to a lru
> list).
>
> Suggested-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> Signed-off-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> ---
> include/linux/memcontrol.h | 6 ++++
> mm/list_lru.c | 2 +-
> mm/memcontrol.c | 71 +++++++++++++++++++++++++++-----------
> 3 files changed, 57 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0d7584e2f335..4d31ce55b1c0 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1761,6 +1761,7 @@ static inline int memcg_kmem_id(struct mem_cgroup *memcg)
> }
>
> struct mem_cgroup *mem_cgroup_from_obj(void *p);
> +struct mem_cgroup *mem_cgroup_from_slab_obj(void *p);
>
> static inline void count_objcg_event(struct obj_cgroup *objcg,
> enum vm_event_item idx)
> @@ -1858,6 +1859,11 @@ static inline struct mem_cgroup *mem_cgroup_from_obj(void *p)
> return NULL;
> }
>
> +static inline struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
> +{
> + return NULL;
> +}
> +
> static inline void count_objcg_event(struct obj_cgroup *objcg,
> enum vm_event_item idx)
> {
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index ba76428ceece..a05e5bef3b40 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -71,7 +71,7 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
> if (!list_lru_memcg_aware(lru))
> goto out;
>
> - memcg = mem_cgroup_from_obj(ptr);
> + memcg = mem_cgroup_from_slab_obj(ptr);
> if (!memcg)
> goto out;
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4093062c5c9b..8c408d681377 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -783,7 +783,7 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
> struct lruvec *lruvec;
>
> rcu_read_lock();
> - memcg = mem_cgroup_from_obj(p);
> + memcg = mem_cgroup_from_slab_obj(p);
>
> /*
> * Untracked pages have no memcg, no lruvec. Update only the
> @@ -2833,27 +2833,9 @@ int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s,
> return 0;
> }
>
> -/*
> - * Returns a pointer to the memory cgroup to which the kernel object is charged.
> - *
> - * A passed kernel object can be a slab object or a generic kernel page, so
> - * different mechanisms for getting the memory cgroup pointer should be used.
> - * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
> - * can not know for sure how the kernel object is implemented.
> - * mem_cgroup_from_obj() can be safely used in such cases.
> - *
> - * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> - * cgroup_mutex, etc.
> - */
> -struct mem_cgroup *mem_cgroup_from_obj(void *p)
> +static __always_inline
> +struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p)
> {
> - struct folio *folio;
> -
> - if (mem_cgroup_disabled())
> - return NULL;
> -
> - folio = virt_to_folio(p);
> -
> /*
> * Slab objects are accounted individually, not per-page.
> * Memcg membership data for each individual object is saved in
> @@ -2886,6 +2868,53 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
> return page_memcg_check(folio_page(folio, 0));
> }
>
> +/*
> + * Returns a pointer to the memory cgroup to which the kernel object is charged.
> + *
> + * A passed kernel object can be a slab object, vmalloc object or a generic
> + * kernel page, so different mechanisms for getting the memory cgroup pointer
> + * should be used.
> + *
> + * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
> + * can not know for sure how the kernel object is implemented.
> + * mem_cgroup_from_obj() can be safely used in such cases.
> + *
> + * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> + * cgroup_mutex, etc.
> + */
> +struct mem_cgroup *mem_cgroup_from_obj(void *p)
> +{
> + struct folio *folio;
> +
> + if (mem_cgroup_disabled())
> + return NULL;
> +
> + if (unlikely(is_vmalloc_addr(p)))
> + folio = page_folio(vmalloc_to_page(p));
> + else
> + folio = virt_to_folio(p);
> +
> + return mem_cgroup_from_obj_folio(folio, p);
> +}
> +
> +/*
> + * Returns a pointer to the memory cgroup to which the kernel object is charged.
> + * Similar to mem_cgroup_from_obj(), but faster and not suitable for objects,
> + * allocated using vmalloc().
> + *
> + * A passed kernel object must be a slab object or a generic kernel page.
> + *
> + * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> + * cgroup_mutex, etc.
> + */
> +struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
> +{
> + if (mem_cgroup_disabled())
> + return NULL;
> +
> + return mem_cgroup_from_obj_folio(virt_to_folio(p), p);
> +}
> +
> static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
> {
> struct obj_cgroup *objcg = NULL;
> --
> 2.35.3
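
For anyone following along, the address dispatch that the patch adds to
mem_cgroup_from_obj() can be modelled in user space roughly as below.
This is only a toy sketch under stated assumptions: every toy_* name,
the address ranges and the lookup table are made up for illustration and
are not part of the patch or of the kernel. It only shows why an address
from the vmalloc area needs a lookup instead of linear-map arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants: a toy address space, not real kernel layout. */
#define TOY_PAGE_SHIFT    12
#define TOY_NR_PAGES      8u                    /* toy "physical" pages */
#define TOY_LINEAR_BASE   UINT64_C(0x10000000)  /* toy linear map start */
#define TOY_VMALLOC_BASE  UINT64_C(0x80000000)  /* toy vmalloc area start */
#define TOY_VMALLOC_PAGES 4u
#define TOY_BAD_PFN       UINT32_MAX

/* Toy page table: vmalloc virtual page i is backed by this page number. */
static const uint32_t toy_vmalloc_map[TOY_VMALLOC_PAGES] = { 5, 2, 7, 0 };

/* Stand-in for is_vmalloc_addr(): a plain range check. */
static bool toy_is_vmalloc_addr(uint64_t addr)
{
	return addr >= TOY_VMALLOC_BASE &&
	       addr < TOY_VMALLOC_BASE +
		      ((uint64_t)TOY_VMALLOC_PAGES << TOY_PAGE_SHIFT);
}

/*
 * Stand-in for virt_to_folio(): pure arithmetic, meaningful only for
 * linear-map addresses. The bounds check exists only in this toy so the
 * misuse becomes visible as TOY_BAD_PFN.
 */
static uint32_t toy_linear_to_pfn(uint64_t addr)
{
	uint64_t pfn = (addr - TOY_LINEAR_BASE) >> TOY_PAGE_SHIFT;

	return pfn < TOY_NR_PAGES ? (uint32_t)pfn : TOY_BAD_PFN;
}

/* Stand-in for vmalloc_to_page(): a table lookup. */
static uint32_t toy_vmalloc_to_pfn(uint64_t addr)
{
	return toy_vmalloc_map[(addr - TOY_VMALLOC_BASE) >> TOY_PAGE_SHIFT];
}

/* Shape of the patched mem_cgroup_from_obj(): dispatch on address type. */
static uint32_t toy_obj_to_pfn(uint64_t addr)
{
	if (toy_is_vmalloc_addr(addr))
		return toy_vmalloc_to_pfn(addr);
	return toy_linear_to_pfn(addr);
}

/* Shape of mem_cgroup_from_slab_obj(): no range check, linear map only. */
static uint32_t toy_slab_obj_to_pfn(uint64_t addr)
{
	return toy_linear_to_pfn(addr);
}

/* Small self-check; call it from main() to exercise the model. */
void toy_selfcheck(void)
{
	uint64_t laddr = TOY_LINEAR_BASE + (UINT64_C(3) << TOY_PAGE_SHIFT) + 0x20;
	uint64_t vaddr = TOY_VMALLOC_BASE + (UINT64_C(1) << TOY_PAGE_SHIFT) + 0x10;

	/* Linear-map address: both helpers agree. */
	assert(toy_obj_to_pfn(laddr) == 3);
	assert(toy_slab_obj_to_pfn(laddr) == 3);

	/* vmalloc address: only the dispatching helper resolves it. */
	assert(toy_obj_to_pfn(vaddr) == 2);
	assert(toy_slab_obj_to_pfn(vaddr) == TOY_BAD_PFN);
}
```

In the toy, passing a vmalloc address to the linear-only helper is caught
by the bounds check. As far as I understand, the real virt_to_folio() is
plain address arithmetic, so the same mistake in the kernel would hand back
a pointer to an unrelated struct page rather than an error.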

[1] https://lkft.validation.linaro.org/scheduler/job/5156201

--
Linaro LKFT
https://lkft.linaro.org