Re: [RFC 0/2] An attempt to improve SLUB on NUMA / under memory pressure

From: Jay Patel
Date: Fri Aug 18 2023 - 03:12:26 EST


On Fri, 2023-08-11 at 03:06 +0900, Hyeonggon Yoo wrote:
> On Thu, Aug 10, 2023 at 7:56 PM Jay Patel <jaypatel@xxxxxxxxxxxxx>
> wrote:
> > On Mon, 2023-07-24 at 04:09 +0900, Hyeonggon Yoo wrote:
> > > Hello folks,
> > >
> > > This series is motivated by kernel test bot report [1] on Jay's patch
> > > that modifies slab order. While the patch was not merged and not in
> > > its final form, I think it was a good lesson that changing slab order
> > > has more impact on performance than we expected.
> > >
> > > While inspecting the report, I found some potential points to improve
> > > SLUB. [2] It's _potential_ because it shows no improvement on
> > > hackbench, but I believe more realistic workloads would benefit from
> > > this. Due to lack of resources and my limited understanding of
> > > *realistic* workloads, I am asking you to help evaluate this together.
> >
> > Hi Hyeonggon,
> > I ran the hackbench test on a PowerPC machine with 16 CPUs and
> > saw a ~32% regression with the patch applied.
>
> Thank you so much for measuring this! That's very helpful.
> It's interesting because on an AMD machine with 2 NUMA nodes there was
> not much difference.
>
> Does it have more than one socket?

I tested on a single-socket system.
>
> Could you confirm if the offending patch is patch 1 or 2?
> If the offending one is patch 2, can you please check how high the L3
> cache miss rate is during hackbench?
>
The regression below is caused by patch 1, Revert "mm, slub: change
percpu partial accounting from objects to pages".

Thanks
Jay Patel

> > Results are as follows:
> >
> > +-------+----+---------+------------+------------+
> > |       |    | Normal  | With Patch |            |
> > +-------+----+---------+------------+------------+
> > | Amean | 1  |  1.3700 |     2.0353 | ( -32.69%) |
> > | Amean | 4  |  5.1663 |     7.6563 | ( -32.52%) |
> > | Amean | 7  |  8.9180 |    13.3353 | ( -33.13%) |
> > | Amean | 12 | 15.4290 |    23.0757 | ( -33.14%) |
> > | Amean | 21 | 27.3333 |    40.7823 | ( -32.98%) |
> > | Amean | 30 | 38.7677 |    58.5300 | ( -33.76%) |
> > | Amean | 48 | 62.2987 |    92.9850 | ( -33.00%) |
> > | Amean | 64 | 82.8993 |   123.4717 | ( -32.86%) |
> > +-------+----+---------+------------+------------+
> >
> > Thanks
> > Jay Patel
> > > It only consists of two patches. Patch #1 addresses inaccuracy in
> > > SLUB's heuristic, which can negatively affect workloads' performance
> > > when large folios are not available from buddy.
> > >
> > > Patch #2 changes SLUB's behavior when there are no slabs available on
> > > the local node's partial slab list, increasing NUMA locality when
> > > there is memory available (without reclamation) on the local node
> > > from buddy.
> > >
> > > This is at an early stage, but I think it's good enough to start a
> > > discussion. Any feedback and ideas are welcome. Thank you in advance!
> > >
> > > Hyeonggon
> > >
> > > [1] https://lore.kernel.org/linux-mm/202307172140.3b34825a-oliver.sang@xxxxxxxxx
> > > [2] https://lore.kernel.org/linux-mm/CAB=+i9S6Ykp90+4N1kCE=hiTJTE4wzJDi8k5pBjjO_3sf0aeqg@xxxxxxxxxxxxxx
> > >
> > > Hyeonggon Yoo (2):
> > >   Revert "mm, slub: change percpu partial accounting from objects to pages"
> > >   mm/slub: prefer NUMA locality over slight memory saving on NUMA machines
> > >
> > >  include/linux/slub_def.h |  2 --
> > >  mm/slab.h                |  6 ++++
> > >  mm/slub.c                | 76 ++++++++++++++++++++++++++--------------
> > >  3 files changed, 55 insertions(+), 29 deletions(-)
> > >