Re: SLUB BUG: check_slab called with interrupts enabled

From: Christoph Lameter
Date: Wed Jun 15 2011 - 11:45:13 EST


On Wed, 15 Jun 2011, Rik van Riel wrote:

> > There are no additional special slub patches applied right? Because some
> > of the patches under discussion change the interrupt disable handling a
> > bit.
>
> Just the two attached ones, which don't seem to touch the
> code path in question...

I also do not see how these could break something. But they are mucking
around with the __GFP_WAIT flag. __GFP_WAIT determines the reenabling and
redisabling of interrupts in __slab_alloc(). If some variable gets
corrupted then this could be the result.
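
A rough userspace model of that reasoning, for illustration only. This is
not the kernel code: MODEL_GFP_WAIT is a made-up bit value, irqs_enabled
stands in for the CPU interrupt state, and model_slab_alloc() and
model_demo() are hypothetical names. It only shows that if the flag is
seen when interrupts are reenabled but no longer seen when they should be
redisabled, the function returns with interrupts still on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit value; the kernel's real __GFP_WAIT is defined elsewhere. */
#define MODEL_GFP_WAIT 0x10u

/* Stands in for the CPU interrupt flag. */
static bool irqs_enabled;

/*
 * Model of the slow path: entered with interrupts off and expected to
 * return with interrupts off. flags_after lets the caller simulate the
 * flags variable changing across the call that would be new_slab().
 * Returns true if interrupts were left enabled (invariant broken).
 */
static bool model_slab_alloc(unsigned int gfpflags, unsigned int flags_after)
{
	irqs_enabled = false;

	if (gfpflags & MODEL_GFP_WAIT)
		irqs_enabled = true;	/* models local_irq_enable() */

	gfpflags = flags_after;		/* new_slab() would run here */

	if (gfpflags & MODEL_GFP_WAIT)
		irqs_enabled = false;	/* models local_irq_disable() */

	return irqs_enabled;
}

/* Usage: an intact flag restores the state, a lost flag does not. */
static void model_demo(void)
{
	assert(!model_slab_alloc(MODEL_GFP_WAIT, MODEL_GFP_WAIT));
	assert(model_slab_alloc(MODEL_GFP_WAIT, 0u));
}
```

In this model only the second case leaves interrupts enabled, which is
the kind of state the check_slab warning would catch.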

Print out the value of gfpflags before and after the call to new_slab()
from __slab_alloc()?

From linux-fsdevel-owner@xxxxxxxxxxxxxxx Fri May 13 10:04:18 2011
From: Mel Gorman <mgorman@xxxxxxx>
To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>,
Colin King <colin.king@xxxxxxxxxxxxx>,
Raghavendra D Prabhu <raghu.prabhu13@xxxxxxxxx>,
Jan Kara <jack@xxxxxxx>, Chris Mason <chris.mason@xxxxxxxxxx>,
Christoph Lameter <cl@xxxxxxxxx>,
Pekka Enberg <penberg@xxxxxxxxxx>,
Rik van Riel <riel@xxxxxxxxxx>,
Johannes Weiner <hannes@xxxxxxxxxxx>,
linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>,
linux-mm <linux-mm@xxxxxxxxx>,
linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>,
linux-ext4 <linux-ext4@xxxxxxxxxxxxxxx>,
Mel Gorman <mgorman@xxxxxxx>
Subject: [PATCH 3/4] mm: slub: Do not take expensive steps for SLUBs speculative high-order allocations
Date: Fri, 13 May 2011 15:03:23 +0100
Message-Id: <1305295404-12129-4-git-send-email-mgorman@xxxxxxx>
X-Mailing-List: linux-fsdevel@xxxxxxxxxxxxxxx

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower allocations if they
fail. However, by simply trying to allocate, the caller can enter
compaction or reclaim - both of which are likely to cost more than the
benefit of using high-order pages in SLUB. On a desktop system, two
users report that the system is getting stalled with kswapd using large
amounts of CPU.

This patch prevents SLUB taking any expensive steps when trying to use
high-order allocations. Instead, it is expected to fall back to smaller
orders more aggressively. Testing was somewhat inconclusive on how much
this helped but it makes sense that falling back to order-0 allocations
is faster than entering compaction or direct reclaim.

Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
---
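
Illustration only, not part of the change: a small userspace sketch of
the two mask computations in the mm/slub.c hunk and of the extra check in
gfp_to_alloc_flags(). All M_* bit values are made up for the example and
do not match the kernel's definitions; the function names are
hypothetical. It is meant to show why the page_alloc.c hunk is needed:
once __GFP_WAIT is cleared, the speculative request would otherwise look
like an atomic one and be allowed to allocate harder.

```c
#include <assert.h>

/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define M_GFP_WAIT	0x001u
#define M_GFP_NOWARN	0x002u
#define M_GFP_NORETRY	0x004u
#define M_GFP_NOFAIL	0x008u
#define M_GFP_REPEAT	0x010u
#define M_GFP_NO_KSWAPD	0x020u
#define M_GFP_NOMEMALLOC 0x040u
#define M_ALLOC_HARDER	0x100u

/* Mask as computed on the removed line of the mm/slub.c hunk. */
static unsigned int slub_mask_before(unsigned int flags)
{
	return (flags | M_GFP_NOWARN | M_GFP_NORETRY | M_GFP_NO_KSWAPD) &
	       ~M_GFP_NOFAIL;
}

/* Mask as computed on the added lines of the mm/slub.c hunk. */
static unsigned int slub_mask_after(unsigned int flags)
{
	return (flags | M_GFP_NOWARN | M_GFP_NO_KSWAPD) &
	       ~(M_GFP_NOFAIL | M_GFP_WAIT | M_GFP_REPEAT);
}

/*
 * Simplified model of the !wait branch. With check_kswapd == 0 it
 * behaves like the old condition, with check_kswapd != 0 like the new one.
 */
static unsigned int model_alloc_harder(unsigned int gfp, int check_kswapd)
{
	int wait = (gfp & M_GFP_WAIT) != 0;
	int can_wake_kswapd = !(gfp & M_GFP_NO_KSWAPD);

	if (!wait && (!check_kswapd || can_wake_kswapd)) {
		if (!(gfp & M_GFP_NOMEMALLOC))
			return M_ALLOC_HARDER;
	}
	return 0;
}

/* Usage: the speculative mask only gets ALLOC_HARDER under the old check. */
static void mask_demo(void)
{
	unsigned int spec = slub_mask_after(M_GFP_WAIT);

	assert(model_alloc_harder(spec, 0) == M_ALLOC_HARDER);
	assert(model_alloc_harder(spec, 1) == 0);
}
```

A request that never had the wait bit and does not set the no-kswapd bit
is still treated as before in this model.
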
 mm/page_alloc.c |    3 ++-
 mm/slub.c       |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..057f1e2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
+	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	 */
 	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
 
-	if (!wait) {
+	if (!wait && can_wake_kswapd) {
 		/*
 		 * Not worth trying to allocate harder for
 		 * __GFP_NOMEMALLOC even if it can't schedule.
diff --git a/mm/slub.c b/mm/slub.c
index 98c358d..c5797ab 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NO_KSWAPD) &
+			~(__GFP_NOFAIL | __GFP_WAIT | __GFP_REPEAT);
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
--
1.7.3.4
From linux-fsdevel-owner@xxxxxxxxxxxxxxx Fri May 13 10:04:00 2011
From: Mel Gorman <mgorman@xxxxxxx>
To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>,
Colin King <colin.king@xxxxxxxxxxxxx>,
Raghavendra D Prabhu <raghu.prabhu13@xxxxxxxxx>,
Jan Kara <jack@xxxxxxx>, Chris Mason <chris.mason@xxxxxxxxxx>,
Christoph Lameter <cl@xxxxxxxxx>,
Pekka Enberg <penberg@xxxxxxxxxx>,
Rik van Riel <riel@xxxxxxxxxx>,
Johannes Weiner <hannes@xxxxxxxxxxx>,
linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>,
linux-mm <linux-mm@xxxxxxxxx>,
linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>,
linux-ext4 <linux-ext4@xxxxxxxxxxxxxxx>,
Mel Gorman <mgorman@xxxxxxx>
Subject: [PATCH 2/4] mm: slub: Do not wake kswapd for SLUBs speculative high-order allocations
Date: Fri, 13 May 2011 15:03:22 +0100
Message-Id: <1305295404-12129-3-git-send-email-mgorman@xxxxxxx>
X-Mailing-List: linux-fsdevel@xxxxxxxxxxxxxxx

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower allocations if they
fail. However, by simply trying to allocate, kswapd is woken up to
start reclaiming at that order. On a desktop system, two users report
that the system is getting locked up with kswapd using large amounts
of CPU. Using SLAB instead of SLUB made this problem go away.

This patch prevents kswapd being woken up for high-order allocations.
Testing indicated that with this patch applied, the system was much
harder to hang and even when it did, it eventually recovered.

Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
---
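
Illustration only, not part of the change: a tiny userspace model of the
effect described above, assuming the allocator slow path wakes kswapd
unless the request carries the no-kswapd flag. M_GFP_NO_KSWAPD is a
made-up bit value and the model_* names are hypothetical; the counter
stands in for actually waking the reclaim thread.

```c
#include <assert.h>

/* Hypothetical bit value; not the kernel's definition. */
#define M_GFP_NO_KSWAPD 0x020u

/* Counts how often the model would have woken kswapd. */
static int kswapd_wakeups;

static void model_wake_all_kswapd(int order)
{
	(void)order;	/* the order is not used by this model */
	kswapd_wakeups++;
}

/*
 * Model of the decision at slow-path entry.
 * Returns 1 if kswapd was woken for this request, 0 otherwise.
 */
static int model_slowpath_entry(unsigned int gfp_mask, int order)
{
	int before = kswapd_wakeups;

	if (!(gfp_mask & M_GFP_NO_KSWAPD))
		model_wake_all_kswapd(order);

	return kswapd_wakeups != before;
}

/* Usage: a speculative order-3 request with the flag set wakes nothing. */
static void kswapd_demo(void)
{
	assert(model_slowpath_entry(M_GFP_NO_KSWAPD, 3) == 0);
	assert(model_slowpath_entry(0u, 0) == 1);
}
```
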
 mm/slub.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 9d2e5e4..98c358d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
--
1.7.3.4