Re: [PATCH 2/2] SLUB: Disable debugging if it increases the minimum page order

From: Larry Finger
Date: Thu Jun 11 2009 - 12:49:22 EST


Christoph Lameter wrote:
> On Thu, 11 Jun 2009, Pekka Enberg wrote:
>
>> My main point is that a lot of _testers_ will probably enable all SLUB
>> debugging by default because we encourage them to and it's pretty bad
>> that we end up causing order 1 allocations and oom conditions.
>
> Other test methods (like PAGE_ALLOC debugging) also have significant side
> effects.
>
>> So I still think we need to fix _at minimum_ the kmalloc-4096 case
>> (assuming Larry won't hit the same problem still). I see you're not
>> happy with my patch so any suggestions how to handle that?
>
> Add a warning to Kconfig that the higher order page allocations may
> increase with debugging on for caches with object sizes near or equal to
> PAGE_SIZE?
>
> It's good to run with full debugging on for even the 4k sized caches.
> Otherwise we won't be catching overruns there. But the debugging can cause
> some side effects.

This is obviously an extreme corner case. Both Ubuntu and openSUSE
distribute kernels with SLAB enabled, so the bulk of users will
never get into this situation. In addition, I'm not aware of any other
testers who have reported this condition.

If the Kconfig warning is too strong, then even the testers might not
turn SLUB debugging on, and you lose people like me. A printk
indicating that debugging was turned off for a particular request
would have been useful for my current test, as we would then know that
the fix might have saved an OOM condition. The total number of
such printks should be limited, though, as I wouldn't want my logs
polluted too much.
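
For illustration only, the capped notice described above can be modelled in
user space. This is a minimal Python sketch, not kernel code and not part of
the patch; the class name CappedNotice, the default limit of 10, and the
message text are all hypothetical.

```python
class CappedNotice:
    """Emit a notice at most `limit` times, then stay silent.

    Hypothetical user-space model of a capped log message; the name and
    the default limit are assumptions, not anything from the patch.
    """

    def __init__(self, limit=10):
        self.limit = limit
        self.seen = 0  # counts every request, including suppressed ones

    def emit(self, message, sink=print):
        """Send `message` to `sink` unless the cap has been reached.

        Returns True if the message was emitted, False if suppressed.
        """
        self.seen += 1
        if self.seen > self.limit:
            return False
        # Mark the last permitted message so the reader knows more were dropped.
        suffix = " (further notices suppressed)" if self.seen == self.limit else ""
        sink(message + suffix)
        return True
```

Because the counter keeps running after the cap is reached, `seen` still
reports how often debugging would have been turned off even when the log
itself stays short.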

Is it easy to test the number of available order-1 fragments when
debugging would increase the minimum order from 0 to 1, and only turn
off debugging if the available number is small?
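
The check asked about above can be sketched as a small decision function.
This is a user-space Python model under stated assumptions, not the SLUB
implementation: the function names, the 64-byte debug overhead used in the
usage note, and the min_free threshold of 100 are all hypothetical.

```python
def min_order(object_size, page_size=4096):
    """Smallest page order whose block holds one object of `object_size` bytes."""
    order = 0
    while (page_size << order) < object_size:
        order += 1
    return order


def keep_debugging(order_plain, order_debug, free_at_debug_order, min_free=100):
    """Decide whether to keep debugging enabled for one cache.

    order_plain         -- minimum order without debug metadata
    order_debug         -- minimum order with debug metadata added
    free_at_debug_order -- free blocks currently available at order_debug
    min_free            -- hypothetical threshold; an assumption of this sketch
    """
    if order_debug <= order_plain:
        return True  # debugging does not raise the order, nothing to decide
    # Debugging raises the order: keep it only while such blocks are plentiful.
    return free_at_debug_order >= min_free
```

With a 4096-byte object and an assumed 64 bytes of debug metadata,
min_order(4096) is 0 and min_order(4096 + 64) is 1, which is the
kmalloc-4096 case discussed in this thread.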

After 23 hours of getting my system to a "steady state" condition, I
am currently running the following:

1. Two concurrent -j8 kernel builds with the sources on an NFS-mounted
volume with b43 as the network device.
2. A simultaneous flood ping over the network.
3. A dump of the DMA32 line of /proc/buddyinfo every 5 seconds. The
number of order-1 fragments has seldom dropped below 1000.
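
A monitor like the one in item 3 can be sketched in a few lines of Python.
This is one possible way to do it, not the script actually used for the test
above; the function names are hypothetical, and the parser assumes the usual
/proc/buddyinfo layout of "Node N, zone NAME" followed by one free-block
count per order, starting at order 0.

```python
import time


def parse_buddyinfo_line(line):
    """Return (node, zone, counts) for one /proc/buddyinfo line."""
    fields = line.split()
    # Assumed layout: "Node", "<n>,", "zone", "<name>", then one count per order.
    if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
        raise ValueError(f"unrecognized buddyinfo line: {line!r}")
    node = int(fields[1].rstrip(","))
    zone = fields[3]
    counts = [int(f) for f in fields[4:]]
    return node, zone, counts


def free_blocks(text, zone, order):
    """Sum the free blocks of `order` for `zone` across all nodes."""
    total = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        _, name, counts = parse_buddyinfo_line(line)
        if name == zone and order < len(counts):
            total += counts[order]
    return total


def monitor(zone="DMA32", order=1, interval=5.0, path="/proc/buddyinfo"):
    """Print a timestamped free-block count every `interval` seconds."""
    while True:
        with open(path) as f:
            print(time.strftime("%H:%M:%S"), free_blocks(f.read(), zone, order))
        time.sleep(interval)
```

Calling monitor() with its defaults prints the order-1 count for DMA32 every
5 seconds until interrupted.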

Larry