Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE

From: Mikulas Patocka
Date: Thu Apr 26 2018 - 17:10:06 EST




On Thu, 26 Apr 2018, Christopher Lameter wrote:

> On Wed, 25 Apr 2018, Mikulas Patocka wrote:
>
> > Do you want this? It deletes slab_order and replaces it with the
> > "minimize_waste" logic directly.
>
> Well yes that looks better. Now we need to make it easy to read and less
> complicated. Maybe try to keep as much as possible of the old code
> and also the names of variables to make it easier to review?
>
> > It simplifies the code and it is very similar to the old algorithms, most
> > slab caches have the same order, so it shouldn't cause any regressions.
> >
> > This patch changes order of these slabs:
> > TCPv6: 3 -> 4
> > sighand_cache: 3 -> 4
> > task_struct: 3 -> 4
>
> Hmmm... order 4 for these caches may cause some concern. These should stay
> under costly order I think. Otherwise allocations are no longer
> guaranteed.

You said that slub has fallback to smaller order allocations.

The whole purpose of this "minimize waste" approach is to use higher-order
allocations to use memory more efficiently, so it is just doing its job.
(for these 3 caches, order-4 really wastes less memory than order-3 - on
my system TCPv6 and sighand_cache have size 2112, task_struct 2752).

We could improve the fallback code, so that if order-4 allocation fails,
it tries order-3 allocation, and then falls back to order-0. But I think
that these failures are rare enough that it is not a problem.

> > @@ -3269,35 +3245,35 @@ static inline int calculate_order(unsign
> > max_objects = order_objects(slub_max_order, size, reserved);
> > min_objects = min(min_objects, max_objects);
> >
> > - while (min_objects > 1) {
> > - unsigned int fraction;
> > + /* Get the minimum acceptable order for one object */
> > + order = get_order(size + reserved);
> > +
> > + for (test_order = order + 1; test_order < MAX_ORDER; test_order++) {
> > + unsigned order_obj = order_objects(order, size, reserved);
> > + unsigned test_order_obj = order_objects(test_order, size, reserved);
> > +
> > + /* If there are too many objects, stop searching */
> > + if (test_order_obj > MAX_OBJS_PER_PAGE)
> > + break;
> >
> > - fraction = 16;
> > - while (fraction >= 4) {
> > - order = slab_order(size, min_objects,
> > - slub_max_order, fraction, reserved);
> > - if (order <= slub_max_order)
> > - return order;
> > - fraction /= 2;
> > - }
> > - min_objects--;
> > + /* Always increase up to slub_min_order */
> > + if (test_order <= slub_min_order)
> > + order = test_order;
>
> Well that is a significant change. In our current scheme the order
> boundart wins.

I think it's not a change. The existing function slab_order() starts with
min_order (unless it overshoots MAX_OBJS_PER_PAGE) and then goes upwards.
My code does the same - my code tests for MAX_OBJS_PER_PAGE (and bails out
if we would overshoot it) and increases the order until it reaches
slub_min_order (and then increases it even more if it satisfies the other
conditions).

If you believe that it behaves differently, please describe the situation
in detail.

> > +
> > + /* If we are below min_objects and slub_max_order, increase order */
> > + if (order_obj < min_objects && test_order <= slub_max_order)
> > + order = test_order;
> > +
> > + /* Increase order even more, but only if it reduces waste */
> > + if (test_order_obj <= 32 &&
>
> Where does the 32 come from?

It is to avoid extremely high order for extremely small slabs.

For example, see kmalloc-96.
10922 96-byte objects would fit into 1MiB
21845 96-byte objects would fit into 2MiB

The algorithm would recognize this one more object that fits into 2MiB
slab as "waste reduction" and increase the order to 2MiB - and we don't
want this.

So, the general reasoning is - if we have 32 objects in a slab, then it is
already considered that wasted space is reasonably low and we don't want
to increase the order more.

Currently, kmalloc-96 uses order-0 - that is reasonable (we already have
42 objects in 4k page, so we don't need to use higher order, even if it
wastes one-less object).

> > + test_order_obj > order_obj << (test_order - order))
>
> Add more () to make the condition better readable.
>
> > + order = test_order;
>
> Can we just call test_order order and avoid using the long variable names
> here? Variable names in functions are typically short.

You need two variables - "order" and "test_order".

"order" is the best order found so far and "test_order" is the order that
we are now testing. If "test_order" wastes less space than "order", we
assign order = test_order.

Mikulas