Re: [GIT PULL] scheduler fixes

From: Benjamin Herrenschmidt
Date: Mon May 25 2009 - 01:32:33 EST


On Sun, 2009-05-24 at 11:18 -0700, Linus Torvalds wrote:
> In fact, it would be nice to perhaps try to move it even earlier. Now you
> moved it to before the scheduler init (good!), but I do wonder if it could
> be moved up to even before the setup_per_cpu_areas() etc crud.
>
> I realize that the allocator wants to use the per-CPU area, but if we have
> just the boot CPU area set up statically at that point, since it's only
> the boot CPU running, maybe we could do those per-cpu area allocations
> without the bootmem allocator too?

Well, we want at least node information since we want per-cpu areas
to be allocated on the right node etc...

But then, bootmem has them, so we should be able to feed them off
to SL*B early.

One thing I'm wondering... Most archs I see have their own allocator
for use even before bootmem is available. On PowerPC and Sparc, we call
it LMB and it's in fact in generic code now. x86 seems to have several
layers, but the e820 early allocator seems to fit a similar bill.

I wonder if we could try to shoot bootmem that way.

Blend that with Pekka's approach, which can drastically reduce how much
we need bootmem. For the remaining bits, such as SL*B's own data
structures and the mem_map, the arch would be responsible for providing
a simple API for node-local allocations that is roughly equivalent to
whatever bits of bootmem remain and are needed.

That API would wrap whatever the arch already has for early boot
stuff.
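
To make the shape of such an API a bit more concrete, here is a rough
userspace sketch. Every name in it (early_add_region, early_alloc_node,
EARLY_MAX_NODES) is made up for illustration and none of it is existing
kernel code. It assumes the arch's early allocator can hand over one free
range per node, and it only shows the "node local, fall back elsewhere"
behaviour with a trivial bump allocator:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: none of these names exist in the kernel. */
#define EARLY_MAX_NODES 4

struct early_region {
	uintptr_t base;	/* next free byte owned by this node */
	uintptr_t end;	/* one past the last usable byte */
};

/* Stand-in for what the arch's early allocator (LMB, e820, ...) knows. */
static struct early_region early_regions[EARLY_MAX_NODES];

/* Arch side: register one free range for a node. */
static void early_add_region(int nid, void *base, size_t size)
{
	early_regions[nid].base = (uintptr_t)base;
	early_regions[nid].end = (uintptr_t)base + size;
}

/* Bump-allocate from a single node; align must be a power of two. */
static void *early_alloc_from(int nid, size_t size, size_t align)
{
	struct early_region *r = &early_regions[nid];
	uintptr_t start = (r->base + align - 1) & ~((uintptr_t)align - 1);

	/* Reject on wrap-around or if the request does not fit. */
	if (start < r->base || start > r->end || size > r->end - start)
		return NULL;

	r->base = start + size;
	return (void *)start;
}

/* Generic side: prefer node nid, fall back to any other node. */
void *early_alloc_node(int nid, size_t size, size_t align)
{
	void *p;
	int i;

	if (nid < 0 || nid >= EARLY_MAX_NODES)
		return NULL;

	p = early_alloc_from(nid, size, align);
	for (i = 0; !p && i < EARLY_MAX_NODES; i++)
		if (i != nid)
			p = early_alloc_from(i, size, align);
	return p;
}
```

In this sketch the arch would call early_add_region() once per node from
its existing early setup, and the SL*B bootstrap or the mem_map setup
would only ever see early_alloc_node(). There is deliberately no free
path here; whether one is needed at all that early is an open question.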

Finally, we can keep bootmem around in lib/ or such for archs that
don't want to convert or don't have an existing suitable early
allocator.

Cheers,
Ben.



