Re: [benchmark] 1% performance overhead of paravirt_ops on nativekernels

From: Linus Torvalds
Date: Tue Jun 09 2009 - 11:08:32 EST




On Tue, 9 Jun 2009, Nick Piggin wrote:

> On Tue, Jun 09, 2009 at 01:17:19PM +0200, Ingo Molnar wrote:
> >
> > - The buddy allocator allocates top down, with highmem pages first.
> > So a lot of critical apps (the first ones started) will have
> > highmem footprint, and that shows up every time they use it for
> > file IO or other ops. kmap() overhead and more.
>
> Yeah this really sucks about it. OTOH, we have basically the same
> thing today with NUMA allocations and task placement.

It's not the buddy allocator. Each zone has it's own buddy list.

It's that we do the zones in order, and always start with the HIGHMEM
zone.

Which is quite reasonablefor most loads (if the page is only used as a
user mapping, we won't kmap it all that often), but it's bad for things
where we will actually want to touch it over and over again. Notably
filesystem caches that aren't just for user mappings.

> > Highmem simply enables a sucky piece of hardware so the code itself
> > has an intrinsic level of suckage, so to speak. There's not much to
> > be done about it but it's not a _big_ problem either: this type of
> > hw is moving fast out of the distro attention span.
>
> Yes but Linus really hated the code. I wonder whether it is
> generic code or x86 specific. OTOH with x86 you'd probably
> still have to support different page table formats, at least,
> so you couldn't rip it all out.

The arch-specific code really isn't that nasty. We have some silly
workarouds for doing 8-byte-at-a-time operations on x86-32 with cmpxchg8b
etc, but those are just odd small details.

If highmem was just a matter of arch details, I wouldn't mind it at all.

It's the generic code pollution I find annoying. It really does pollute a
lot of crap. Not just fs/ and mm/, but even drivers.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/