I pounded out a few patches to make my boxen run a little smoother.
This runs great on my big fat PAE boxen and on my craptop. It may
very well prove useful to others as well.
This will compile for i386 only, though I'm interested in merging the
fixes so other arches compile and so on.
To preemptively answer the question, I suspect a couple of these are
more than -mm cares to absorb at the moment. Of course, if that turns
out not to be the case, I'll send things in promptly.
Against virgin 2.5.71.
1: O(1) rmqueue_bulk()
rmqueue_bulk() currently does list walking and various kinds of
iteration every time in what's obviously a fast path. This
batches up prepped groups of pages in internal buddy allocator
lists so rmqueue_bulk() has O(1) expected time. free_pages_bulk()
is likewise trimmed down from O(group) to O(1) expected time.
2: trivial flow.c compilefix
The same thing everyone else has posted a dozen times.
3: lowmem_page_address() micro-optimization
Use page_to_pfn() instead of open-coding page_zone() etc.
so micro-optimized arch implementations of page_to_pfn()
can micro-optimize lowmem_page_address() in turn.
Shove i386 pmd's into highmem, brute-force. make -j bzImage now
incurs near-negligible lowmem pressure on my NUMA-Q. A very
comfortable feeling indeed. This was really a very mechanical
job, and it fits very smoothly into the core. This is the patch
that breaks non-i386 arches' compiles, though it's obvious how
to fix it, i.e. pmd_offset_map() etc. and pgd_page() changes.
5: trivial /proc/ BKL removals
Relatively unimportant, apart from the fact the BKL is annoying
and obscuring what's being locked in and around /proc/ due to
the BKL's inherent "wtf did they just lock" nature. There isn't
anything significant to audit around the specific codepaths
invoved, as one wrapped a call to a function that wrapped its
entire body in the BKL and the other wrapped a variable
(nr_threads) actually protected by the tasklist_lock, but
considered valid to access with no locking for reporting.
6: i386 pagetable cache
Use the tlb.h hooks to properly cache pre-zeroed pagetable and
pmd pages as well as to function properly with highpte/highpmd.
Slick implementation techniques make this O(1) in all cases,
with no list iterations, and no nonsense in general. One might
say this removes some nonsense, as they're trivially cacheable.
Use slab ctors to cache preconstructed pgd's. This is worth
more on non-PAE machines, as significant amounts of bitblitting
are incurred when the things are a whole page in size. A form
of this is already in -mm, but this version works with highpmd.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to email@example.com
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:41 EST