Re: [MODSLAB 3/7] A Kmalloc subsystem

From: Christoph Lameter
Date: Fri Aug 18 2006 - 14:42:15 EST

On Sat, 19 Aug 2006, KAMEZAWA Hiroyuki wrote:

> First, ia64's DISCONTIG is special because of VIRTUAL_MEMMAP,
> and ia64's SPARSEMEM is special too: it is SPARSEMEM_EXTREME.

Right. Would it be possible to get VIRTUAL_MEMMAP support into SPARSEMEM?

> With FLATMEM, pfn_to_page() is mem_map + pfn, just an address calculation.

So virt_to_page() there is as fast as with IA64 DISCONTIG, on both UP and SMP.
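
A minimal userspace sketch of the FLATMEM case, to make the "just an address calculation" point concrete. The struct page layout, MAX_PFN and the flat_* names are placeholders for illustration, not the kernel's definitions:

```c
/* Toy stand-in for the kernel's struct page (hypothetical layout). */
struct page {
    unsigned long flags;
};

#define MAX_PFN 1024UL              /* assumed size of the toy machine */

/* One contiguous array covering every page frame. */
static struct page mem_map[MAX_PFN];

/* pfn -> struct page: a single add, no table lookup. */
static inline struct page *flat_pfn_to_page(unsigned long pfn)
{
    return mem_map + pfn;
}

/* struct page -> pfn: a single pointer subtraction. */
static inline unsigned long flat_page_to_pfn(const struct page *page)
{
    return (unsigned long)(page - mem_map);
}
```

The only memory touched is the struct page itself, so there is no lookup table competing for cache.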

> with *usual* DISCONTIG
> --
> pgdat = NODE_DATA(pfn_to_nid(pfn));
> page = pgdat->node_mem_map + pfn - pgdat->node_start_pfn
> --
> If accessing pgdat is fast, the cost will not be a big problem.
> pfn_to_nid() is usually implemented by calculation or table lookup.

Hmmm.... pfn_to_nid usually involves a table lookup. So two table lookups
to get there.
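
Roughly, the two lookups look like this in a userspace sketch. The chunk-granular pfn_nid_table, the two-node layout with a hole, and NID_NONE are assumptions made for illustration; real architectures implement pfn_to_nid() in their own ways:

```c
#include <stddef.h>

struct page {
    unsigned long flags;
};

/* Toy per-node descriptor with only the two fields the quoted formula uses. */
struct pglist_data {
    struct page *node_mem_map;
    unsigned long node_start_pfn;
};

#define CHUNK_SHIFT    8                      /* assumed granularity of the nid table */
#define PFNS_PER_CHUNK (1UL << CHUNK_SHIFT)
#define NR_CHUNKS      4
#define NID_NONE       (-1)

static struct page node0_map[PFNS_PER_CHUNK];
static struct page node1_map[PFNS_PER_CHUNK];

/* Node 0 covers chunk 0, node 1 covers chunk 3; chunks 1 and 2 are a hole. */
static struct pglist_data node_data[] = {
    { node0_map, 0 },
    { node1_map, 3 * PFNS_PER_CHUNK },
};

static const signed char pfn_nid_table[NR_CHUNKS] = { 0, NID_NONE, NID_NONE, 1 };

/* Lookup 1: which node owns this pfn? */
static int pfn_to_nid(unsigned long pfn)
{
    unsigned long chunk = pfn >> CHUNK_SHIFT;

    return chunk < NR_CHUNKS ? pfn_nid_table[chunk] : NID_NONE;
}

/* Lookup 2: fetch that node's descriptor, then do the address arithmetic. */
static struct page *discontig_pfn_to_page(unsigned long pfn)
{
    int nid = pfn_to_nid(pfn);
    struct pglist_data *pgdat;

    if (nid == NID_NONE)
        return NULL;
    pgdat = &node_data[nid];
    return pgdat->node_mem_map + (pfn - pgdat->node_start_pfn);
}
```

Both pfn_nid_table and node_data have to be cache-hot for this to be cheap.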

> And with usual SPARSEMEM (not EXTREME):
> --
> page = mem_section[(pfn >> SECTION_SHIFT)].mem_map + pfn
> --
> Needs one table lookup. Maybe not very big.

Bigger than a cacheline that can be kept in the cache? On a large Altix we
may have 1k nodes meaning up to 4k zones!
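
A userspace sketch of the plain SPARSEMEM lookup, for reference. SECTION_SHIFT, NR_MEM_SECTIONS and sparse_pfn_to_page are made-up values and names. The sketch masks the pfn explicitly, while the quoted formula adds the full pfn, which only works if the stored pointer is pre-biased by the section's start pfn:

```c
#include <stddef.h>

struct page {
    unsigned long flags;
};

/* Toy section descriptor: just the per-section memmap pointer. */
struct mem_section {
    struct page *section_mem_map;
};

#define SECTION_SHIFT     8                   /* assumed: 256 pages per section */
#define PAGES_PER_SECTION (1UL << SECTION_SHIFT)
#define NR_MEM_SECTIONS   16UL                /* assumed table size */

/* The single lookup table; its size grows with the physical address range. */
static struct mem_section mem_section[NR_MEM_SECTIONS];

static struct page *sparse_pfn_to_page(unsigned long pfn)
{
    unsigned long nr = pfn >> SECTION_SHIFT;
    struct page *map;

    if (nr >= NR_MEM_SECTIONS)
        return NULL;
    map = mem_section[nr].section_mem_map;    /* the one table lookup */
    if (!map)
        return NULL;                          /* section not present */
    /* Explicit mask instead of a pre-biased pointer, to stay within valid C. */
    return map + (pfn & (PAGES_PER_SECTION - 1));
}
```

The table has one entry per section regardless of how many sections are populated, which is where the cache footprint concern comes from.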

> --
> page = mem_section[(pfn >> SECTION_SHIFT)][(pfn & MASK)].mem_map + pfn
> --
> Needs one (big) table lookup.

Owww... Cache issues.
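
A sketch of the two-level (EXTREME-style) lookup in userspace. The root array size, SECTIONS_PER_ROOT, and the helper names are assumptions for illustration, and calloc stands in for whatever allocator would really be used:

```c
#include <stddef.h>
#include <stdlib.h>

struct page {
    unsigned long flags;
};

struct mem_section {
    struct page *section_mem_map;
};

#define SECTION_SHIFT     8                   /* assumed: 256 pages per section */
#define PAGES_PER_SECTION (1UL << SECTION_SHIFT)
#define SECTIONS_PER_ROOT 64UL                /* assumed second-level size */
#define NR_SECTION_ROOTS  32UL                /* assumed first-level size */

/* First level: pointers to second-level arrays, allocated only when needed. */
static struct mem_section *mem_section_roots[NR_SECTION_ROOTS];

/* Find the descriptor for section nr; optionally allocate its root. */
static struct mem_section *section_lookup(unsigned long nr, int create)
{
    unsigned long root = nr / SECTIONS_PER_ROOT;

    if (root >= NR_SECTION_ROOTS)
        return NULL;
    if (!mem_section_roots[root]) {
        if (!create)
            return NULL;
        mem_section_roots[root] = calloc(SECTIONS_PER_ROOT,
                                         sizeof(struct mem_section));
        if (!mem_section_roots[root])
            return NULL;
    }
    return &mem_section_roots[root][nr % SECTIONS_PER_ROOT];
}

/* Two dependent loads (root pointer, then section entry) before the add. */
static struct page *extreme_pfn_to_page(unsigned long pfn)
{
    struct mem_section *ms = section_lookup(pfn >> SECTION_SHIFT, 0);

    if (!ms || !ms->section_mem_map)
        return NULL;
    return ms->section_mem_map + (pfn & (PAGES_PER_SECTION - 1));
}
```

The second load depends on the first, so a miss on either level stalls the whole translation.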

Could we do the lookup using a sparse, virtually mapped table like on
IA64, and align the section shift to whatever page table is in place (on
platforms that require page tables; IA64 could continue to use its
special handler)?

Then page could be reached via

page = vmem_map + pfn

again?
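
To illustrate the idea only, here is a userspace analogy: reserve virtual address space for the whole table up front and back just the populated sections. This assumes Linux mmap semantics (MAP_NORESERVE); the sizes and the vmem_* helper names are hypothetical. In the kernel the backing would be done through page tables rather than mmap/mprotect:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Toy struct page padded to 64 bytes so a section's memmap is page aligned. */
struct page {
    unsigned long flags;
    char pad[64 - sizeof(unsigned long)];
};

#define SECTION_SHIFT     12                  /* assumed: 4096 pages per section */
#define PAGES_PER_SECTION (1UL << SECTION_SHIFT)
#define NR_SECTIONS       1024UL              /* assumed span of the toy machine */

static struct page *vmem_map;

/* Reserve address space for the entire table without committing memory. */
static int vmem_map_init(void)
{
    size_t len = NR_SECTIONS * PAGES_PER_SECTION * sizeof(struct page);
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    vmem_map = p;
    return 0;
}

/* Make one section's slice of the table usable; holes stay unmapped. */
static int vmem_map_populate(unsigned long section)
{
    struct page *start = vmem_map + section * PAGES_PER_SECTION;

    return mprotect(start, PAGES_PER_SECTION * sizeof(struct page),
                    PROT_READ | PROT_WRITE);
}

/* Same arithmetic as FLATMEM: one add, with the MMU handling sparseness. */
static inline struct page *vmem_pfn_to_page(unsigned long pfn)
{
    return vmem_map + pfn;
}
```

The lookup cost moves from an explicit table in the data cache to the TLB and page tables, which is the trade-off being asked about.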
