Re: NUMA node information for pages

From: David Rientjes
Date: Wed Apr 09 2014 - 20:41:36 EST


On Tue, 8 Apr 2014, Naoya Horiguchi wrote:

> Memory hotplug is done on a memory block basis, so if we get the info from
> /sys/devices/system/memory/memory<ID>, it should be memory hotplug-aware
> (/sys/devices/system/memory/memory<ID>/state shows online/offline status).
>
> And IIUC, the "pfn-node_id" mapping might already be available to userspace.
> /sys/devices/system/memory/block_size_bytes exports the memory block size,
> so we can simply map a pfn (physical address) to a memory block ID by
> (physical address)/(memory block size), then find the associated node
> from /sys/devices/system/memory/memory<ID>:
>
> $ ls -l /sys/devices/system/memory/memory0
> ...
> lrwxrwxrwx 1 root root 0 Apr 8 00:15 node0 -> ../../node/node0
>

That's only possible with sparsemem and with memory hotplug enabled. I
think Ulrich is looking for a solution that doesn't have such a dependency
and works for all memory models (including one that disables NUMA and
simply represents all memory as one big node).

[ And that block_size_bytes file is absolutely horrid: why are we
exporting all this information in hex and not telling anybody? ]
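
For illustration, the lookup described in the quoted text could be sketched
roughly as below. This is a minimal sketch, not a tested tool: it assumes
sparsemem with memory hotplug enabled, that block_size_bytes is printed as
hex without a "0x" prefix, and that the block directory holds a nodeN
symlink as in the listing above. The function names are made up for this
example.

```python
import os

SYSFS_MEMORY = "/sys/devices/system/memory"


def parse_block_size(text):
    """Parse block_size_bytes, which is hex without a '0x' prefix."""
    return int(text.strip(), 16)


def block_id_for(phys_addr, block_size):
    """Map a physical address to a memory block ID by integer division."""
    return phys_addr // block_size


def node_for_block(block_id, root=SYSFS_MEMORY):
    """Return the node number from the nodeN entry under memory<ID>.

    Returns None if the block directory is missing (for example on a
    kernel without memory hotplug) or if it has no nodeN entry.
    """
    block_dir = os.path.join(root, "memory%d" % block_id)
    try:
        entries = os.listdir(block_dir)
    except OSError:
        return None
    for name in entries:
        # Match "node0", "node1", ... and nothing else.
        if name.startswith("node") and name[4:].isdigit():
            return int(name[4:])
    return None
```

A caller would read block_size_bytes once, pass its contents to
parse_block_size(), and then chain block_id_for() and node_for_block() for
each physical address of interest.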

I'd much prefer a single change that works for everybody, that userspace
can rely on to export accurate information as long as sysfs is mounted,
and that doesn't even need getpagesize() to convert from pfn to physical
address: simply add {start,end}_phys_addr files to
/sys/devices/system/node/nodeN/ for each node N. Which nodes are online can
already be parsed from /sys/devices/system/node/online.
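
To make the proposal concrete, here is a rough sketch of how userspace
might consume such an interface. Note that start_phys_addr and
end_phys_addr are the files proposed here, not existing ones, so their
format (one address per file, parsed with int(text, 0)) and the inclusive
end address are assumptions made for this example only. The online file is
assumed to use the usual sysfs list format, e.g. "0-1,3".

```python
import os

SYSFS_NODE = "/sys/devices/system/node"


def parse_node_list(text):
    """Parse a sysfs list such as '0-1,3' into [0, 1, 3]."""
    nodes = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        first, _, last = chunk.partition("-")
        nodes.extend(range(int(first), int(last or first) + 1))
    return nodes


def read_node_ranges(root=SYSFS_NODE):
    """Read {node: (start, end)} from the hypothetical per-node files."""
    with open(os.path.join(root, "online")) as f:
        nodes = parse_node_list(f.read())
    ranges = {}
    for node in nodes:
        base = os.path.join(root, "node%d" % node)
        # Hypothetical files: they exist only if the proposal is adopted.
        with open(os.path.join(base, "start_phys_addr")) as f:
            start = int(f.read().strip(), 0)
        with open(os.path.join(base, "end_phys_addr")) as f:
            end = int(f.read().strip(), 0)
        ranges[node] = (start, end)
    return ranges


def node_for_addr(phys_addr, ranges):
    """Return the node whose range contains phys_addr, or None.

    The end address is treated as inclusive (an assumption).
    """
    for node, (start, end) in sorted(ranges.items()):
        if start <= phys_addr <= end:
            return node
    return None
```

With this layout no page size or memory block size is needed: the lookup
works directly on physical addresses, and a non-NUMA kernel would simply
report a single range for node 0.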