Re: [PATCH] per-zone kswapd process

From: William Lee Irwin III (wli@holomorphy.com)
Date: Thu Sep 12 2002 - 23:59:38 EST


On Thu, Sep 12, 2002 at 09:06:20PM -0700, Andrew Morton wrote:
> I still don't see why it's per zone and not per node. It seems strange
> that a wee little laptop would be running two kswapds?
> kswapd can get a ton of work done in the development VM and one per
> node would, I expect, suffice?

Machines without observable NUMA effects can benefit from it if it's
per-zone. It also follows that if there's more than one task doing this,
page replacement is less likely to block entirely. Last, but not least,
when I devised it, "per-zone" was the theme.

On Thu, Sep 12, 2002 at 09:06:20PM -0700, Andrew Morton wrote:
> Also, I'm wondering why the individual kernel threads don't have
> their affinity masks set to make them run on the CPUs to which the
> zone (or zones) are local?
> Isn't it the case that with this code you could end up with a kswapd
> on node 0 crunching on node 1's pages while a kswapd on node 1 crunches
> on node 0's pages?

Without some architecture-neutral method of topology detection, there's
no way to do this. A follow-up when it's there should fix it.

On Thu, Sep 12, 2002 at 09:06:20PM -0700, Andrew Morton wrote:
> If I'm not totally out to lunch on this, I'd have thought that a
> better approach would be
> int sys_kswapd(int nid)
> {
> return kernel_thread(kswapd, ...);
> }
> Userspace could then set up the CPU affinity based on some topology
> or config information and would then parent a kswapd instance. That
> kswapd instance would then be bound to the CPUs which were on the
> node identified by `nid'.
> Or something like that?

I'm very very scared of handing things like that to userspace, largely
because I don't trust userspace at all.

At this point, we need to enumerate nodes and provide a cpu to node
correspondence to userspace, and the kernel can obey, aside from the
question of "What do we do if we need to scan a node without a kswapd
started yet?". I think mbligh recently got the long-needed arch code in
for cpu to node... But I'm just not able to make the leap of faith that
memory detection is something that can ever comfortably be given to
userspace.

Cheers,
Bill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:32 EST