Re: [PATCH] Use MPOL_INTERLEAVE for tmpfs files

From: Martin J. Bligh
Date: Tue Nov 02 2004 - 12:11:29 EST


--Andi Kleen <ak@xxxxxxx> wrote (on Tuesday, November 02, 2004 16:55:07 +0100):

> On Tue, Nov 02, 2004 at 07:46:59AM -0800, Martin J. Bligh wrote:
>> > This patch causes memory allocation for tmpfs files to be distributed
>> > evenly across NUMA machines. In most circumstances today, tmpfs files
>> > will be allocated on the same node as the task writing to the file.
>> > In many cases, particularly when large files are created, or a large
>> > number of files are created by a single task, this leads to a severe
>> > imbalance in free memory amongst nodes. This patch corrects that
>> > situation.
>>
>> Yeah, but it also ruins your locality of reference (in a NUMA sense).
>> Not convinced that's a good idea. You're guaranteeing universally consistent
>> worse-case performance for everyone. And you're only looking at a situation
>> where there's one allocator on the system, and that's imbalanced.
>>
>> You WANT your data to be local. That's the whole idea.
>
> I think it depends on how you use tmpfs. When you use it for read/write
> it's a good idea because you likely don't care about a bit of additional
> latency and it's better to not fill up your local nodes with temporary
> files.
>
> If you use it with mmap then you likely want local policy.
>
> But that's a big ugly to distingush, that is why I suggested the sysctl.

As long as it defaults to off, I guess I don't really care. Though I'm still
wholly unconvinced it makes much sense. I think we're just papering over the
underlying problem - that we don't do good balancing between nodes under
mem pressure.

M.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/