Re: [GIT PATCH] x86,percpu: fix pageattr handling with remap allocator

From: Jan Beulich
Date: Fri May 15 2009 - 03:47:52 EST


>>> Tejun Heo <tj@xxxxxxxxxx> 14.05.09 17:55 >>>
>> In order to reduce the amount of work to do during lookup as well as
>> the chance of having a collision at all, wouldn't it be reasonable
>> to use as much of an allocated 2/4M page as possible rather than
>> returning whatever is left after a single CPU got its per-CPU memory
>> chunk from it? I.e. you'd return only those (few) pages that either
>> don't fit another CPU's chunk anymore or that are left after running
>> through all CPUs.
>>
>> Or is there some hidden requirement that each CPU's per-CPU area must
>> start on a PMD boundary?
>
>The whole point of doing the remapping is giving each CPU its own PMD
>mapping for percpu area, so, yeah, that's the requirement. I don't
>think the requirement is hidden tho.

No, from looking at the code the requirement seems to be only that you
get memory allocated from the correct node and mapped by a large page.
Nothing says why the final virtual address would need to be large-page
aligned. I.e., with a slight modification to take the NUMA requirement
into account (I noticed I had ignored that aspect only after sending that
mail), the previous suggestion would still appear usable to me.
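
To illustrate the suggestion, here is a rough sketch of the packing in
plain userspace C. It is not taken from the percpu code: the names
pack_percpu_chunks, struct placement and MAX_NODES are hypothetical, and
the per-CPU unit size is simply passed in as a parameter. CPUs on the
same node share a large page until the next chunk no longer fits.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical limit, for this sketch only */

struct placement {
	size_t large_page;	/* index of the large page holding the chunk */
	size_t offset;		/* byte offset of the chunk in that page */
};

/*
 * Give each CPU a chunk of unit_size bytes, packing CPUs of the same
 * node into shared large pages of pmd_size bytes. A new large page is
 * started for a node only when the current one cannot hold another
 * chunk. Returns the number of large pages used.
 */
static size_t pack_percpu_chunks(const int *cpu_node, size_t ncpus,
				 size_t unit_size, size_t pmd_size,
				 struct placement *out)
{
	size_t cur_page[MAX_NODES] = {0};	/* page being filled, per node */
	size_t used[MAX_NODES] = {0};		/* bytes used in that page */
	int have_page[MAX_NODES] = {0};
	size_t npages = 0;

	assert(unit_size > 0 && unit_size <= pmd_size);

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		int node = cpu_node[cpu];

		assert(node >= 0 && node < MAX_NODES);
		if (!have_page[node] || used[node] + unit_size > pmd_size) {
			cur_page[node] = npages++;
			used[node] = 0;
			have_page[node] = 1;
		}
		out[cpu].large_page = cur_page[node];
		out[cpu].offset = used[node];
		used[node] += unit_size;
	}
	return npages;
}
```

With four CPUs spread over two nodes and an assumed 512K unit, this uses
two 2M pages instead of four; whatever remains at the tail of each page
is what would be returned to the allocator.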

>How hot is the cpa path? On my test systems, there were only a few
>calls during init and then nothing. Does it become very hot if, for
>example, GEM is used? But I really don't think the log2 binary search
>overhead would be anything noticeable compared to TLB shootdown and
>all other stuff going on there.

I would view cutting down on that only as a nice side effect, not a primary
reason to do the change. The primary reason is this:

>> This would additionally address a potential problem on 32-bit -
>> currently, for a 32-CPU system you consume half of the vmalloc space
>> with PAE (on non-PAE you'd even exhaust it, but I think it's
>> unreasonable to expect a system having 32 CPUs to not need PAE).
>
>I recall having about the same conversation before. Looking up...
>
>-- QUOTE --
> Actually, I've been looking at the numbers and I'm not sure if the
> concern is valid. On x86_32, the practical number of maximum
> processors would be around 16 so it will end up 32M, which isn't
> nice and it would probably be a good idea to introduce a parameter to
> select which allocator to use but still it's far from consuming all
> the VM area. On x86_64, the vmalloc area is obscenely large at 2^45,
> ie 32 terabytes. Even with 4096 processors, a single chunk is a measly
> 0.02%.

Just to note - there must be a reason we (SuSE/Novell) build our default
32-bit kernel with support for 128 CPUs, which now is simply broken.

> If it's a problem for other archs or extreme x86_32 configurations,
> we can add some safety measures but in general I don't think it is a
> problem.
>-- END OF QUOTE --
>
>So, yeah, if there are 32bit 32-way NUMA machines out there, it would
>be wise to skip remap allocator on such machines. Maybe we can
>implement a heuristic - something like "if vm area consumption goes
>over 25%, don't use remap".

Possibly, as a secondary consideration on top of the suggested reduction
of virtual address space consumption.
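
For reference, the numbers above and the proposed 25% cutoff can be
checked with a few lines of C. This is only a sketch: remap_vm_bytes and
remap_allowed are hypothetical names, and the 128M vmalloc area for
x86_32 is an assumption (it is the size that matches the "half of the
vmalloc space" figure for 32 CPUs with 2M PAE large pages).

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address space taken when every CPU gets its own large page. */
static uint64_t remap_vm_bytes(unsigned int ncpus, uint64_t pmd_size)
{
	return (uint64_t)ncpus * pmd_size;
}

/*
 * Hypothetical form of the heuristic: allow the remap allocator only
 * if it would use at most 25% of the vmalloc area.
 */
static int remap_allowed(unsigned int ncpus, uint64_t pmd_size,
			 uint64_t vmalloc_size)
{
	assert(vmalloc_size > 0);
	return remap_vm_bytes(ncpus, pmd_size) * 4 <= vmalloc_size;
}
```

Under these assumptions, 16 CPUs with PAE sit exactly at the 25% mark
(32M of 128M), 32 CPUs take 64M, and 128 CPUs would need 256M, i.e. twice
the whole area. On x86_64, 4096 CPUs take 8G of 32T.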

Jan
