[PATCH 0/2] incorrect cpumask behavior with CPUMASK_OFFSTACK

From: green
Date: Thu Feb 26 2015 - 01:20:11 EST


From: Oleg Drokin <green@xxxxxxxxxxxxxx>

I just got a report today from Tyson Whitehead <twhitehead@xxxxxxxxx>
that Lustre crashes when CPUMASK_OFFSTACK is enabled.

A little investigation revealed that this code:
cpumask_t mask;
...
cpumask_copy(&mask, topology_thread_cpumask(0));
weight = cpus_weight(mask);

that was supposed to calculate number of cpu siblings/partitions returns
a crazy high number over 3000 which is impossible as I only have
8 cpu cores on my system.

So after a bit of digging I found out that:
cpumask_copy only copies up to nr_cpumask_bits (actual number of cpus I
have in the system),
where as cpumask_weight actully tries to count bits up to NR_CPUS.

Not only calculating up to NR_CPUS is wasteful in this case, and
since we know how many cpus we have in the system - it only makes sense
to calculate only this much anyway, it's wrong because the copy only copied
8 bits to our variable and the rest of it is some random stack garbage.

So I propose two patches here, the first one I am more certain about -
operations that operate on current cpuset like cpus_weight, but also
cpus_empty, cpus_$LOGICALop cpus_$BINARYop are converted from NR_CPUS to
nr_cpumask_bits (this is ok when CONFIG_CPUMASK_OFFSTACK is not set as it's
then defined to NR_CPUS anyway).
I am leaving __cpus_setall __cpus_clear out of it as these two look like
they deal with entire set and it would be useful for them to operate on
all NR_CPUS bits for the case if more cpus are added later and such.

The second patch that I am not sure if we wnat, but it seems to be useful
until struct cpumask is fully dynamic is to convert what looks like
whole-set operations e.g. copies, namely:
cpumask_setall, cpumask_clear, cpumask_copy to always operate on NR_CPUS
bits to ensure there's no stale garbage left in the mask should the
cpu count increases later.

I checked the code and allocating cpumasks on stack is not all that
uncommon in the code, so this should be a worthwhile fix.

Please consider.

Oleg Drokin (2):
cpumask: Properly calculate cpumask values
cpumask: make whole cpumask operations like copy to work with NR_CPUS
bits

include/linux/cpumask.h | 37 +++++++++++++++++++++----------------
1 file changed, 21 insertions(+), 16 deletions(-)

--
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/