Re: [RFC] [PATCH] cpuset operations causes Badness at mm/slab.c:777warning

From: Paul Jackson
Date: Fri Jun 01 2007 - 16:44:21 EST


> I think this is a good example of why having to special-case kmalloc(0)
> is a bad idea. The original code was straightforward and, barring
> silliness, should be completely correct with npids==0. This new code
> does nothing other than make things more complex.

Perhaps so. Perhaps not.

Actually, the original cpuset pid_array_load() code was not correct,
and Christoph's work is smoking this out.

The original code assumed that it could access the first element of the
kmalloc'd array, before it checked if it could go even further.

But with the kmalloc call of:

kmalloc(npids * sizeof(pid_t), GFP_KERNEL);

it is impossible for kmalloc to know how much to allocate, to guarantee
such behaviour. If npids is zero, all kmalloc sees, in essence, is the
call:

kmalloc(0, GFP_KERNEL);

There is no way it can guess that I might still want to access the
first "size(pid_t)" bytes of whatever it returned.

If we had a different sort of kmalloc API, such as would be called with:

kmalloc(npids, sizeof(pid_t), GFP_KERNEL);

rather like the libc calloc() routine, having separate count and size
fields, then in theory, kmalloc could treat calls of the form:

kmalloc(0, sz, GFP_KERNEL); /* for sz > 0 */

as if they were actually:

kmalloc(1, sz, GFP_KERNEL);

and then my cpuset code might not have been broken. But, since
some of us can never remember whether it is the size or the count
that comes first in such an API, we would still have had such
bugs -- just fewer, for those who got those arguments backwards,
and finally tripped over this anyway.

That doesn't address the other likely broken code, elsewhere in
the kernel somewhere that happened to try to access the 2nd or
3rd element of such an empty array before noticing.

So ... while I started this reply agreeing with you, I end up still
agreeing with Christoph's changes here.

Thanks, Christoph, for finding a bug in the cpuset code - good work!

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/