Re: [PATCH] md: Combine two kmalloc() calls into one in sb_equal()

From: Al Viro
Date: Fri Dec 09 2016 - 16:30:39 EST


On Fri, Dec 09, 2016 at 11:05:14AM -0800, Joe Perches wrote:
> On Fri, 2016-12-09 at 19:30 +0100, SF Markus Elfring wrote:
> > From: Markus Elfring <elfring@xxxxxxxxxxxxxxxxxxxxx>
> > Date: Fri, 9 Dec 2016 19:09:13 +0100
> >
> > The function "kmalloc" was called in one case by the function "sb_equal"
> > without checking immediately if it failed.
> > This issue was detected by using the Coccinelle software.
> >
> > Perform the desired memory allocation (and release at the end)
> > by a single function call instead.
> >
> > Fixes: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 ("Linux-2.6.12-rc2")
>
> Making a change does not mean fixes.
>
> There's nothing particularly _wrong_ with the code as-is.
>
> 2 kmemdup calls might make the code more obvious.
>
> There's a small optimization possible in that only the
> first MD_SB_GENERIC_CONSTANT_WORDS of the struct are
> actually compared. Alloc and copy of both entire structs
> is inefficient and unnecessary.
>
> Perhaps something like the below would be marginally
> better/faster, but the whole thing is dubious.
>
> static int sb_equal(mdp_super_t *sb1, mdp_super_t *sb2)
> {
> 	int ret;
> 	void *tmp1, *tmp2;
>
> 	tmp1 = kmemdup(sb1, MD_SB_GENERIC_CONSTANT_WORDS * sizeof(__u32), GFP_KERNEL);
> 	tmp2 = kmemdup(sb2, MD_SB_GENERIC_CONSTANT_WORDS * sizeof(__u32), GFP_KERNEL);
>
> 	if (!tmp1 || !tmp2) {
> 		ret = 0;
> 		goto out;
> 	}
>
> 	/*
> 	 * nr_disks is not constant
> 	 */
> 	((mdp_super_t *)tmp1)->nr_disks = 0;
> 	((mdp_super_t *)tmp2)->nr_disks = 0;
>
> 	ret = memcmp(tmp1, tmp2, MD_SB_GENERIC_CONSTANT_WORDS * sizeof(__u32)) == 0;
>
> out:
> 	kfree(tmp1);
> 	kfree(tmp2);
> 	return ret;
> }

May I politely inquire if either of you has actually bothered to read the
code and figure out what it does? This is grotesque...

For the really slow: we have two objects. We want to check if anything in the
128-byte chunks at their beginnings other than one 32-bit field happens to be
different. For that we
* allocate two 128-byte pieces of memory
* *copy* our objects into those
* forcibly zero the field in question in both of those copies
* compare the fuckers
* free them

And you two are discussing whether it's better to combine allocations of those
copies into a single 256-byte allocation? Really? _IF_ it is a hot path,
the obvious optimization would be to avoid copying that crap in the first
place - simply by
	return !memcmp(sb1, sb2, offsetof(mdp_super_t, nr_disks)) &&
	       !memcmp(&sb1->nr_disks + 1, &sb2->nr_disks + 1,
		       MD_SB_GENERIC_CONSTANT_WORDS * sizeof(__u32) -
		       offsetof(mdp_super_t, nr_disks) - sizeof(__u32));
If it is _not_ a hot path, why bother with it at all?
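
For anyone who wants to convince themselves that skipping one field gives
the same answer as copy-and-zero, here is a minimal user-space sketch. It
is not the md code: struct fake_sb, FAKE_CONSTANT_WORDS and both function
names are made up for illustration, the field layout is an assumption, and
the copy variant uses stack copies rather than kmalloc(). Both functions
follow the sb_equal() convention of returning 1 when the constant parts
match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for mdp_super_t; the real layout is in the md headers. */
#define FAKE_CONSTANT_WORDS 32	/* assumed: 32 words, i.e. the 128-byte chunk */

struct fake_sb {
	uint32_t before[9];	/* constant words preceding nr_disks (assumed count) */
	uint32_t nr_disks;	/* the one field that must be ignored */
	uint32_t after[22];	/* remainder of the constant chunk */
	uint32_t tail[8];	/* beyond the chunk; never compared */
};

#define CONST_BYTES	(FAKE_CONSTANT_WORDS * sizeof(uint32_t))
#define SKIP_OFF	offsetof(struct fake_sb, nr_disks)

static_assert(SKIP_OFF + sizeof(uint32_t) <= CONST_BYTES,
	      "nr_disks must lie inside the compared chunk");

/* Copy-and-zero variant: duplicate both objects, clear nr_disks, compare. */
static int sb_equal_copy(const struct fake_sb *sb1, const struct fake_sb *sb2)
{
	struct fake_sb t1 = *sb1, t2 = *sb2;

	t1.nr_disks = 0;
	t2.nr_disks = 0;
	return memcmp(&t1, &t2, CONST_BYTES) == 0;
}

/* Copy-free variant: compare the bytes before and after nr_disks in place. */
static int sb_equal_nocopy(const struct fake_sb *sb1, const struct fake_sb *sb2)
{
	const unsigned char *p1 = (const unsigned char *)sb1;
	const unsigned char *p2 = (const unsigned char *)sb2;
	size_t resume = SKIP_OFF + sizeof(uint32_t);	/* first byte past nr_disks */

	return memcmp(p1, p2, SKIP_OFF) == 0 &&
	       memcmp(p1 + resume, p2 + resume, CONST_BYTES - resume) == 0;
}
```

Going through unsigned char pointers rather than &sb->nr_disks + 1 is a
choice made for this sketch only; it keeps the pointer arithmetic within
the whole object. The copy-free version needs no allocation, so it also
has no failure path to handle.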