Re: [PATCH] x86: Add optimized popcnt variants

From: Borislav Petkov
Date: Tue Feb 23 2010 - 12:54:53 EST


From: "H. Peter Anvin" <hpa@xxxxxxxxx>
Date: Tue, Feb 23, 2010 at 09:34:04AM -0800

> On 02/23/2010 07:58 AM, Borislav Petkov wrote:
> >
> >Hmm, we cannot do that with the current design since __arch_hweight64
> >is being inlined into every callsite and AFAICT we would have to build
> >every callsite with "-fcall-saved-rdi" which is clearly too much. The
> >explicit "=D" dummy constraint is straightforward, instead.
> >
>
> Uh... the -fcall-saved-rdi would go with all the other ones.
> Assuming it can actually work and that gcc doesn't choke on an
> inbound argument being saved.

Right, doh. Ok, just added it and it builds fine with gcc (Gentoo
4.4.1 p1.0) 4.4.1. If you suspect that some older gcc versions
might choke on it, I could leave the "=D" dummy constraint in?

BTW, the current version screams

/usr/src/linux-2.6/arch/x86/include/asm/arch_hweight.h: In function ‘__arch_hweight64’:
/usr/src/linux-2.6/arch/x86/include/asm/arch_hweight.h:47: warning: unused variable ‘dummy’

on x86-32. I'll send a fixed version in a second.

--
Regards/Gruss,
Boris.

-
Advanced Micro Devices, Inc.
Operating Systems Research Center