Re: [arjanv@redhat.com: Re: [PATCH] shrink per_cpu_pages to fit 32byte cacheline]

From: Marcelo Tosatti
Date: Thu Sep 23 2004 - 19:18:20 EST

Next message: Andrew A.: "RE: Consistent kernel hang during heavy TCP connection handling load"
Previous message: Greg KH: "Re: Adding .class field to struct i2c_client (was Re: [PATCH][2.6] Add command function to struct i2c_adapter"
In reply to: Nakajima, Jun: "RE: [arjanv@redhat.com: Re: [PATCH] shrink per_cpu_pages to fit 32byte cacheline]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Sep 23, 2004 at 01:24:49PM -0700, Nakajima, Jun wrote:
> >From: Marcelo Tosatti [mailto:marcelo.tosatti@xxxxxxxxxxxx]
> >Sent: Thursday, September 23, 2004 7:12 AM
> >To: linux-kernel@xxxxxxxxxxxxxxx
> >Cc: Nakajima, Jun; akpm@xxxxxxxx; arjanv@xxxxxxxxxx; ak@xxxxxxx
> >Subject: [arjanv@xxxxxxxxxx: Re: [PATCH] shrink per_cpu_pages to fit
> 32byte
> >cacheline]
> >
> >
> >Forgot to CC linux-kernel, just in case someone else
> >can have useful information on this matter.
> >
> >Andi says any additional overhead will be in the noise
> >compared to cacheline saving benefit.
> >
> >***********
> >
> >Jun,
> >
> >We need some assistance here - you can probably help us.
> >
> >Within the Linux kernel we can benefit from changing some fields
> >of commonly accessed data structures to 16 bit instead of 32 bits,
> >given that the values for these fields never reach 2 ^ 16.
> >
> >Arjan warned me, however, that the prefix (in this case "data16") will
> >cause an additional extra cycle in instruction decoding, per message
> above.
>
> On the Pentium4 core, this is not a big deal because it runs out of the
> trace cache (i.e. decoded in advance). However, on the Pentium III/M
> (aka P6) core (i.e. Penitum III, Banias, Dothan, Yonah, etc.),
> especially when an operand size prefix (0x66) changes the # of bytes in
> an instruction (usually by impacting the size of an immediate in the
> instruction), the P6 core pays unnegligible penalty, slowing down
> decoding.

Jun,

What you mean by "unnegligible penalty" ?

You mean its very small penalty (unconsiderable), or its considerable penalty?

We are use one less cacheline for a very commonly used structure.

Thanks and sorry for poor english :)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Andrew A.: "RE: Consistent kernel hang during heavy TCP connection handling load"
Previous message: Greg KH: "Re: Adding .class field to struct i2c_client (was Re: [PATCH][2.6] Add command function to struct i2c_adapter"
In reply to: Nakajima, Jun: "RE: [arjanv@redhat.com: Re: [PATCH] shrink per_cpu_pages to fit 32byte cacheline]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]