RE: e1000 performance hack for ppc64 (Power4)

From: Herman Dierks (
Date: Fri Jun 13 2003 - 12:03:00 EST

I will let Anton respond to this. I think he may have tried this some
time back in his early prototypes to fix this.
I think the problem was not where the buffer started but where the packet
ended up within the buffer.
Due to the varying sizes of the TCP and IP headers, the packet ended up at
some non-cache-aligned address.
What we need for the DMA to work well is to have the final packet (with
datalink headers) starting on a cache line, as it's the final packet that
must be DMA'd. In fact it may need to be aligned to a higher level than
that (not sure).

on 06/13/2003 11:21:03 AM

To: Herman Dierks/Austin/IBM@IBMUS
cc: "Feldman, Scott", David Gibson, Linux Kernel Mailing List,
       Anton Blanchard, Nancy J Milliner/Austin/IBM@IBMUS, Ricardo C
       Gonzalez/Austin/IBM@ibmus, Brian Twichell/Austin/IBM@IBMUS
Subject: RE: e1000 performance hack for ppc64 (Power4)

Too long to quote:

Wouldn't you get most of the benefit from copying that stuff around in
the driver if you allocated the skb->data aligned in the first place?

There's already code to align them on CPU cache boundaries:
#define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \
                                 ~(SMP_CACHE_BYTES - 1))

So, do something like this:
#define SKB_DATA_ALIGN(X) (((X) + (ARCH_ALIGN_SKB - 1)) & \
                                 ~(ARCH_ALIGN_SKB - 1))

You could easily make this adaptive, so that it does not align to the arch
size when the request is bigger than that, just like in the e1000 patch you
posted.
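
A rough user-space sketch of that adaptive idea, in case it helps. The
numeric values of SMP_CACHE_BYTES and ARCH_ALIGN_SKB below are made up for
illustration, and ALIGN_UP and skb_data_align_adaptive() are hypothetical
names, not existing kernel interfaces. It also assumes "bigger than that"
means bigger than the arch alignment itself:

```c
#include <stddef.h>

/* Assumed values for illustration only; real ones are per-architecture. */
#define SMP_CACHE_BYTES 32    /* assumed CPU cache line size */
#define ARCH_ALIGN_SKB  128   /* assumed alignment that suits DMA */

/* Round x up to the next multiple of align (align must be a power of two). */
#define ALIGN_UP(x, align) (((x) + ((align) - 1)) & ~((size_t)(align) - 1))

/*
 * Hypothetical adaptive variant of SKB_DATA_ALIGN: small requests are
 * rounded up to the (larger) arch alignment, while requests bigger than
 * the arch alignment fall back to plain CPU cache line alignment.
 */
static size_t skb_data_align_adaptive(size_t len)
{
	if (len > ARCH_ALIGN_SKB)
		return ALIGN_UP(len, SMP_CACHE_BYTES);
	return ALIGN_UP(len, ARCH_ALIGN_SKB);
}
```

With these assumed values a 60-byte request is rounded up to 128 bytes, while
a 1500-byte request is only rounded up to the next 32-byte boundary. Note
this only rounds sizes; it does not by itself control where the final packet
starts within the buffer.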

Dave Hansen


This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:37 EST