phys_p = (uchar *) virt_to_phys(putp) ?????

Edward Welbon (welbon@bga.com)
Sun, 8 Sep 1996 23:40:51 -0500 (CDT)


In trying to see how to get the physical address of a given program
address, I found the subject definition in ./drivers/net/dgrs.c. Does anyone
have a clue as to how this works? What I believe to be the pertinent code is:

[excerpted from ./drivers/net/dgrs.c]

putp = p = skb_put(skb, len);

[deletia]

phys_p = (uchar *) virt_to_phys(putp);

[end excerpt]

The function virt_to_phys is defined in include/io.h

[excerpt from include/io.h]

/*
 * Change virtual addresses to physical addresses and vv.
 * These are trivial on the 1:1 Linux/i386 mapping (but if we ever
 * make the kernel segment mapped at 0, we need to do translation
 * on the i386 as well)
 */
extern inline unsigned long virt_to_phys(volatile void * address)
{
	return (unsigned long) address;
}

[end excerpt]

The value putp whose physical address is to be determined is defined by

[excerpt from include/linux/skbuff.h]

extern __inline__ unsigned char *skb_put(struct sk_buff *skb, int len)
{
	unsigned char *tmp=skb->tail;
	skb->tail+=len;
	skb->len+=len;
	if(skb->tail>skb->end)
		panic("skput:over: %p:%d",__builtin_return_address(0),len);
	return tmp;
}

[end excerpt]

virt_to_phys seems to just assume that the program (linear?) address
is equal to (at least) the low order 32 bits of the virtual address, which
in turn is equal to the physical address. Is this true only of kernel
space? What about user space?

If I have a program address and I want the physical address (in order to
know about the page numbering for an L2 cache test), can I just look at
the first eleven bits above the cache line boundary of the program
address? Or do I need to get a virtual address first (how, though? They
are bigger than 32 bits, right?).

My argument is that since the P6 L2 is LRU, physically indexed, 4 way set
associative and has a total size of 256Kbytes, each of the four ways
holds 64KB, or 2K 32 byte lines. So if I want to be sure that a given
group of five page addresses maps into the same L2 associativity set, I
think I need to be sure that bits 5:15 are the same for 5 different real
pages.
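
To make the arithmetic concrete, here is a small C sketch of the address split. It assumes exactly the geometry stated above (256 KB, 4 way, 32 byte lines) and a 32 bit physical address; the names l2_set_index and l2_tag are made up for illustration and do not come from the kernel.

```c
#include <stdint.h>

/* Assumed geometry, taken from the figures quoted in this post. */
enum {
    L2_SIZE      = 256 * 1024,                     /* total bytes */
    L2_WAYS      = 4,                              /* associativity */
    L2_LINE      = 32,                             /* bytes per line */
    L2_SETS      = L2_SIZE / (L2_WAYS * L2_LINE),  /* 2048 sets */
    L2_LINE_BITS = 5,                              /* log2(32) */
    L2_SET_BITS  = 11                              /* log2(2048) */
};

/* Set index: bits 5:15 of the physical address. */
unsigned l2_set_index(uint32_t paddr)
{
    return (paddr >> L2_LINE_BITS) & (L2_SETS - 1);
}

/* Tag: bits 16:31 of the physical address. */
unsigned l2_tag(uint32_t paddr)
{
    return paddr >> (L2_LINE_BITS + L2_SET_BITS);
}
```

Under these assumptions, two physical addresses compete for the same four lines exactly when l2_set_index() returns the same value for both, which happens whenever they differ by a multiple of 64 KB.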

in binary:

33222222222211111111110000000000
10987654321098765432109876543210  bit number of address
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  address
                           ^^^^^  L2 cache line offset (0:4)
                ^^^^^^^^^^^       L2 cache index portion? (5:15)
^^^^^^^^^^^^^^^^                  L2 cache line tag (16:31)
                    ^^^^^^^^^^^^  Physical 4K page offset (0:11)
^^^^^^^^^^^^^^^^^^^^              Physical 4K page number (12:31)

If I can get this assurance, and a word at the same offset in each of the
five pages points at the next in an endless ring, then a reference such as
p=*p should force the four way cache to push out the line referenced four
loads ago, so every load will miss in the L2. (Since the P6 L1 is
physically addressed and two way set associative with 32 byte lines, the
L1 will always miss as well.)

For the purpose of the test, it is important that pointer dereferencing is
a blocking dependency that causes the loads to serialize. You can't
execute faster than one load every three cycles in the case of L1 hits on
P6: the previous dereferencing load must complete before the address for
the next dereferencing load is known, so the loads must serialize
(without dependencies they need not; at least on the 604 they will not).
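
The ring and the dependent load can be sketched in C as follows. This is only an illustration: build_ring, chase, RING_PAGES and WAY_STRIDE are hypothetical names, and the sketch spaces the blocks 64 KB apart in program (virtual) addresses, which puts them in the same L2 set only if the physical pages happen to be laid out the same way.

```c
#include <stddef.h>

#define RING_PAGES  5            /* one more than the L2 associativity */
#define WAY_STRIDE  (64 * 1024)  /* assumed size of one L2 way: 256 KB / 4 */

/*
 * Store, at the start of each of n blocks spaced WAY_STRIDE bytes apart,
 * a pointer to the start of the next block; the last points back to the
 * first. base must be aligned for a pointer and cover n * WAY_STRIDE bytes.
 */
void **build_ring(char *base, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        void **slot = (void **)(base + i * WAY_STRIDE);
        *slot = base + ((i + 1) % n) * WAY_STRIDE;
    }
    return (void **)base;
}

/*
 * Follow the ring for the given number of loads. Each load address is
 * the result of the previous load, so the loads cannot overlap.
 * Returning the final pointer keeps the compiler from dropping the loop.
 */
void **chase(void **p, unsigned long loads)
{
    while (loads--)
        p = (void **)*p;
    return p;
}
```

A timing run would call build_ring() once on a buffer of RING_PAGES * WAY_STRIDE bytes and then time a large number of loads through chase().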

I think this will have the best chance of forcing the peak memory
bandwidth, since it is pure load miss without storebacks (assuming no page
faults and neglecting infrequent interrupts on an otherwise idle machine
in UP mode).

What worries me is that there does not need to be any particular
correspondence between physical and program (linear?) addresses. The pages
can be mapped in any way, so bits 12:15 of the program address need
not agree with the physical address. This will hose up the L2 cache test.
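
The split between the index bits that survive translation and those that do not can be worked out mechanically. The following C sketch assumes 4K pages and the index field at bits 5:15 as described earlier; all names in it are illustrative only.

```c
#include <stdint.h>

#define PAGE_SHIFT     12                        /* 4K pages */
#define PAGE_OFF_MASK  ((1u << PAGE_SHIFT) - 1)  /* bits 0:11 */
#define L2_INDEX_MASK  0x0000FFE0u               /* bits 5:15 */

/* Index bits inside the page offset: identical in program and physical address. */
uint32_t index_bits_untranslated(void)
{
    return L2_INDEX_MASK & PAGE_OFF_MASK;        /* bits 5:11 */
}

/* Index bits inside the page number: decided by the page mapping. */
uint32_t index_bits_translated(void)
{
    return L2_INDEX_MASK & ~PAGE_OFF_MASK;       /* bits 12:15 */
}

/* Number of distinct positions a 4K page can occupy within one 64 KB way. */
unsigned placements_per_way(void)
{
    return (index_bits_translated() >> PAGE_SHIFT) + 1;
}
```

With these assumptions four index bits depend on the mapping, so a given physical page can fall into any one of 16 positions within a way, and the program address alone does not say which.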

I don't see how the subject code would work unless the kernel runs linear
= virtual = real. So what do I do to get user physical addresses?

Ed Welbon; welbon@bga.com;