Re: Efficient IPC mechanism on Linux

From: Luca Veraldi
Date: Wed Sep 10 2003 - 04:37:09 EST


> yes a copy of a page is about 3000 to 4000 cycles on an x86 box in the
> uncached case. A pagetable operation (like the cpu setting the accessed
> or dirty bit) is in that same order I suspect (maybe half this, but not
> a lot less).

Probably you don't know what you're talking about.
I don't know where you studied computer architectures, but...
Let's answer.

To set the accessed or dirty bit you use

    __asm__ __volatile__( LOCK_PREFIX
        "btsl %1,%0"
        :"=m" (ADDR)
        :"Ir" (nr));

which is a ***SINGLE CLOCK CYCLE*** of the CPU.
I really don't think that a btsl will require 4000 cycles
on any machine's firmware.
Not even on Intel x86.
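
For anyone who wants to try the bit-setting operation outside the kernel,
here is a minimal user-space sketch in C11. It is not the kernel's set_bit():
the name sketch_test_and_set_bit is hypothetical, and the code assumes a
compiler that provides <stdatomic.h>. On x86 a compiler is expected to lower
atomic_fetch_or to a LOCK-prefixed read-modify-write, but the exact
instruction and its cost depend on the compiler and the CPU, so it should be
measured rather than assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's set_bit()): atomically sets bit
 * `nr` (0..31) in *word and returns the previous value of that bit.
 */
static inline int sketch_test_and_set_bit(unsigned nr, _Atomic uint32_t *word)
{
    assert(nr < 32);                       /* only one 32-bit word is handled */

    uint32_t mask = (uint32_t)1 << nr;     /* select the requested bit */
    uint32_t old = atomic_fetch_or(word, mask); /* atomic read-modify-write */

    return (old & mask) != 0;              /* 1 if the bit was already set */
}
```

Calling it twice on the same bit returns 0 the first time and 1 the second
time, with the word unchanged by the second call.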

> Changing pagetable content is even more because all the
> tlb's and internal cpu state will need to be flushed... which is also a
> microcode operation for the cpu.

Good. You will find the same overhead when accessing a message
after a read from a pipe. Many TLB faults will occur.
And the same applies when copying the message to the pipe.
Many, many TLB faults.
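
To make the pipe path concrete, here is a small sketch that pushes one
page-sized message through a pipe and reads it back, which involves the two
copies discussed above (into the pipe, then out of it). The helper name
sketch_pipe_roundtrip and the SKETCH_PAGE_SIZE constant are assumptions for
illustration only. The code also assumes a Linux system whose pipe buffer
holds at least one 4 KiB page, so the single write does not block. It shows
the copies themselves and measures nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size (4 KiB, as on x86) */

/*
 * Hypothetical helper: sends one page-sized message through a pipe and
 * reads it back into dst. Returns 0 on success, -1 on error.
 */
static int sketch_pipe_roundtrip(const char *src, char *dst)
{
    int fds[2];
    size_t done = 0;
    int rc = -1;

    assert(src != NULL && dst != NULL);

    if (pipe(fds) != 0)
        return -1;

    /* First copy: user buffer -> kernel pipe buffer. */
    if (write(fds[1], src, SKETCH_PAGE_SIZE) != SKETCH_PAGE_SIZE)
        goto out;

    /* Second copy: kernel pipe buffer -> user buffer (may take several reads). */
    while (done < SKETCH_PAGE_SIZE) {
        ssize_t n = read(fds[0], dst + done, SKETCH_PAGE_SIZE - done);
        if (n <= 0)
            goto out;
        done += (size_t)n;
    }
    rc = 0;

out:
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Wrapping the call in a timing loop would give a rough per-page cost for this
path on a given machine.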

> And it's deadly in an SMP environment.

You say "tlb's and internal cpu state will need to be flushed".
The other CPUs in an SMP environment can continue to work independently.
TLBs and CPU state registers are ***PER-CPU*** resources.

Copying a memory page is probably the worse case,
because you have to hold some global lock the whole time.
This is deadly in an SMP environment,
because the critical section lasts for thousands of cycles
instead of just a few.