The reason might be that the K6 was first designed by NexGen, and I don't
know whether, at the beginning, they aimed at decoding ix86 assembly
instructions or had their own assembly language (or some other, like m68k).
So while the execution units have undergone very careful design, the decode
unit might have been developed in a shorter time.
-- 
Kurt Garloff <kurt@garloff.de> [Dortmund, FRG]
Plasma physics, high perf. computing [Linux-ix86,-axp, DUX]
PGP key on http://www.garloff.de/kurt/ [Linux SCSI driver: DC390]
PS: I have been able to speed up a double(!) matrix-vector multiply by 30% on an AMD K6-2 with hand-written assembly, just by avoiding instructions that cross cache-line boundaries and by using the 3DNow! prefetch instruction to fill the cache before the data is needed. (Compared to what egcs-2.91.5x -mpentiumpro -fschedule-insns2 -ffast-math -O3 produced.)
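To illustrate the prefetch half of that idea, here is a rough C sketch, not the hand-written assembly mentioned above. It uses GCC's __builtin_prefetch as a stand-in for the 3DNow! prefetch instruction. The function name matvec_prefetch, the constants LINE_DOUBLES and PREFETCH_AHEAD, and the assumed 32-byte cache line are my own choices for illustration and would need tuning on real hardware. It does nothing about instruction alignment, which has to be handled in the assembly itself.

```c
#include <stddef.h>

/* Assumption: 32-byte cache lines, i.e. 4 doubles per line. */
#define LINE_DOUBLES   4
/* Hypothetical tuning value: how many doubles ahead to prefetch. */
#define PREFETCH_AHEAD 16

/* Computes y = A * x for a row-major matrix A with rows x cols entries. */
void matvec_prefetch(const double *a, const double *x, double *y,
                     size_t rows, size_t cols)
{
    const size_t total = rows * cols;

    for (size_t i = 0; i < rows; i++) {
        double sum = 0.0;

        for (size_t j = 0; j < cols; j++) {
            const size_t k = i * cols + j;  /* flat index into A */

            /* Issue one prefetch hint per assumed cache line, and only
               for addresses that still lie inside the matrix. Because
               the index is flat, this also reaches into the next row. */
            if (k % LINE_DOUBLES == 0 && k + PREFETCH_AHEAD < total)
                __builtin_prefetch(a + k + PREFETCH_AHEAD, 0, 0);

            sum += a[k] * x[j];
        }
        y[i] = sum;
    }
}
```

The last argument 0 marks the data as having low temporal locality, since each matrix element is read only once; the vector x is reused for every row and is left to the cache without hints. Whether this gains anything depends on the CPU and on what the compiler already schedules.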