IMADA - Department of Mathematics and Computer Science
The ALU is the core of a microprocessor. We survey the foundations of the floating point addition and multiplication algorithms that allow pipelined hardware implementations of these operations to be initiated every cycle. We describe the nature of the bottleneck that causes floating point division and square root implementations to take 20-30 cycles on most microprocessors. We describe the fundamentally different algorithmic architectures for division and square root in the Pentium and Athlon families, and a third alternative employed in the Cyrix processors of the early 1990s. We describe the SIMD floating point instruction sets SSE for the x86 family and AltiVec for the PowerPC G4 employed by Apple. We indicate promising results for fast, low-power implementations of these instructions. We include some details on these instructions as realized in the Geode one-watt x86 floating point unit powering the recently announced AMD 50 by 15 systems.

In summary: of all the movements contributing to the symphony of the pipelined RISC microprocessor, the concluding movement of fast floating point divide and square root remains unfinished.

Host: Peter Kornerup

Last modified: November 4, 2004. Joan Boyar (joan@imada.sdu.dk)