Daniel Kho CK, Ahmad Fauzi MF and Lim SL
As the speed requirements of imaging and communications systems increase, the latency requirements of digital circuits become more stringent. Under such tight timing constraints, circuits with many pipeline stages need to be redesigned to reduce latency. Most modern imaging and communications systems rely on digital signal processing (DSP) to compute complex mathematical operations. The emergence of powerful, low-cost field-programmable gate array (FPGA) devices with hundreds of arithmetic multipliers has enabled many such DSP applications, traditionally implemented only in software, to be implemented in hardware. The reciprocal square root algorithm is a popular technique for computing square roots and is widely used in software applications. This paper shows how the algorithm can be implemented efficiently in hardware and that the implementation is suitable for low-latency, mathematically intensive applications. On a low-cost FPGA, the implementation occupies fewer than 1000 look-up tables (LUTs), which on an Artix XC7A200T device is less than 1% of the chip's LUT resources.
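The abstract does not say which formulation of the reciprocal square root algorithm is used, so the following is only a software sketch of one common variant: an initial guess obtained from the float32 bit pattern (the widely known magic constant 0x5F3759DF), refined by Newton-Raphson iterations, with the square root recovered as x · (1/√x). The function names `rsqrt` and `sqrt_via_rsqrt` and the default of two iterations are assumptions made for illustration, not the authors' hardware design.

```python
import struct


def rsqrt(x: float, iterations: int = 2) -> float:
    """Approximate 1/sqrt(x) for x > 0 (illustrative sketch, not the paper's design)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Initial guess: halve the exponent and negate it via the magic constant.
    bits = 0x5F3759DF - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Newton-Raphson refinement of y toward 1/sqrt(x):
    # y <- y * (1.5 - 0.5 * x * y^2). Uses only multiplies and a subtract,
    # which is why the method maps well onto FPGA multiplier blocks.
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def sqrt_via_rsqrt(x: float) -> float:
    """Recover sqrt(x) as x * (1/sqrt(x)), avoiding any division."""
    return x * rsqrt(x)
```

Each refinement step needs only multiplications and one subtraction, so the number of iterations trades accuracy directly against latency; a hardware version would typically use fixed-point arithmetic rather than the floating-point values used here.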