COMPUTER ARITHMETIC SYNTHESIS TECHNOLOGIES ON RECONFIGURABLE PLATFORMS

K. H. Tsoi
Computer Science and Engineering Department, The Chinese University of Hong Kong
email: [email protected]

0-7803-9362-7/05/$20.00 ©2005 IEEE

In most custom hardware designs, including cryptographic systems, multimedia processing and scientific simulation, the computation involves a sequence of arithmetic operators. System performance can be significantly improved by carefully selecting a suitable implementation scheme for these operators. There are four major performance metrics to be considered when developing circuits that implement arithmetic operations, namely timing, area, power consumption and accuracy. To meet performance constraints, a developer may choose implementation schemes over three independent dimensions:

number system: 2’s complement, residue number system, redundant number system, logarithmic number system and on-line arithmetic

number format: fixed or floating point, size of the integer and fractional parts, radix, etc.

operator implementation: number of pipeline stages, carry-select vs. carry-lookahead addition, Booth’s algorithm, etc.

Selecting a suitable scheme for an arithmetic network according to the design constraints is difficult. Firstly, the task requires the developer to have both knowledge of computer arithmetic and experience with the target hardware platform. Secondly, because each operator can be adjusted individually, the number of possible combinations is so large that even an expert has difficulty finding the optimal one. Lastly, it is time-consuming to repeat the process when the design is retargeted to a new platform or when new techniques are introduced.

These design difficulties motivate our research on the Computer Arithmetic Synthesis Technology (CAST) system. The system will provide a new design methodology for easily searching for near-optimal implementation schemes. It will capture computer arithmetic knowledge and target platform attributes as software libraries, which the developer can then use to construct, evaluate and optimize the design. Using this methodology, the developer can concentrate on higher-level system design, and the process of adapting to new platforms or algorithms will be greatly simplified.

To achieve the above objectives, our CAST system will have the following features:

• The hardware components are represented as software objects, and the design can be described in a conventional software language.

• There is a unified constraint/attribute specification interface for users to set and modify the configuration of each object.

• The system will provide a simulation function to verify the behavior of the design at the software level.

• The system will include suitable constraint satisfaction and search algorithms to optimize the design automatically according to the user inputs.

• The system will generate output circuits in a form that can be passed directly to synthesis tools.

• There are helper functions to maintain the correct interface between objects, such as rounding between different precisions and converting between different number systems.

• CAST will also be able to adapt to new algorithms and new hardware platforms efficiently without modification of the design source code.

Currently, the CAST system is implemented in C++ and the output circuit takes the form of structural, synthesizable VHDL code. The implemented arithmetic operators include ADD, SUB and MUL in fixed point, floating point and logarithmic number systems. Elementary functions are implemented using table lookups.

The system has been successfully applied to several designs. First, we constructed an N-body force pipeline circuit [1]. Because of the large dynamic range of the input data, the logarithmic number system was used for some of the operators, and a hybrid structure mixing fixed-point 2’s complement and logarithmic numbers was constructed to balance the resource and precision requirements. The CAST system was also used to develop a complex Monte Carlo simulation engine [2]. Different precisions in both fixed- and floating-point formats were evaluated, and a Nelder-Mead search was implemented to find the most suitable bit width for each operator. In a recent study [3], a multiplier generator was added to the CAST system, allowing users to choose among different algorithms for implementing a parallel multiplier. Algorithms including Booth’s recoding and TDM were available within CAST, and the resulting multipliers were faster than those produced by existing commercial tools such as Xilinx XST and CoreGen.

1. REFERENCES

[1] K. Tsoi, H. Y. C.H. Ho, and P. Leong, “An Arithmetic Library and its Application to the N-body Problem,” in IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), April 2004, pp. 68–78.

[2] G. Zhang, P. Leong, C. Ho, K. Tsoi, C. Cheung, D. Lee, and W. Luk, “Monte Carlo Simulation using FPGAs,” submitted to IEEE Transactions on VLSI, 2004.

[3] K. Tsoi and P. Leong, “Mullet - A Parallel Multiplier Generator,” submitted to Proceedings of the International Workshop on Field Programmable Logic and Applications (FPL’05), 2005.
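The CAST features of representing hardware components as software objects, configuring them through a unified attribute interface, and verifying them by software-level simulation can be illustrated with a minimal C++ sketch. Everything named here (`FixedFormat`, `quantize`, `Operator`, `FixedAdd`, `FixedMul`, and the attribute keys `int_bits` and `frac_bits`) is hypothetical and chosen only for illustration; none of it is taken from the actual CAST sources, and saturating round-to-nearest is an assumed rounding policy.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical fixed-point format (not part of any published CAST interface).
struct FixedFormat {
    int intBits;   // bits in the integer part, including the sign bit
    int fracBits;  // bits in the fractional part
};

// Helper in the spirit of "rounding between different precisions":
// round x to the nearest value representable in fmt and saturate
// at the limits of the two's complement range.
double quantize(double x, const FixedFormat& fmt) {
    assert(fmt.intBits >= 1 && fmt.fracBits >= 0);
    const int total = fmt.intBits + fmt.fracBits;
    const double scale = std::ldexp(1.0, fmt.fracBits);     // 2^fracBits
    const double maxRaw = std::ldexp(1.0, total - 1) - 1.0;  // largest raw integer
    const double minRaw = -std::ldexp(1.0, total - 1);       // smallest raw integer
    double raw = std::nearbyint(x * scale);
    if (raw > maxRaw) raw = maxRaw;
    if (raw < minRaw) raw = minRaw;
    return raw / scale;
}

// Hypothetical base class: each operator is a software object that carries
// a generic attribute table and a software-level simulation function.
class Operator {
public:
    virtual ~Operator() = default;
    void setAttribute(const std::string& key, int value) { attrs_[key] = value; }
    // Throws std::out_of_range if the attribute was never set.
    int attribute(const std::string& key) const { return attrs_.at(key); }
    virtual double simulate(double a, double b) const = 0;

protected:
    FixedFormat outputFormat() const {
        return FixedFormat{attribute("int_bits"), attribute("frac_bits")};
    }

private:
    std::map<std::string, int> attrs_;
};

// Fixed-point adder model: exact sum, then rounding to the output format.
class FixedAdd : public Operator {
public:
    double simulate(double a, double b) const override {
        return quantize(a + b, outputFormat());
    }
};

// Fixed-point multiplier model: exact product, then rounding to the output format.
class FixedMul : public Operator {
public:
    double simulate(double a, double b) const override {
        return quantize(a * b, outputFormat());
    }
};
```

With this sketch, changing the precision of one operator amounts to a call such as `mul.setAttribute("frac_bits", 6)`, which is the kind of per-operator adjustment an automatic search could drive.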

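Booth’s recoding, named in the text as one of the algorithms offered by the multiplier generator, can be sketched in software as follows. This is the textbook radix-4 form, which halves the number of partial products; it is an assumption that it resembles what CAST generates, and the function names `boothRadix4` and `boothMultiply` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Recode a two's complement multiplier of width `bits` (even, at most 32)
// into bits/2 radix-4 Booth digits, least significant digit first.
// Each digit lies in {-2, -1, 0, 1, 2}.
std::vector<int> boothRadix4(std::int32_t multiplier, int bits) {
    assert(bits > 0 && bits % 2 == 0 && bits <= 32);
    const std::uint32_t y = static_cast<std::uint32_t>(multiplier);
    std::vector<int> digits;
    int prev = 0;  // implicit bit y[-1] = 0
    for (int i = 0; i < bits; i += 2) {
        const int b0 = static_cast<int>((y >> i) & 1u);        // y[i]
        const int b1 = static_cast<int>((y >> (i + 1)) & 1u);  // y[i+1]
        digits.push_back(prev + b0 - 2 * b1);
        prev = b1;
    }
    return digits;
}

// Software model of the multiplier: each digit selects 0, +/-1 or +/-2 times
// the multiplicand, weighted by 4^k, and the partial products are summed.
// The multiplier must be representable in `bits` bits.
std::int64_t boothMultiply(std::int32_t multiplicand, std::int32_t multiplier, int bits) {
    const std::vector<int> digits = boothRadix4(multiplier, bits);
    std::int64_t product = 0;
    for (std::size_t k = 0; k < digits.size(); ++k) {
        const std::int64_t weight = std::int64_t{1} << (2 * k);  // 4^k
        product += static_cast<std::int64_t>(digits[k]) * multiplicand * weight;
    }
    return product;
}
```

In hardware, each digit would drive a small selector that produces the shifted or negated multiplicand, and the partial products would be summed by an adder tree; the loop above models only the arithmetic, not the circuit structure.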