
FPGA IMPLEMENTATION OF AN AUGMENTED REALITY APPLICATION FOR VISUALLY IMPAIRED PEOPLE

F. Javier Toledo, J. Javier Martínez, F. Javier Garrigós, J. Manuel Ferrández
Dpto. Electrónica, Tecnología de Computadoras y Proyectos, Universidad Politécnica de Cartagena
C/ Dr. Fleming s/n, 30202, Cartagena, SPAIN
email: [email protected]

1. INTRODUCTION

Augmented reality (AR) is a highly interdisciplinary field which has received increasing attention since the late 1990s. Basically, it consists of a combination of the real scene viewed by a user and a computer-generated image, running in real time. AR thus allows the user to see the real world supplemented with information considered useful, enhancing the user's perception and knowledge of the environment. Many areas of knowledge are involved in AR, including computer vision, image and signal processing, wearable computing and information visualization. Applications span a wide range, from entertainment to military purposes, including medical visualization, engineering design, manufacturing and, as proposed in this work, impairment aids. The benefits of reconfigurable hardware for AR have been explored by Luk et al. [1]. However, the vast majority of AR systems have so far been based on PCs or workstations. In our work, an FPGA-based AR application is developed for people affected by tunnel vision. This condition consists of a loss of peripheral vision, while retaining high resolution central vision, and is associated mainly with eye diseases such as retinitis pigmentosa and glaucoma. The loss of the peripheral visual field considerably affects the patient's ability to localize objects or persons and to navigate, and consequently his relationship with people and the environment. The devices traditionally used to aid affected people are based on techniques for optically reducing the size of objects. This minification process gives the user a wider field of view, but at the expense of lessening the high resolution central vision. To overcome this disadvantage, Peli et al. [2] have proposed an augmented reality-based method to enhance the user's knowledge of the environment by superimposing on his own view the contour information of the image obtained from a camera, by means of a see-through Head Mounted Display (HMD). They draw the conclusion that, although patients consider the system useful for navigating and obstacle avoiding, a specifically designed system to perform image processing and increase frame rate is necessary. Obviously, an effective improvement of the user's environment perception requires real time processing. With this aim, a system based on the FPGA implementation of the Canny edge detection algorithm to extract contour information has been recently developed [3]. Now, we propose the use of CNNs, which can be tuned to produce the desired results and increase the versatility of the system through the possibility of using different templates.

2. CNN HARDWARE IMPLEMENTATION

Since Chua and Yang proposed the Cellular Neural Network (CNN) in 1988, a wide field of research has developed around its applications and implementation. Nowadays, the advantages of combining massive parallelism with local interaction among cells are well known, specifically for image processing. Regarding the hardware, VLSI implementation has so far been the most popular choice to exploit the benefits of the CNN structure, but the emergence of Field Programmable Gate Array (FPGA) devices offers a very interesting alternative to ASICs [4]. The dynamics defining the behaviour of the Chua-Yang CNN are given by the state equation (1) and by the activation function f(x_{ij}) in eq. (2):

C \frac{dx_{ij}}{dt} = -\frac{1}{R} x_{ij} + \sum_{k,l \in N_r(ij)} A_{kl}\, y_{kl} + \sum_{k,l \in N_r(ij)} B_{kl}\, u_{kl} + I_{ij},    (1)

y_{ij} = f(x_{ij}) = \frac{1}{2} \left( |x_{ij} + 1| - |x_{ij} - 1| \right),    (2)

where I, u, y and x denote the input bias, input, output and state variable of each cell, respectively. B is the constant weights template applied to the inputs and A is the corresponding feedback template applied to the outputs of neighbouring cells. The neighbourhood of radius r for cell (i, j) is given by the function N_r(ij), where i and j denote the position of the cell in the network and k

This research is being funded by Ministerio de Ciencia y Tecnología under grant TIC 2003-09557-C02-02.

0-7803-9362-7/05/$20.00 ©2005 IEEE
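As a sketch of the dynamics in eqs. (1) and (2), the following Python fragment implements the PWL activation and a single forward-Euler state update for one cell. The function names, the step size dt and the choice C = R = 1 are illustrative assumptions, not taken from the paper's hardware design.

```python
import numpy as np

def pwl(x):
    # Piecewise-linear activation of eq. (2): f(x) = (|x + 1| - |x - 1|) / 2
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def euler_cell_step(x, y_neigh, u_neigh, A, B, I, dt=0.1, C=1.0, R=1.0):
    # One forward-Euler step of state equation (1) for a single cell:
    # x[n+1] = x[n] + (dt/C) * (-x[n]/R + sum(A*y) + sum(B*u) + I)
    dx = -x / R + np.sum(A * y_neigh) + np.sum(B * u_neigh) + I
    return x + (dt / C) * dx
```

Note that pwl saturates exactly at ±1 once |x| >= 1, which is what makes the steady-state outputs binary for templates that drive the state out of the linear region.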

Fig. 2. Simulation of patient's view through the HMD in an outdoor environment. A residual 10° field of view has been considered to simulate the tunnel vision effect.

and l the position of the neighbour cell relative to the considered cell. Finally, the non-linear output activation function corresponds to the piecewise-linear (PWL) operator. In order to obtain a discrete CNN model to be implemented on the FPGA, four different approaches have been analyzed: approximation with the Euler method, with the backward-difference time-discretization algorithm (TDA-backward), with the Tustin numeric integration algorithm (TIA-Tustin) and with the first-order response-invariant transformation (RIT). The first presents a predictive behaviour, the second and third a delay in their responses, and the fourth matches the response of the ideal continuous output. Since the simulations and temporal analysis carried out show that the Euler model presents the best behaviour, this approximation has been adopted, leading to the discrete model depicted in Fig. 1.

Fig. 1. The proposed CNN discrete model.
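To illustrate how a Euler-discretized CNN of this kind behaves with a concrete template, the following NumPy sketch iterates the update over a small synthetic image using the well-known CNN edge-extraction template. The template values, the step size and the iteration count are assumptions for illustration, not the ones used in the paper's hardware.

```python
import numpy as np

# Assumed values: the classic CNN edge-extraction template
# (feedback A acts only on the cell itself, B is a Laplacian-like mask).
A = np.zeros((3, 3)); A[1, 1] = 1.0
B = np.array([[-1., -1., -1.],
              [-1.,  8., -1.],
              [-1., -1., -1.]])
I = -1.0

def pwl(x):
    # Piecewise-linear activation of eq. (2)
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def conv3x3(img, k):
    # 3x3 neighbourhood correlation with zero padding ('same' size)
    h, w = img.shape
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * p[di:di + h, dj:dj + w]
    return out

def cnn_edges(u, steps=30, dt=0.1):
    # Forward-Euler iteration of the CNN state equation (C = R = 1 assumed).
    # The input contribution B*u + I is constant, so it is precomputed.
    x = np.zeros_like(u)
    bu = conv3x3(u, B) + I
    for _ in range(steps):
        y = pwl(x)
        x = x + dt * (-x + conv3x3(y, A) + bu)
    return pwl(x)
```

Run on an 8 x 8 image with a +1 (black) square on a -1 (white) background, the network settles to +1 exactly on the square's boundary pixels and -1 elsewhere, i.e. a contour map of the kind the system superimposes on the patient's view.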

3. SYSTEM DESCRIPTION

The proposed system consists basically of a camera to acquire images of the environment, a head mounted display to visualize the information that enhances the user's vision, a Xilinx Virtex-II FPGA as processing and control unit, and SRAM memory to store information. For the CNN, we have adopted a pixel-pipelined approach, where 5 stages are connected in cascade. Each stage is a discrete cellular neuron, whose model has been derived by approximating the continuous function. The connectivity among cells is restricted to a 3 × 3 neighbourhood. With suitable templates, the CNN extracts the contour information to be superimposed on the patient's view. Due to their large size, the data passed between the different processing stages cannot be held in the internal BlockRAM memory, and must be stored in external SRAM memory. The CNN and the interfaces with the camera, the HMD and the SRAM memory have been developed in VHDL, synthesized with XST and implemented on a XC2V4000 FPGA, using Xilinx ISE 6.3i. The design occupies 5.25% of the slices, 0.34% of the flip-flops and 75% of the multipliers. Our system is completed with a Sony Glasstron PLMS700E HMD and a monochrome digital output camera. The Sony Glasstron HMD is a binocular, high resolution device with SVGA/VGA input. Its design and light weight allow it to be used with correction glasses and ease the movement of the head and the mobility of the person. Its adjustable see-through capability makes it possible to enhance the user's vision, in our work with the contour information extracted by the CNN. These characteristics make the Glasstron HMD a suitable choice for our application. The camera generates a 384 × 288 pixel image at 50 frames per second, using a CMOS image sensor from OmniVision. A digital interface facilitates its configuration and initialization, executed from the FPGA. Since the image from the camera must be properly minified after processing in order to show a wider field of view to patients in their residual central vision, a higher resolution camera is not considered useful. An example of the patient's view through the HMD is shown in Fig. 2. A 57.9° (H) × 45.7° (V) lens has been used. A wider field of view can be acquired by the camera with the appropriate lens.

4. FUTURE WORKS

Future works will focus on the CNN architecture, which must be optimized to make a fully parallel implementation feasible. The advantages of dynamic reconfigurability will be exploited to increase the versatility. Additional features will also be added to the system, such as a user interface to customize different operation parameters.

5. REFERENCES

[1] W. Luk, T. Lee, J. Rice, and P. Cheung, "Reconfigurable computing for augmented reality," in Proc. IEEE Symp. Field-Programmable Custom Computing Machines, 1999, pp. 136-145.
[2] F. Vargas-Martín and E. Peli, "Augmented-view for restricted visual field: multiple device implementations," Optometry and Vision Science, vol. 79, no. 11, pp. 715-723, 2002.
[3] F. J. Toledo, J. J. Martínez, F. J. Garrigós, and J. M. Ferrández, "Reconfigurable hardware for an augmented reality application," in Proc. 2nd SPIE Int. Conf. on Bioengineered and Bioinspired Systems, vol. 5839, 2005, pp. 389-397.
[4] J. J. Martínez, F. J. Toledo, and J. M. Ferrández, "Implementation of a discrete cellular neuron model (DT-CNN) architecture on FPGA," in Proc. 2nd SPIE Int. Conf. on Bioengineered and Bioinspired Systems, vol. 5839, 2005, pp. 332-340.
