IEEE International Conference on Computer Vision and Pattern Recognition, 22-26 June 2003, Madison, Wisconsin

Pose reconstruction with an uncalibrated Computed Tomography imaging device

B. Maurin†, C. Doignon†, M. de Mathelin†, A. Gangi‡

† LSIIT (UMR CNRS 7005), Strasbourg I University, ENSPS, Bd. S. Brant, 67400 Illkirch, FRANCE — [email protected]

‡ Department of Radiology B, University Hospital of Strasbourg, 67091 Strasbourg, FRANCE



Abstract

In this paper, we address the problem of precisely recovering the 3-D pose of 3-D shape fiducials from images obtained by means of an uncalibrated Computed Tomography (CT) imaging device. The main goal of this work is to model and estimate the geometric transformation relating line fiducials to their projections in cross-sectional images. To do so, we propose techniques which solve the points-to-lines correspondence using closed-form and numerical algorithms. A geometric transformation with eight degrees of freedom (rotation, translation and anisotropic scaling) is used to model both a rigid-body transformation and a scaling transformation accounting for the CT scan intrinsic parameters. Furthermore, an estimation of error bounds in space is given when the image data are affected by noise. Real experiments show that the proposed method provides good results on a set of CT images from many viewpoints.

1 Introduction

Image-guided medical and surgical procedures are fast growing in today's hospitals. Computed Tomography (CT) and MRI are increasingly used for image guidance in clinical procedures (e.g. navigation, neurosurgery, percutaneous therapy). Image-guided robots have appeared to help radiologists and surgeons by, e.g., reducing their tiredness, obtaining precise displacements and offering better protection from X-rays [1, 4, 10]. An issue for the guidance of a robot is the precise localization of the end-effector in the intraoperative space. To do so, fiducials are placed on the robot or its end-effector. In [11], a method is presented to find the position and orientation of fiducials, and a visual servoing scheme is also proposed. Their work is based on the work of Brown in stereotaxy, who initiated the solution for radiosurgery applications in [3]. Lee et al. [8] propose some new methods, but they do not propose a real model for the imaging device. Moreover, they do not estimate the amount of error made in estimating the pose parameters.

We use a CT scanner as the imaging device because it is quite common and it places few restrictions on the kind of material used inside the viewing space. However, as for MRI, images are generated at a low resolution, so scaling and shearing distortions occur [2]. To overcome part of this problem, which has a harmful effect on 3-D localization, we propose an imaging model which involves an orthographic projection. Most CT imaging devices use proprietary algorithms to generate image slices, and these algorithms are usually not in the public domain. So we may consider this device as a black box and try to estimate some of its characteristics. The method presented here for localizing solid objects (fiducials) uses two new algorithms to find the unknown 3-D position and orientation from a single two-dimensional image. The fiducial we try to localize is made of 3-D lines.

A quasi-affine transformation with eight degrees of freedom (rotation, translation and anisotropic scaling) is used to model both a rigid-body transformation and a scaling transformation accounting for the CT scan intrinsic parameters. Since the reference object is represented by a set of calibrated line fiducials (see fig. 1) and since the cross-sectional CT images provide points (spots), we have to solve the lines-to-points correspondence.

The paper is organized as follows. In the next section, the object representation and the imaging device model are described. We detail algorithms for estimating the 3-D pose and intrinsic characteristics in sections 3 and 4. Error propagation through the computations is analyzed in both sections, and simulation results and real experiments are given in section 5.


Figure 1. (a) Representation of the line fiducials and a cutting plane; (b) the fiducial; (c) an image slice of the fiducial with the patient.

2 Problem statement

A CT scanner provides radio-opaque slices of objects; this is why the fiducials usually used are composed of straight lines ([2],[8]). In general, the intersection between the object lines and the cutting plane should provide as many spots as there are lines. However, several spots may be missing in the cross-sectional image or, on the contrary, some artifacts may accidentally appear [8]. These spots are easily detectable but not always well localized, due to the tilt of the plane with respect to the line directions (see the detected spots in Figure 1.c).

2.1 The imaging device model

To build a geometric model of the fiducial, let us denote by $\Delta_i$ the $i$-th line of this object. This line is defined by an origin $O_i$ and a unit vector $D_i$. The intersection of $\Delta_i$ and the cutting plane $(\pi)$ provides a point $P_{i/R_o} = \Delta_i \cap \pi$ in the object reference frame $(R_o)$: $P_{i/R_o} = O_i + \lambda_i D_i$. The imaging model we propose is based on an orthographic projection; it relates the point coordinates $P_{i/\pi} = (x_i, y_i, z_i)^T$ expressed in the CT scan reference frame $(R_\pi)$ to the image pixel coordinates $Q_{i/\pi} = (u_i, v_i)^T$ as:

$$
P_{/\pi} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \Pi \, S \, Q_{/\pi} \qquad (1)
$$

where $s_x$ and $s_y$ are positive scaling factors (mm/pix). Since there is a Euclidean transformation between the object reference frame $(R_o)$ and the CT scan reference frame $(R_\pi)$, one has $P_{i/R_o} = R\, P_{i/\pi} + t$, where $R$ is a rotation matrix and $t$ is a translation. The product $(R\, \Pi\, S)$ is a $(3\times2)$ matrix $L = (l_1, l_2)$ which must satisfy the following quadratic constraints:

$$
l_1^T l_1 = s_x^2, \qquad l_2^T l_2 = s_y^2, \qquad l_1^T l_2 = 0 \qquad (2)
$$
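As a minimal numeric check of the model (1) and the constraints (2), the following sketch builds $L = R\,\Pi\,S$ for an arbitrary rotation and illustrative (not calibrated) scale factors, and verifies that its columns carry the scales and are orthogonal:

```python
import numpy as np

# Hypothetical intrinsic scale factors (m/pix) -- illustrative values only.
sx, sy = 4.75e-4, 4.75e-4

Pi = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.0, 0.0]])        # orthographic projection of (1)
S = np.diag([sx, sy])

# An arbitrary rotation R, standing in for the rigid-body part of the transform.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

L = R @ Pi @ S                     # the (3x2) matrix L = (l1, l2)
l1, l2 = L[:, 0], L[:, 1]

# Quadratic constraints (2): column norms give the scales, columns are orthogonal.
assert np.isclose(l1 @ l1, sx**2)
assert np.isclose(l2 @ l2, sy**2)
assert np.isclose(l1 @ l2, 0.0)
```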

Therefore, one must solve for $L$, $t$ and $\{\lambda_i\}$ with:

$$
O_i = \begin{pmatrix} l_1 & l_2 \end{pmatrix} \begin{pmatrix} u_i \\ v_i \end{pmatrix} + t - \lambda_i D_i, \qquad l_1^T l_2 = 0 \qquad (3)
$$

We can reformulate (3) as an optimization problem with an equality constraint:

$$
\min_X \| A X - b \|^2 \quad \text{subject to} \quad X_1 X_4 + X_2 X_5 + X_3 X_6 = 0 \qquad (4)
$$

where all unknowns are stacked in a vector $X$:

$$
X = \left( l_1^T,\; l_2^T,\; t^T,\; \lambda_1, \cdots, \lambda_n \right)^T
$$

and:

$$
A = \begin{pmatrix}
U_1 & V_1 & I_{3\times3} & -D_1 & & 0 \\
\vdots & \vdots & \vdots & & \ddots & \\
U_n & V_n & I_{3\times3} & 0 & & -D_n
\end{pmatrix}
$$

with $U_i = u_i\, I_{3\times3}$, $V_i = v_i\, I_{3\times3}$, $b = \left( O_1^T, \cdots, O_n^T \right)^T$ and $I_{3\times3}$ the $(3\times3)$ identity matrix. It is worth pointing out that a nonzero $l_1^T l_2$ in (3) may account for a shearing parameter. Since we consider shearing not very significant for CT images (though it is for MRI), the orthogonality constraint $l_1^T l_2 = 0$ must be satisfied.
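The stacked system above can be assembled as follows — a sketch with a hypothetical function name and array layout of our own choosing:

```python
import numpy as np

def build_system(Q, O, D):
    """Stack the linear equations (3) into A X = b.

    Q : (n,2) image points (u_i, v_i); O, D : (n,3) line origins / unit directions.
    Unknowns: X = (l1, l2, t, lambda_1..lambda_n), i.e. 9 + n entries.
    """
    n = Q.shape[0]
    I3 = np.eye(3)
    A = np.zeros((3 * n, 9 + n))
    b = O.reshape(-1)                  # stacked O_i^T
    for i, (u, v) in enumerate(Q):
        r = slice(3 * i, 3 * i + 3)
        A[r, 0:3] = u * I3             # U_i = u_i I
        A[r, 3:6] = v * I3             # V_i = v_i I
        A[r, 6:9] = I3                 # coefficient of t
        A[r, 9 + i] = -D[i]            # -lambda_i D_i
    return A, b
```

A quick sanity check: for any ground-truth $X$, points $O_i$ generated from (3) must satisfy $AX = b$ exactly.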

3 Linear solution with a least-squares algorithm

In this section, we consider only the linear equations in (4); the quadratic constraint $l_1^T l_2 = 0$ will be introduced a posteriori. Considering $n$ lines in the phantom, the constrained linear system (4) has $(6+3+n)$ unknowns and $(3n+1)$ equations (one of which is quadratic). Thus, to solve it, one must have $3n+1 \geq 6+3+n$, hence $n \geq 4$, but this may lead to several solutions. Considering only the set of linear equations, one must have $n \geq \frac{9}{2}$, which we round up to 5 lines.

3.1 A linear five-points/line algorithm

The idea behind the five-point algorithm is very simple. Assume that $n$ correspondences have been established between image points and object lines. Each correspondence brings three linear equations for the $(9+n)$ entries of $X$. If at least five correspondences have been established (i.e., $n \geq 5$), $X$ can be determined as the solution of the system; this linear system is solved using the SVD. Once $L$, $t$ and $\{\lambda_i\}$ are estimated, the scaling factors in $S$ are computed as $s_x = \sqrt{l_1^T l_1}$ and $s_y = \sqrt{l_2^T l_2}$. The first two columns of the rotation matrix $R$ are recovered by dividing the columns of $L$ by $s_x$ and $s_y$ respectively (the third column is the cross-product of the first two). Because of noise in the detection of the image spot centers, numerical errors and inaccurate correspondences, $R$ is usually not exactly orthonormal ($R^T R \neq I_{3\times3}$). We can enforce the orthogonality constraint by computing the SVD of the estimated rotation, $\hat R = U \Sigma V^T$, and setting the singular values equal to 1. If $\Sigma'$ is the corrected $\Sigma$ matrix, the corrected estimate $R'$ is given by $R' = U \Sigma' V^T$ [12, 6]. The main advantage of this method is that robustness increases with the addition of image points, provided the noise on point locations has a Gaussian behaviour. The main drawback is that if an outlier is detected instead of a projection of an object line, the quadratic error will be significant and distributed over all estimated parameters.
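This recovery step can be sketched as follows, assuming $A$ and $b$ have already been assembled as in (4); `np.linalg.lstsq` stands in here for the explicit SVD solve:

```python
import numpy as np

def recover_pose(A, b):
    """Least-squares solve of (4) without the quadratic constraint, then
    extraction of s_x, s_y and an orthonormal R (section 3.1 sketch)."""
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    l1, l2, t, lam = X[0:3], X[3:6], X[6:9], X[9:]
    sx, sy = np.sqrt(l1 @ l1), np.sqrt(l2 @ l2)
    r1, r2 = l1 / sx, l2 / sy
    R_hat = np.column_stack([r1, r2, np.cross(r1, r2)])
    # Enforce orthonormality: set the singular values of R_hat to 1 [12, 6].
    U, _, Vt = np.linalg.svd(R_hat)
    return U @ Vt, t, sx, sy, lam
```

On noise-free synthetic data the estimate matches the ground-truth pose exactly; with noisy spot centers the SVD step is what restores a valid rotation.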

3.2 Pre-normalization

The previous resolution is very sensitive to the condition number of $A$, $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{\sigma_1(A)}{\sigma_n(A)}$, with $\sigma_i$ the $i$-th singular value of $A$ [5]. If we assume that pixels are in the range $[0, 512]$, then $A$ can contain values from $10^{-2}$ to 512, so a ratio of $10^5$ may occur between its elements. To lower the condition number, it is advisable to normalize the data coordinates with an affine transformation $N$ so as to bring the centroid of the set of all points ($\bar u = \sum_i u_i/n$, $\bar v = \sum_i v_i/n$) to the origin; an isotropic scaling factor is chosen as $\bar d = \frac{\sum_i \sqrt{(u_i-\bar u)^2 + (v_i-\bar v)^2}}{n\sqrt{2}}$, so that the $u$ and $v$ coordinates of a point are equally scaled [7]. If the image point $Q$ is expressed with homogeneous coordinates $(u, v, 1)^T$, the normalized coordinates are $\hat Q = N Q$ with:

$$
N = \begin{pmatrix} 1/\bar d & 0 & -\bar u/\bar d \\ 0 & 1/\bar d & -\bar v/\bar d \\ 0 & 0 & 1 \end{pmatrix}
$$

Other weighting methods can be applied to $A$ to make it better conditioned. But since they act on all elements of $A$, they do not offer the possibility to separate the true data (object model) from the noisy data (image points). For instance, one can write:

$$
\min_X \| A X - b \|^2 \cong \min_Y \| A G\, Y - b \|^2
$$

with $\mathrm{rank}(A) = \mathrm{rank}(G) = n$. If $G$ is chosen as the diagonal matrix of inverse column norms of $A$, then the condition number $\kappa_2(AG)$ is minimal (see [5], p. 264). During simulations, this weighting did not give significant improvements on $\kappa_2$: in our practical experiments, it modifies $\kappa_2$ by a factor of 1.5. On the other hand, a pre-normalization of the data with $N$ provides a gain of $10^2$ on $\kappa_2$, bringing the condition number down to $\kappa_2 \cong 10^2$ (with or without the matrix $G$).

3.3 Error bound for the least-squares

Once the minimization problem has been solved, we propose to estimate a bound on the errors on $X$. To do so, let $A^*$ and $A$ be the matrices accounting for the unknown (true) data and the measured data, respectively. Let $A = A^* + \delta A$, $X = X^* + \delta X$ with

$$
\delta A = \begin{pmatrix} \Delta U_1 & \Delta V_1 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ \Delta U_n & \Delta V_n & 0 & 0 \end{pmatrix}
$$

The components of this matrix contain the errors arising during feature extraction, i.e. errors on the estimated locations of the image points $Q_i$. Moreover, since we assume there is no error on the object model, $b$ is unchanged. The least-squares solution is such that:

$$
A^T A\, X = A^T b \qquad (5)
$$

Substituting the previous definitions and keeping only the first-order terms, we have:

$$
A^{*T} A^*\, \delta X = \delta A^T b - (A^{*T} \delta A + \delta A^T A^*)\, X^* \qquad (6)
$$

This last equation is linear in $\delta A$ and can be rewritten as:

$$
H(A^*)\, \delta X = M(A^*, X^*)\, \delta E \qquad (7)
$$

with $\delta E = \left( \Delta u_1, \Delta v_1, \cdots, \Delta u_n, \Delta v_n \right)^T$. In our case, $H = A^{*T} A^*$ is square and generally not singular, so we have $\delta X = H^{-1} M\, \delta E$. A linear programming method is used to compute the minimum/maximum values of $\delta X(i)$ using orthogonal directions $d_i$:

$$
\delta X(i) = \max_{\delta E} \; d_i^T H^{-1} M\, \delta E \qquad (8)
$$

By choosing $d_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$, i.e., 1 at the $i$-th position, (8) gives interval bounds on the errors $\delta X(i)$.
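For the special case of box-bounded pixel errors, $|\delta E_j| \leq \varepsilon$, the linear program (8) has a closed-form solution — the maximum of a linear functional over a box is attained at a vertex — which the following sketch exploits ($H$ and $M$ are assumed to be given):

```python
import numpy as np

def error_bounds(H, M, eps):
    """Componentwise bounds on dX = H^{-1} M dE for |dE_j| <= eps (cf. eq. (8)).

    The maximum of d_i^T H^{-1} M dE over the box |dE_j| <= eps is
    eps * sum_j |(H^{-1} M)_{i,j}| for each component i, so no general
    LP solver is needed for this particular bound.
    """
    K = np.linalg.solve(H, M)            # H^{-1} M
    return eps * np.abs(K).sum(axis=1)   # one bound per component dX(i)
```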

4 Nonlinear minimization

In this section, we consider the complete optimization problem (4). Since the least-squares solution cannot take the orthogonality constraint into account a priori, we look for an iterative method to solve it.

4.1 Newton-Raphson with equality constraints

To do so, we propose to use a Newton-Raphson method with equality constraints [9]. The problem is rewritten with Lagrange multipliers in the following manner:

$$
\min_{\{X,\mu\}} \; \mathcal{L} = f(X) + \mu\, h(X), \qquad h(X) = 0
$$

applied to (4):

$$
\mathcal{L} = \| A X - b \|^2 + \mu\, h(X) \quad \text{with} \quad h(X) = X^T C X
$$

where

$$
C = \frac{1}{2} \begin{pmatrix} 0_{3\times3} & I_{3\times3} & 0_{6\times(3+n)} \\ I_{3\times3} & 0_{3\times3} & \\ 0_{(3+n)\times6} & & 0_{(3+n)\times(3+n)} \end{pmatrix}
$$

is a $(9+n)$ square constraint matrix. Then, the Kuhn-Tucker conditions become:

$$
\nabla \mathcal{L} = 0 \iff \begin{cases} \nabla f + \mu \nabla h = 0 \\ h(X) = 0 \end{cases}
$$

We then look for the solution to this multi-dimensional problem using an iterative Newton-like minimization; the process consists in finding the best direction of descent. Let

$$
F = \begin{pmatrix} \nabla \mathcal{L} \\ h \end{pmatrix}, \qquad Y = \begin{pmatrix} X \\ \mu \end{pmatrix}, \qquad F(Y) = 0
$$

With the Newton method, at the $j$-th iteration we have

$$
Y_{j+1} = Y_j + \Delta Y_j, \qquad [\nabla F]_j \, \Delta Y_j = - F(Y_j) \qquad (9)
$$

which can be put together as

$$
\begin{pmatrix} \nabla^2 \mathcal{L} & \nabla h \\ (\nabla h)^T & 0 \end{pmatrix}_j \begin{pmatrix} \Delta X \\ \Delta \mu \end{pmatrix}_j = - \begin{pmatrix} \nabla \mathcal{L} \\ h \end{pmatrix}_j
$$

In this case, we have

$$
[\nabla F]_j = \begin{pmatrix} 2 A^T A + 2 \mu_j C & 2\, C X_j \\ 2\, (C X_j)^T & 0 \end{pmatrix}
$$

and finally $\Delta Y_j = -([\nabla F]_j)^{-1} F(Y_j)$. The initialization of this algorithm can either be the solution of the least-squares problem, or a random value that gives a nonsingular matrix $[\nabla F]_{j=0}$. The algorithm then converges as long as $[\nabla F]_j$ remains of full rank.

4.2 Error bounds for the Newton-Raphson algorithm

In this section, we assume that convergence has been reached, so that $\Delta X_j = 0$, $X^T C X = 0$ and $\mu = \text{const.}$, and we define perturbation errors on $A$, $X$ and $\mu$ as in section 3.3, with $\mu = \mu^* + \delta\mu$. Equation (9) can also be rewritten as:

$$
\begin{pmatrix} \nabla^2 \mathcal{L} & \nabla h \\ (\nabla h)^T & 0 \end{pmatrix}_j \begin{pmatrix} \Delta X_j \\ \mu_{j+1} \end{pmatrix} = - \begin{pmatrix} \nabla f \\ h \end{pmatrix}_j \qquad (10)
$$

with $\nabla f = 2 A^T (A X - b)$, from which one can obtain, after some computations (keeping only the first-order terms):

$$
\underbrace{\begin{pmatrix} (A^{*T} A^* + \mu^* C) & C X^* \\ X^{*T} C & 0 \end{pmatrix}}_{H} \begin{pmatrix} \delta X \\ \delta\mu \end{pmatrix} = \begin{pmatrix} \delta A^T b - (A^{*T} \delta A + \delta A^T A^*)\, X^* \\ 0 \end{pmatrix} \qquad (11)
$$

We can observe that the right-hand side of (11) looks like (6). We can then apply the same method as in (8) to obtain an error bound on $(\delta X, \delta\mu)^T$.
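The constrained Newton iteration of section 4.1 can be sketched on a toy instance as follows ($A$, $b$, $C$ in the test are illustrative stand-ins, not the paper's fiducial system):

```python
import numpy as np

def newton_kkt(A, b, C, X0, mu0=0.0, iters=50, tol=1e-10):
    """Newton-Raphson on the KKT system of section 4.1:
    min ||A X - b||^2  subject to  h(X) = X^T C X = 0."""
    X, mu = np.asarray(X0, float).copy(), float(mu0)
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(iters):
        grad_L = 2 * (AtA @ X - Atb) + 2 * mu * (C @ X)   # grad f + mu grad h
        h = X @ C @ X
        F = np.concatenate([grad_L, [h]])
        if np.linalg.norm(F) < tol:
            break
        CX2 = (2 * C @ X)[:, None]                        # grad h as a column
        J = np.block([[2 * AtA + 2 * mu * C, CX2],        # bordered system (9)
                      [CX2.T, np.zeros((1, 1))]])
        dY = np.linalg.solve(J, -F)                       # Delta Y_j
        X, mu = X + dY[:-1], mu + dY[-1]
    return X, mu
```

Because the objective is quadratic and the constraint bilinear, the iteration settles in very few steps, consistent with the fast convergence reported in section 5.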

5 Experimental Results

First of all, several simulations of the proposed method were conducted with Matlab, implemented with both the least-squares and the Newton-Raphson minimization. Robustness was examined by introducing varying levels of Gaussian noise on the input data, and errors were reported between the true and estimated scaling factors (m/pix), as well as for the $\{\lambda_i\}$ coefficients (m). The root-mean-square errors on the angles (in degrees) and position values (m) of the Euclidean transformation are shown

Figure 2. Reported rms errors on the intrinsic parameters S, on the 3-D parameters {λi}, and on the rotation angles and position vector, for a range of Gaussian noise magnitudes (in pixels), with the least-squares and Newton-Raphson methods. (plots omitted)

Figure 3. Positions (Tx, Ty, Tz) and rotation angles (Rx, Ry, Rz) along the CT image sequence with the two implemented methods. (plots omitted)

in figure 2. As might be expected, the Newton-Raphson method always provides better results than the least-squares method. Pose errors are of the order of 1 mm in position and 0.5 degree in rotation for half a pixel of inaccuracy in the image spot locations.

Real experiments were performed with a Siemens Somatom Plus CT scanner and a fiducial composed of six straight-line rods, as described in section 2. The fiducial was calibrated using a Mitutoyo measuring machine, and a precision of 1 µm is guaranteed for the model. Briefly, the image segmentation consists in locating the area of interest surrounding each blob with an adaptive image binarization. The location of each blob is then estimated by performing an edge detection (Canny's detector) inside each area of interest and computing the centroid of the detected edges. The correspondence problem is assumed to be solved for recovering the position and orientation of the phantom (see figure 3). The least-squares and Newton-Raphson minimization methods give similar results both for the extrinsic parameters (position and orientation) and for the intrinsic scaling factors (see figure 4). In practice, only 2 iterations are needed to obtain convergence, even when the procedure is started with a random value; this property is due to the quadratic expression of (9). A planarity criterion is also computed from the coordinates of the recovered 3-D points $\{P_i\}$ (related to the recovered values of $\{\lambda_i\}$) as the square root of the smallest positive eigenvalue of the scatter matrix $\sum_i P_i P_i^T$. Since its value is very small ($\simeq$ 0.2 mm), one can conclude that the points $\{P_i\}$ are almost coplanar. Finally, we present the estimated distance between two cubes. Each cube is estimated separately and, once the pose is found for both, we compute the relative translation. A constant translation of -0.118296 m along the Y axis was measured during calibration. In figure 5, we see how the reconstruction evolves along the Z axis (or frame number). Hence, we can say that the best reconstruction is obtained when the projections of the rods are not collinear.
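The planarity criterion can be sketched as follows; centering the points before forming the scatter matrix is our assumption, as the text does not state it explicitly:

```python
import numpy as np

def planarity(P):
    """Planarity criterion (section 5 sketch): sqrt of the smallest
    eigenvalue of the scatter matrix of the centered 3-D points {P_i}.

    A value near zero means the recovered points are almost coplanar.
    """
    Pc = P - P.mean(axis=0)                 # centering: our assumption
    scatter = Pc.T @ Pc / len(P)
    return np.sqrt(max(np.linalg.eigvalsh(scatter)[0], 0.0))
```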

Figure 4. (top) Scale factors found, compared to the constant values provided by the scanner (solid red line); (middle) rms errors between the corresponding image blob centers and the adjusted model data under the transformation found; (bottom) the planarity criterion of the 3-D points (intersections of the line fiducials and the cutting plane) computed with the transformation (R and t) and the {λi}'s found. (plots omitted)

Figure 5. Estimated translation between two cubes. (plot omitted)

6 Conclusion

We have presented in this paper an imaging model and algorithms for estimating the geometric transformation between a set of object line fiducials and the set of CT image points obtained as orthographic projections of these lines. Two algorithms have been presented: the first provides an analytical solution but cannot handle the orthogonality constraint; the second is an iterative Newton-like algorithm with fast convergence. Furthermore, a sensitivity analysis is given for both algorithms. This approach makes it possible to estimate on-line the intrinsic parameters of a CT imaging device as well as the viewpoint.

7 Acknowledgements

The authors wish to thank the Alsace Region for the support of this research work.

References

[1] P. Abolmaesumi, S. E. Salcudean, W.-H. Zhu, M. R. Sirouspour, and S. DiMaio. Image-guided control of a robot for medical ultrasound. IEEE Transactions on Robotics and Automation, 18(1):11–23, February 2002.
[2] M. Breeuwer, W. Zylka, J. Wadley, and A. Falk. Detection and correction of geometric distortion in 3D CT/MR images. In Proceedings of Computer Assisted Radiology and Surgery, pages 11–23, Paris, June 23–26, 2002.
[3] R. A. Brown, T. S. Roberts, and A. G. Osborne. Stereotactic frame and computer software for CT-directed neurosurgical localization. Invest. Radiol., 15:308–312, 1980.
[4] G. Fichtinger, T. L. DeWeese, A. Patriciu, A. Tanacs, D. Mazilu, J. H. Anderson, K. Masamume, R. H. Taylor, and D. Stoianovici. Robotically assisted prostate biopsy and therapy with intra-operative CT guidance. Journal of Academic Radiology, 9:60–74, 2002.
[5] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, third edition, 1996.
[6] D. Goryn and S. Hein. On the estimation of rigid body rotation from noisy data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12):1219–1220, December 1995.
[7] R. I. Hartley. In defense of the eight-point algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6):580–593, June 1997.
[8] S. Lee, G. Fichtinger, and G. S. Chirikjian. Numerical algorithms for spatial registration of line fiducials from cross-sectional images. Medical Physics, 29(8):1881–1891, August 2002.
[9] S. S. Rao. Engineering Optimization: Theory and Practice. Wiley-Interscience, John Wiley & Sons, third edition, 1996.
[10] M. Shi, H. Liu, and G. Tao. A stereo-fluoroscopic image-guided robotic biopsy scheme. IEEE Transactions on Control Systems Technology, 10(3):309–317, May 2002.
[11] R. C. Susil, J. H. Anderson, and R. H. Taylor. A single image registration method for CT guided interventions. In Proceedings of the Second International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI'99), pages 798–808, Cambridge, UK, September 19–22, 1999.
[12] S. Umeyama. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(4):376–380, April 1991.