Finding Extrinsic Parameters for Pan-Tilt-Slide System
1. Introduction
These days, I am developing a pan-tilt-slide motorized system equipped with a stereo vision sensor.
For the motion system, I chose Edelkrone (the video below shows an example bundle).
I came across the fact that they provide an SDK for Windows (now Linux is also supported) + Python,
and decided to purchase a bundle with pan + tilt + slide. Let us see how it looks (left picture):
The bundle of the above picture includes:
HeadPLUS v2 (pan and tilt)
SliderPLUS v5 PRO (sliding)
As can be seen, I mounted a ZED 2 camera instead of a DSLR :)
Robot arm - camera system
The motion system is a robot-arm-camera system,
where each joint, along with the slider, has a state as visualized on the right of the above picture.
Before proceeding, I define notations for ease of discussion. For consistency, the motor rotation axis is set to the $z$ axis of its local frame.
$q(t) = (d_s(t), \theta_p(t), \theta_t(t)) \in \mathbb{R}^3$: the state of the Edelkrone (slide, pan, tilt) at time $t$.
$T_{op}(t) \in SE(3)$: the transformation from the reference frame $O$ to the pan base frame $P$ after $d_s(t)$ and $\theta_p(t)$ are set. The frame $O$ is defined as the frame when the sliding base is unmoved (see $O$ in the above figure). In the current setup,
the rotational part of $T_{op}(t)$ is $R_{op}(t) = \mathrm{rot}_z(\theta_p(t))$,
and the translational part is $t_{op}(t) = \begin{bmatrix} 0 & d_s(t) & 0 \end{bmatrix}^T$.
Thus, this $T_{op}(t)$ is fully defined by the state $q(t)$.
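As a quick sanity check, $T_{op}$ can be written as a short function. This is a minimal numpy sketch following the definitions just given (the slide translates along the $y$ axis of $O$ and the pan rotates about $z$); the function names are mine, not from the Edelkrone SDK:

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix about the z axis (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def T_op(q):
    """Homogeneous transform O -> P for the state q = (d_s, theta_p, theta_t).

    Only the slide d_s and pan angle theta_p enter this transform:
    R_op = rot_z(theta_p), t_op = [0, d_s, 0]^T.
    """
    d_s, theta_p, _theta_t = q
    T = np.eye(4)
    T[:3, :3] = rot_z(theta_p)
    T[:3, 3] = [0.0, d_s, 0.0]
    return T

# Example: sliding 0.3 m with no pan gives a pure y translation.
T = T_op((0.3, 0.0, 0.5))
```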
$T_{pt}(t) \in SE(3)$: the transformation from the pan frame $P$ to the tilt frame $T$.
The rotational part is composed of the product of a fixed mounting rotation and the tilt motor rotation $\mathrm{rot}_z(\theta_t(t))$ (see the figure).
This part is also fully defined by the motor angle $\theta_t(t)$. Note that the image shows the case when pan = 90 deg.
The translational part is written as $t_{pt} \in \mathbb{R}^3$, BUT THIS IS UNKNOWN.
$T_{tc} \in SE(3)$: the transformation from $T$ to the camera base frame $C$. As said previously, the camera is a ZED 2 and we will use the following camera coordinate frame:
The rotational part is
$R_{tc} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}$
Let us write the translational part as $t_{tc} \in \mathbb{R}^3$. In a similar manner to $t_{pt}$, $t_{tc}$ is also UNKNOWN.
$\Delta T_c(t) \in SE(3)$ is the relative pose of the camera frame $C$ at time $t$ with respect to the frame $C$ at the initial time $\tau = 0$. As can be derived, $\Delta T_c(t) = T_{oc}(0)^{-1} T_{oc}(t)$, where $T_{oc}(t) = T_{op}(t)\, T_{pt}(t)\, T_{tc}$.
The ZED camera supports visual odometry and estimates $\Delta T_c(t)$, and the encoder values $q(t)$ can be easily obtained from the SDK. In this case, I want the forward kinematics and the inverse kinematics, meaning the mapping between $q(t) \leftrightarrow \Delta T_c(t)$ (more commonly, kinematics refers to the map from $q(t)$ to $T_{oc}(t)$, but this is not my interest).
The rotational part of $\Delta T_c(t)$ is not a problem, as it can be readily derived from $q(t)$. But the translation $\Delta t_c(q(t), t_{pt}, t_{tc})$ cannot be computed from $q(t)$ alone, as $t_{pt}$ and $t_{tc}$ are unknown. I ambitiously aim to estimate the translational parts using a dataset composed of the pairs $(\Delta T_c(t), q(t))$.
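To make the chain concrete, here is a hedged numpy sketch of the forward kinematics $T_{oc}(t) = T_{op}(t)\, T_{pt}(t)\, T_{tc}$ and the relative pose $\Delta T_c(t) = T_{oc}(0)^{-1} T_{oc}(t)$. The exact pan-to-tilt mounting rotation is only given by a figure in this post, so `R_MOUNT` below is an assumed stand-in, and the extrinsic translations are placeholders; only the structure of the chain is the point.

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def make_T(R, t):
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Fixed tilt-to-camera rotation R_tc from the text.
R_TC = np.array([[1, 0, 0],
                 [0, 0, 1],
                 [0, -1, 0]], dtype=float)

# ASSUMPTION: pan-to-tilt mounting rotation (defined only by a figure here).
R_MOUNT = rot_y(-np.pi / 2)

def T_oc(q, t_pt, t_tc):
    """Forward kinematics O -> C for state q = (d_s, theta_p, theta_t)."""
    d_s, th_p, th_t = q
    T_op = make_T(rot_z(th_p), [0.0, d_s, 0.0])   # slide + pan
    T_pt = make_T(R_MOUNT @ rot_z(th_t), t_pt)    # tilt motor
    T_tc = make_T(R_TC, t_tc)                     # camera mount
    return T_op @ T_pt @ T_tc

def delta_T_c(q, q0, t_pt, t_tc):
    """Relative camera pose Delta T_c = T_oc(0)^{-1} T_oc(t)."""
    return np.linalg.inv(T_oc(q0, t_pt, t_tc)) @ T_oc(q, t_pt, t_tc)
```

Note that the rotational part of `delta_T_c` depends only on the states, while the translational part also depends on the unknown `t_pt` and `t_tc`; this is exactly the structure the method exploits.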
2. Method
Arranging the unknowns
First, let us adopt a simple notation for the rotational and translational parts of $T_{oc}(t)$: $R_{oc}(t) \in SO(3)$ and $t_{oc}(t) \in \mathbb{R}^3$. Then the following holds:
$\Delta R_c(t) = R_{oc}(0)^T R_{oc}(t), \qquad \Delta t_c(t) = R_{oc}(0)^T \left( t_{oc}(t) - t_{oc}(0) \right)$
From the derivation, the translation $\Delta t_c(t)$ can be arranged into the following form:
$\Delta t_c(t) = A(q(t)) \begin{bmatrix} t_{pt} \\ t_{tc} \end{bmatrix} + b(q(t)),$
where $A(q(t)) \in \mathbb{R}^{3 \times 6}$ and $b(q(t)) \in \mathbb{R}^3$ are fully determined by $q(t)$. Also, an estimate of $\Delta t_c(t)$ can be obtained from the ZED visual odometry (although it is only an estimate).
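The derivation of $A$ and $b$ is not reproduced here, but expanding $t_{oc}(t) = t_{op}(t) + R_{op}(t)\, t_{pt} + R_{op}(t) R_{pt}(t)\, t_{tc}$ inside $\Delta t_c(t) = R_{oc}(0)^T ( t_{oc}(t) - t_{oc}(0) )$ gives one consistent choice. The numpy sketch below builds $A(q)$ and $b(q)$ this way and checks the affine form against the full chain; the mounting rotation `R_MOUNT` is my assumption, since the post defines it only by a figure.

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

R_TC = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]], dtype=float)
R_MOUNT = rot_y(-np.pi / 2)  # assumed pan-to-tilt mounting rotation

def frame_parts(q):
    """R_op, t_op, R_pt for a state q = (d_s, theta_p, theta_t)."""
    d_s, th_p, th_t = q
    return rot_z(th_p), np.array([0.0, d_s, 0.0]), R_MOUNT @ rot_z(th_t)

def A_b(q, q0):
    """A(q) and b(q) with Delta t_c = A(q) [t_pt; t_tc] + b(q)."""
    R_op, t_op, R_pt = frame_parts(q)
    R_op0, t_op0, R_pt0 = frame_parts(q0)
    R_oc0 = R_op0 @ R_pt0 @ R_TC               # camera orientation at t = 0
    A = R_oc0.T @ np.hstack([R_op - R_op0, R_op @ R_pt - R_op0 @ R_pt0])
    b = R_oc0.T @ (t_op - t_op0)
    return A, b

def delta_t_c(q, q0, t_pt, t_tc):
    """Ground-truth translation from the full chain, for checking."""
    def t_oc(qq):
        R_op, t_op, R_pt = frame_parts(qq)
        return t_op + R_op @ t_pt + R_op @ R_pt @ t_tc
    R_op0, _, R_pt0 = frame_parts(q0)
    R_oc0 = R_op0 @ R_pt0 @ R_TC
    return R_oc0.T @ (t_oc(q) - t_oc(q0))
```

For any states and any extrinsic guess, `A_b(q, q0)` reproduces `delta_t_c` exactly, confirming that the translation is affine in the stacked extrinsic vector.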
Finding extrinsic parameters $t_{pt}$ and $t_{tc}$
Assuming we have gathered $N$ data points $\{(q(t_n), \Delta t_c(t_n))\}_{n=1}^{N}$, the parameters $t_{pt}$ and $t_{tc}$ are found as a minimizer of
$\sum_{n=1}^{N} \left\| \Delta t_{c,n} - b(q_n) - A(q_n) \begin{bmatrix} t_{pt} \\ t_{tc} \end{bmatrix} \right\|^2.$
Stacking the $A(q_n)$ into $M \in \mathbb{R}^{3N \times 6}$ and the $\Delta t_{c,n} - b(q_n)$ into $y \in \mathbb{R}^{3N}$, we might want to solve this by M\y in MATLAB, which is equivalent to minimizing the above sum. However, the matrix $M$ has a rank deficiency (rank = 4). Why? We have to investigate how the extrinsics affect the translation $\Delta t_c(t)$.
Singularity
Singularity of the translation $\Delta t_c(t)$ occurs along two directions. Understandably, the length along the $z$ axis of $t_{pt}$ cannot affect the translational difference, since it lies along the pan rotation axis. Also, the combination of the $+x$ axis of $t_{pt}$ and the $+z$ axis of $t_{tc}$ contributes nothing: this combined axis does not contribute to the development of $\Delta t_c(t)$. In contrast, the combination of the $+x$ axis of $t_{pt}$ and the $-z$ axis of $t_{tc}$ acts as a radial arm length attached to the panning motor.
This means that we cannot determine the components of the extrinsics which belong to the nullspace of $M$. If we compute the nullspace with MATLAB by ns = null(M), the result is:
The above basis can be rearranged into more interpretable nullspace vectors $n_1$ and $n_2$: $n_1 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}^T$ and $n_2 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}^T$.
Using them, we can write any vector in the nullspace as $x_n = w_1^n n_1 + w_2^n n_2$; such vectors do not contribute to the development of $\Delta t_c(t)$, as $M x_n = 0$.
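This nullspace structure can be checked numerically. The numpy sketch below stacks $A(q_n)$ over random states for an assumed kinematic model (`R_MOUNT` is my stand-in for the figure-defined mounting rotation, chosen so that the tilt $z$ axis maps to the pan $-x$ axis as the text describes) and verifies that the rank is 4 with $n_1$ and $n_2$ in the nullspace:

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

R_TC = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]], dtype=float)
R_MOUNT = rot_y(-np.pi / 2)  # assumed pan-to-tilt mounting rotation

def A_of(q, q0):
    """A(q) such that Delta t_c = A(q) [t_pt; t_tc] + b(q)."""
    def parts(qq):
        _, th_p, th_t = qq
        return rot_z(th_p), R_MOUNT @ rot_z(th_t)
    R_op, R_pt = parts(q)
    R_op0, R_pt0 = parts(q0)
    R_oc0 = R_op0 @ R_pt0 @ R_TC
    return R_oc0.T @ np.hstack([R_op - R_op0, R_op @ R_pt - R_op0 @ R_pt0])

rng = np.random.default_rng(1)
q0 = (0.0, 0.0, 0.0)
M = np.vstack([A_of(tuple(rng.uniform(-1, 1, 3)), q0) for _ in range(20)])

n1 = np.array([1.0, 0, 0, 0, 0, 1.0])  # +x of t_pt with +z of t_tc
n2 = np.array([0, 0, 1.0, 0, 0, 0])    # z of t_pt, along the pan axis

rank = np.linalg.matrix_rank(M)        # 4: two directions are unobservable
```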
Finding extrinsic
We perform a singular value decomposition of $M$ to distinguish the axes which affect $\Delta t_c(t)$ from those which do not. Let us write the basis of the rank-4 subspace orthogonal to the nullspace as $V = \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix}$, where $v_n \in \mathbb{R}^6$. The basis can be readily determined by [U,S,V] = svd(M) and taking V = V(:,1:4). Now, we express a vector in this subspace as $x_v = V \begin{bmatrix} w_1^v & w_2^v & w_3^v & w_4^v \end{bmatrix}^T = V w_v$.
Returning to the extrinsic parameters, they can be written as the sum of $x_v$ and $x_n$:
$\begin{bmatrix} t_{pt} \\ t_{tc} \end{bmatrix} = x_v + x_n$
Here, our aim is to find $w_v \in \mathbb{R}^4$ to determine $x_v$ (we cannot recover $x_n$). The equation of interest, $y = M \begin{bmatrix} t_{pt} \\ t_{tc} \end{bmatrix}$, can be arranged into
$y = M (x_v + x_n) = M x_v = M V w_v = M_v w_v$
Taking the pseudo-inverse via M_v \ y in MATLAB, $w_v$ and thus $x_v$ are computed. In my implementation:
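The pipeline can be sketched end to end in numpy (MATLAB's M_v \ y corresponds to `np.linalg.lstsq`). The data here are simulated from an assumed kinematic model with a made-up ground-truth extrinsic (`R_MOUNT` and `x_true` are mine, not measured values); the point is that the recovered $x_v$ equals the projection of the true extrinsic onto the observable subspace, and it predicts the translations exactly:

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

R_TC = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]], dtype=float)
R_MOUNT = rot_y(-np.pi / 2)  # assumed pan-to-tilt mounting rotation

def A_b(q, q0):
    def parts(qq):
        d_s, th_p, th_t = qq
        return rot_z(th_p), np.array([0.0, d_s, 0.0]), R_MOUNT @ rot_z(th_t)
    R_op, t_op, R_pt = parts(q)
    R_op0, t_op0, R_pt0 = parts(q0)
    R_oc0 = R_op0 @ R_pt0 @ R_TC
    A = R_oc0.T @ np.hstack([R_op - R_op0, R_op @ R_pt - R_op0 @ R_pt0])
    b = R_oc0.T @ (t_op - t_op0)
    return A, b

rng = np.random.default_rng(2)
q0 = (0.0, 0.0, 0.0)
x_true = np.array([0.05, 0.02, 0.10, 0.01, 0.08, 0.03])  # made-up [t_pt; t_tc]

# Simulate noiseless measurements and stack M and y.
M_rows, y_rows = [], []
for _ in range(15):
    q = tuple(rng.uniform(-1, 1, 3))
    A, b = A_b(q, q0)
    delta = A @ x_true + b     # simulated VO translation Delta t_{c,n}
    M_rows.append(A)
    y_rows.append(delta - b)   # y_n = Delta t_{c,n} - b(q_n)
M = np.vstack(M_rows)
y = np.concatenate(y_rows)

# Restrict to the observable subspace via the SVD, then solve least squares.
U, S, Vh = np.linalg.svd(M, full_matrices=False)
V4 = Vh[:4].T                  # basis of the rank-4 observable subspace
M_v = M @ V4
w_v = np.linalg.lstsq(M_v, y, rcond=None)[0]
x_v = V4 @ w_v                 # observable part of the extrinsics
```

Because the data are noiseless, `x_v` matches the projection `V4 @ V4.T @ x_true` of the ground truth, and `M @ x_v` reproduces `y` even though the nullspace components of `x_true` are never recovered.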
Now, we can compute $\Delta t_c(t)$ once we have $q(t)$, and the two unknowns $w_1^n$ and $w_2^n$ do not appear in the resulting calculation.
3. Result
Here, we compare 1) the ZED visual odometry and 2) the forward-kinematics delta computed as $A(q(t)) \begin{bmatrix} t_{pt} \\ t_{tc} \end{bmatrix} + b(q(t))$, given the Edelkrone state $q(t)$ and the computed extrinsics.
I tested two cases which were not included in computing the extrinsics.
Case 1 (simple one-way motion)
The plot below shows the pose history of the ZED camera (x-forward = optical axis).