Vision Based Motion Control
Martin Jagersand
University of Alberta
CIRA 2001
Content
1. Vision based motion control
2. Programming and solving whole human tasks
3. Software systems for vision and control
4. Discussion
1. How to go from visual sensation to motor action?
Camera -> Robot coordinates; Robot -> Object
Closed loop traditional visual servoing
This talk: focus on estimating the geometric transforms
Lots of possible coordinates
Camera
– Frame at projection center
– Many different models
Robot
– Base frame
– End-effector frame
– Object frame
Traditional modeling: P = P1(<params>) P2(<params>) … Pn(<params>)
Hand-Eye system
Motor-Visual function: y = f(x). Jacobian: J = (\partial f_i / \partial x_j)
Recall: Visual specifications
Point to Point task "error":

E = y^* - y_0 = \begin{pmatrix} y_1 \\ \vdots \\ y_{16} \end{pmatrix}^* - \begin{pmatrix} y_1 \\ \vdots \\ y_{16} \end{pmatrix}_0
Why 16 elements?
Visual Servoing
Observed features: y = (y_1, y_2, \ldots, y_m)^T
Motor variables: x = (x_1, x_2, \ldots, x_n)^T
Local linear model: \Delta y = J \Delta x
Visual servoing steps:
1. Solve: y^* - y_k = J \Delta x
2. Move: x_{k+1} = x_k + \Delta x
Find J Method 1: Test movements along basis

Remember: J is an unknown m-by-n matrix

J = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{pmatrix}

Assume movements along each motor basis direction:
\Delta x_1 = (1, 0, \ldots, 0)^T, \quad \Delta x_2 = (0, 1, \ldots, 0)^T, \quad \ldots, \quad \Delta x_n = (0, 0, \ldots, 1)^T

Finite difference: the observed visual changes form the columns of J:

J \approx \begin{pmatrix} | & | & & | \\ \Delta y_1 & \Delta y_2 & \cdots & \Delta y_n \\ | & | & & | \end{pmatrix}
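As a concrete sketch of Method 1, the test movements can be implemented as finite differences along each motor basis vector (numpy-based; the function name and step size are illustrative, not from the talk):

```python
import numpy as np

def estimate_jacobian_fd(f, x, step=1e-2):
    """Estimate the m-by-n visual-motor Jacobian by test movements
    along each motor basis direction (Method 1, finite differences)."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x), dtype=float)
    J = np.zeros((y0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = step                      # Delta x_j = step * e_j
        J[:, j] = (np.asarray(f(x + dx)) - y0) / step
    return J
```

Each column is the visual change for a unit move along one motor axis, divided by the step size.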
Find J Method 2: Secant Constraints

Constraint along a line: \Delta y = J \Delta x defines m equations
Collect n arbitrary, but different, measurement pairs (\Delta x, \Delta y)
Solve for J:

\begin{pmatrix} \cdots \; \Delta y_1^T \; \cdots \\ \cdots \; \Delta y_2^T \; \cdots \\ \vdots \\ \cdots \; \Delta y_n^T \; \cdots \end{pmatrix} = \begin{pmatrix} \cdots \; \Delta x_1^T \; \cdots \\ \cdots \; \Delta x_2^T \; \cdots \\ \vdots \\ \cdots \; \Delta x_n^T \; \cdots \end{pmatrix} J^T
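The stacked secant constraints are a linear least-squares problem for J^T; a minimal numpy sketch (function and variable names are mine):

```python
import numpy as np

def estimate_jacobian_secant(dxs, dys):
    """Solve the stacked secant constraints Delta y_i = J Delta x_i
    for J in the least-squares sense (Method 2).
    dxs: k-by-n matrix of motor moves; dys: k-by-m matrix of visual moves."""
    dxs = np.asarray(dxs, dtype=float)
    dys = np.asarray(dys, dtype=float)
    # Stacked system: dxs @ J.T = dys  =>  solve for J.T column-wise
    Jt, *_ = np.linalg.lstsq(dxs, dys, rcond=None)
    return Jt.T
```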
Find J Method 3: Recursive Secant Constraints

Based on the initial \hat{J} and one measurement pair (\Delta y, \Delta x), adjust \hat{J} s.t. \Delta y = \hat{J}_{k+1} \Delta x

Rank 1 update:
\hat{J}_{k+1} = \hat{J}_k + \frac{(\Delta y - \hat{J}_k \Delta x)\, \Delta x^T}{\Delta x^T \Delta x}

Consider rotated coordinates:
– Update same as finite difference for n orthogonal moves
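This rank-1 secant (Broyden) update can be transcribed directly (numpy sketch; reshaping to column vectors is my convention):

```python
import numpy as np

def broyden_update(J, dx, dy):
    """Rank-1 secant update (Method 3): adjust J so the new estimate
    maps the observed motor move dx exactly onto the visual move dy."""
    dx = np.asarray(dx, dtype=float).reshape(-1, 1)
    dy = np.asarray(dy, dtype=float).reshape(-1, 1)
    return J + (dy - J @ dx) @ dx.T / (dx.T @ dx)
```

After the update, the new estimate satisfies the secant constraint on the observed pair exactly.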
Trust region of J estimate

Let \delta_t be the trust region at time t
Define a model agreement: d_t = \frac{\|\Delta y_{\text{measured}}\|}{\|\hat{J} \Delta x\|}

Update the trust region recursively:

\delta_{t+1} = \begin{cases} \frac{1}{2}\,\delta_t & \text{if } d_t \le d_{\text{lower}} \\ \delta_t & \text{if } d_{\text{lower}} < d_t \le d_{\text{upper}} \\ \max(2\|\Delta x_t\|, \delta_t) & \text{if } d_t > d_{\text{upper}} \end{cases}

where d_{\text{lower}} and d_{\text{upper}} are predefined constants
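The recursive rule reads directly as a three-way branch; a sketch assuming d is the measured-to-predicted ratio above (the constant values are illustrative, not from the talk):

```python
def trust_region_update(radius, d, dx_norm, d_lower=0.5, d_upper=2.0):
    """Recursive trust-region radius update for the Jacobian estimate.
    d: model-agreement ratio ||dy_measured|| / ||J_hat dx|| (assumed);
    d_lower, d_upper: the predefined constants from the slide."""
    if d <= d_lower:
        return 0.5 * radius             # poor agreement: shrink
    if d <= d_upper:
        return radius                   # acceptable: keep
    return max(2.0 * dx_norm, radius)   # strong agreement: allow growth
```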
Visual Servoing Steps

1. Solve: y^* - y_k = J \Delta x
2. Update and move: x_{k+1} = x_k + \Delta x
3. Read actual visual move \Delta y_m
4. Update Jacobian: \hat{J}_{k+1} = \hat{J}_k + \frac{(\Delta y_m - \hat{J}_k \Delta x)\, \Delta x^T}{\Delta x^T \Delta x}
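Putting the four steps together, an uncalibrated servo loop might be sketched as below. The damped least-squares solve, gain, and iteration count are my illustrative choices, not prescribed by the talk:

```python
import numpy as np

def visual_servo(f, x0, y_star, J0, steps=100, gain=0.5):
    """Uncalibrated visual servoing: solve J dx = y* - y_k (damped
    least squares), move, read the actual visual change, and apply
    the rank-1 Broyden update to the Jacobian estimate each step."""
    x = np.asarray(x0, dtype=float).copy()
    J = np.asarray(J0, dtype=float).copy()
    y = np.asarray(f(x), dtype=float)
    for _ in range(steps):
        dx = gain * np.linalg.lstsq(J, y_star - y, rcond=None)[0]  # 1. solve
        x = x + dx                                                 # 2. move
        y_new = np.asarray(f(x), dtype=float)
        dy = y_new - y                                             # 3. actual visual move
        denom = dx @ dx
        if denom > 1e-12:                                          # 4. rank-1 update
            J = J + np.outer(dy - J @ dx, dx) / denom
        y = y_new
    return x, y
```

Note that no camera or robot calibration enters: J0 can be a rough guess (even identity) and is corrected on-line.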
Jacobians = Spline model of underlying non-linear function
Over time the system acquires several Jacobians J
Each J is a hyperplane
The collection of J's forms a (sparse) piecewise linear spline
Jacobian based visual model
Assume visual features m >> n motor freedoms
All visual change restricted to n freedoms by: \Delta y = J \Delta x
1. Can predict visual change
2. Can also parameterize x visually
Related visual model: Affine model

Affine basis: (e_0, \ldots, e_3) = ((0,0,0), (0,0,1), \ldots, (1,0,0))

Image projection of origin: P_0(y) = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}

Image basis: P_1(y) = \begin{pmatrix} y_3 & y_5 & y_7 \\ y_4 & y_6 & y_8 \end{pmatrix} - P_0(y)\,(1 \; 1 \; 1)
Find affine coordinates

Observe (track) y through time
Solve an equation system to find q:

\begin{pmatrix} p^1 - P_0(y^1) \\ \vdots \\ p^w - P_0(y^w) \end{pmatrix} = \begin{pmatrix} P_1(y^1) \\ \vdots \\ P_1(y^w) \end{pmatrix} q

Reprojection: have q, want y:
p = P_1(y)\, q + P_0(y) = P(y; q)
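Stacking the reprojection equations over w frames gives a small least-squares problem for q. A numpy sketch; the 8-vector layout y = (y_1, ..., y_8) follows the slides, the function and variable names are mine:

```python
import numpy as np

def affine_coords(ps, ys):
    """Solve the stacked reprojection equations p = P1(y) q + P0(y)
    for the affine coordinates q, in least squares over w frames.
    ys: w-by-8 tracked basis-point image coordinates; ps: w-by-2 points."""
    rows, rhs = [], []
    for y, p in zip(np.asarray(ys, dtype=float), np.asarray(ps, dtype=float)):
        P0 = y[:2]                                # image projection of origin
        P1 = y[2:].reshape(3, 2).T - P0[:, None]  # 2x3 image basis
        rows.append(P1)
        rhs.append(p - P0)
    A = np.vstack(rows)        # 2w-by-3 stacked basis
    b = np.concatenate(rhs)    # 2w right-hand side
    q, *_ = np.linalg.lstsq(A, b, rcond=None)
    return q
```

At least two frames (4 equations) are needed to fix the 3 affine coordinates.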
Relation Affine – Jacobian image models

Rewrite the affine model, stacking the x- and y-coordinates of all m image points:

\begin{pmatrix} p_{1x} \\ \vdots \\ p_{mx} \\ p_{1y} \\ \vdots \\ p_{my} \end{pmatrix} = \begin{pmatrix} q_1^T & 0 \\ \vdots & \vdots \\ q_m^T & 0 \\ 0 & q_1^T \\ \vdots & \vdots \\ 0 & q_m^T \end{pmatrix} \begin{pmatrix} y_3 - y_1 \\ y_5 - y_1 \\ y_7 - y_1 \\ y_4 - y_2 \\ y_6 - y_2 \\ y_8 - y_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ \vdots \\ y_1 \\ y_2 \\ \vdots \\ y_2 \end{pmatrix}
Composite affine and Jacobian model

Chain the affine and Jacobian models
Represents rigid objects in an arbitrary motor frame:

\begin{pmatrix} p_{1x} & p_{1y} \\ \vdots & \vdots \\ p_{mx} & p_{my} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \left( y_1 + J_0 x, \; y_2 + J_1 x \right) + \begin{pmatrix} q_1^T \\ \vdots \\ q_m^T \end{pmatrix} \left( \begin{pmatrix} y_3 & y_4 \\ y_5 & y_6 \\ y_7 & y_8 \end{pmatrix} + \left( J_{2:4}\, x, \; J_{5:7}\, x \right) \right)
Transforms Affine-Jacobian model

Measurement matrix:
M = \begin{pmatrix} M_b \\ M_d \end{pmatrix} = J_t X + f(x_t)\,(1, \ldots, 1)

Affine coordinate equation:
M_d^T - \begin{pmatrix} P_0(y^1) \\ \vdots \\ P_0(y^w) \end{pmatrix} (1, \ldots, 1) = \begin{pmatrix} P_1(y^1) \\ \vdots \\ P_1(y^w) \end{pmatrix} Q
Experiment: Affine animation of rigid structure
Affine vs. Visual-Motor
Other sensory modalities: Force and contact manipulation

Accuracy is limited by visual tracking and visual goal specification
Specifying well defined visual encodings can be difficult
Limited to non-occluded settings
Not all tasks lend themselves to visual specification
Constraint Geometry
Impact force along surface normal: p_1 = \frac{f_k}{\|f_k\|}

Sliding motion: p_2 = \frac{\Delta x_k}{\|\Delta x_k\|}

3rd vector: p_3 = p_1 \times p_2 = \frac{f_k}{\|f_k\|} \times \frac{\Delta x_k}{\|\Delta x_k\|}
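These three vectors transcribe directly to numpy (the function name is mine):

```python
import numpy as np

def constraint_frame(f, dx):
    """Build the constraint-frame axes from measured contact force f
    and sliding motion dx: p1 along the surface normal, p2 along the
    sliding direction, p3 = p1 x p2 completing the frame."""
    p1 = f / np.linalg.norm(f)     # impact force -> surface normal
    p2 = dx / np.linalg.norm(dx)   # sliding motion direction
    p3 = np.cross(p1, p2)          # third axis
    return p1, p2, p3
```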
Constraint Frame

With force frame = tool frame we get:
Assume frictionless => can update each time step:

P_{k+1} = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} f_k / \|f_k\| \\ \Delta x_k / \|\Delta x_k\| \\ (f_k / \|f_k\|) \times (\Delta x_k / \|\Delta x_k\|) \end{pmatrix}
Hybrid Control Law

Let Q be the Joint -> Tool Jacobian
Let S be a switching matrix, e.g. diag([0,1,1])
Velocity control u:

u_k = \underbrace{-K_v\, Q^{-1} P_k^{-1} S P_k\, Q J^{-1} e_k}_{\text{visual part}} \; \underbrace{-\; K_c\, Q^{-1} P_k^{-1} (I - S) P_k\, f_k}_{\text{force part}}
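A sketch of this control law, assuming a 3-DOF tool space where Q and P are square and invertible (gains, names, and the pseudo-inverse of J are my illustrative choices):

```python
import numpy as np

def hybrid_control(e, f, J, Q, P, S, Kv=1.0, Kc=0.1):
    """Hybrid visual/force velocity control: vision acts in the
    S-selected subspace of the constraint frame P, force in the
    complementary (I - S) subspace. J: visual-motor Jacobian,
    Q: joint-to-tool Jacobian, S: switching matrix, e.g. diag([0,1,1])."""
    Qi = np.linalg.inv(Q)
    Pi = np.linalg.inv(P)
    n = S.shape[0]
    vision_part = Kv * Qi @ Pi @ S @ P @ Q @ np.linalg.pinv(J) @ e
    force_part = Kc * Qi @ Pi @ (np.eye(n) - S) @ P @ f
    return -(vision_part + force_part)
```

With S = diag([0,1,1]), the first constraint-frame axis (the surface normal) is force-controlled while the remaining two are vision-controlled.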
Accounting for Friction

Friction force is along the motion direction! Subtract it out to recover the surface normal:

P_{k+1} = \begin{pmatrix} (f_k - p_2^T f_k\, p_2) \,/\, \|f_k - p_2^T f_k\, p_2\| \\ \Delta x_k / \|\Delta x_k\| \\ \big( (f_k - p_2^T f_k\, p_2) / \|f_k - p_2^T f_k\, p_2\| \big) \times (\Delta x_k / \|\Delta x_k\|) \end{pmatrix}
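The friction-compensated frame in numpy (illustrative; assumes 3-vectors):

```python
import numpy as np

def friction_compensated_frame(f, dx):
    """Constraint frame with friction removed: friction acts along the
    motion direction p2, so subtract the component of f along p2
    before normalizing, recovering the surface normal p1."""
    p2 = dx / np.linalg.norm(dx)       # sliding direction
    fn = f - (p2 @ f) * p2             # remove friction component of force
    p1 = fn / np.linalg.norm(fn)       # recovered surface normal
    p3 = np.cross(p1, p2)              # third axis
    return p1, p2, p3
```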
Motion Sequence
Summary of model estimation and visual motion control
Model estimation is on-line and requires no special calibration movements
The resulting Jacobians both model/constrain the visual situation and provide the visual-motor transform
Motion control goes directly from image-based error functions to motor control. No 3D world space needed.
2. How to specify a visual task sequence?
Example 1:
1. Grasp
2. Move in
3. Cut

Example 2:
1. Grasp
2. Reach close
3. Align
4. Turn
Recall: Parallel Composition Example

Visual error function "spelled out" (with · the dot product and × the cross product of image point coordinates):

E_{\text{wrench}}(y) = \begin{pmatrix} y_4 - y_7 \\ y_2 - y_5 \\ y_8 \cdot (y_3 \times y_4) \\ y_6 \cdot (y_1 \times y_2) \end{pmatrix}
Serial Composition: Solving whole real tasks

Task primitive/"link": A = (E_{init}, M, E_{final})
1. Acceptable initial (visual) conditions E_{init}
2. Visual or motor constraints M to be maintained
3. Final desired condition E_{final}

Task = A_1 A_2 \ldots A_k
“Natural” primitive links
1. Transportation
– Coarse primitive for large movements
– <= 3DOF control of object centroid
– Robust to disturbances
2. Fine Manipulation
– For high precision control of both position and orientation
– 6DOF control based on several object features

Example: Pick and place type of movement

3. Alignment??? To match transport final to fine manipulation initial conditions
More primitives

4. Guarded move
– Move along some direction until an external constraint (e.g. contact) is satisfied
5. Open loop movements
– When the object is obscured
– Or ballistic fast movements
– Note: can be done based on previously estimated Jacobians
Solving the puzzle…
Teaching and Programming in Visual Space
1. Tele Assistance
– A tele-operator views the scene through stereo cameras
– Objects to be manipulated are pointed out on-line
2. Visual Programming
– Off-line; like xfig or macpaint, but with a palette of motor actions
3. Teaching by Showing
– A (human) manipulation is tracked in visual space
– The tracked data is used to (automatically?) generate a sequence of visual goals
HCI: Direct manipulation. Example: xfig drawing program

Icons afford use
Results visible
Direct spatial action-result mapping

Matlab drawing:
line([10, 20],[30, 85]);
patch([35, 22],[15, 35], C);  % C complex structure
text(70,30,'Kalle');          % Potentially add font, size, etc.
Example: Visual programming
Task control summary
Servoing alone does not solve whole tasks
– Parallel composition: stacking of visual constraints to be simultaneously satisfied
– Serial composition: linking together several small movements into a chain of continuous movements

Vision-based user interface
– Tele-assistance
– Visual Programming
– Teach by showing
Types of robotic systems
(Chart: systems plotted along axes of Autonomy vs. Generality)
– Supervisory control
– Tele-assistance
– Programming by demonstration
– Preprogrammed systems
3. Software systems for vision-based control
Hand-Eye System
System requirements
Solve many very different motion tasks
– Flexible, teachable/re-programmable
Real time
– On special embedded computers or general workstations
– Different special HW, multiprocessors
Toolbox
System design
Interpreted "scripting" language gives flexibility
Compiled language needed for speed and HW interface

Examples (interpreted <-> glue <-> compiled):
– Matlab <-> dynamic linking (mex) <-> C, C++, Fortran
– Haskell <-> Greencard <-> C, C++
– PVM for distributed processes
Usage example:

Specialize robot:
– projandwait(zero3,'robotmovehill',A3D,'WaitForHill');
Initialize goals and trackers:
– [TrackCmd3D,N] = InitTrackers([1 1],[0,1]);
– PU = GetGoals([1 1],[0,1]);
Servo control:
– J3s = LineMove('projandwait',TrackCmd3D,J3i,PU,Ndi,err)
Software systems summary
Most current demos solve one specific movement
For solving many everyday tasks we need flexibility and reprogrammability
– Compiled primitive visual tracking
– Interpreted scripting language
– Higher order functions
Workshop conclusions
Sensing is unreliable and incomplete
– Can't reliably build internal 3D world models, but can use the real world as an external reference
A-priori object and world models are uncommon in human environments
– Estimate on-line, and only what's needed
Human users require human interaction techniques
– Interacting by visual pointing and gestures is natural
Action/Perception division in human and machine hand-eye syst.
Open questions?
For particular tasks what are the most natural representations and frames?
Global convergence of arbitrarily composed visual error functions?
Robustness? Interaction with other sensing modalities?
Feedback system
Fast internal feedback Slower external trajectory corrections
Short and long control loops
Applications for vision in User Interfaces
Interaction with machines and robots– Service robotics– Surgical robots– Emergency response
Interaction with software– A store or museum information kiosk
Service robots
Mobile manipulators, semi-autonomous
DIST TU Berlin KAIST
TORSO with 2 WAMs
Service tasks
This is completely hardwired! Found no real task on WWW.
But maybe the first applications will be in tasks humans can't do?
Why is humanlike robotics so hard to achieve?
See human task:
– Tracking motion, seeing gestures
Understand:
– Motion understanding: translate to the correct reference frame
– High level task understanding?
Do:
– Vision based control