Robot Vision: Multi-sensor Reconnaissance
Overview
• An individual robot can develop an interpretation about its environment.
• Groups of robots can combine their resources to develop an improved, collective interpretation.
• Such a “swarm” relies on centralized processing to refine the collective interpretation.
• The swarm has more insight than solitary individuals.
Single-sensor Vision
• Our setup involves numerous robots, each with a single video camera.
• What can a single-camera robot see?
• A background image may be used to establish a baseline of irrelevant imagery.
• Anything that differs from the background is considered relevant – background subtraction.
(Figure: camera frame minus background image equals foreground.)
Making Sense of Pixel Data
• Background subtraction and classical image processing algorithms can mark every pixel in every frame as foreground (interesting) or background (uninteresting).
• Conventional techniques can clean noise.
• Regions of contiguous foreground pixels are likely to constitute a single object: a “blob.”
• This assumption is false when one object occludes another.
• Later we will see a way around this.
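The pipeline above – subtract the background, clean the noise, group contiguous foreground pixels into blobs – can be sketched as follows. This is a minimal illustration using NumPy and SciPy; the threshold value and the `find_blobs` helper are hypothetical choices, not the system’s actual parameters.

```python
import numpy as np
from scipy import ndimage

def find_blobs(frame, background, threshold=30):
    """Mark pixels that differ from the background, clean noise,
    then group contiguous foreground pixels into labeled blobs."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    foreground = diff > threshold                    # per-pixel fg/bg mask
    foreground = ndimage.binary_opening(foreground)  # remove speckle noise
    labels, n_blobs = ndimage.label(foreground)      # connected components
    return labels, n_blobs

# Toy 8-bit grayscale frames: one bright square appears against a dark scene.
background = np.zeros((20, 20), dtype=np.uint8)
frame = background.copy()
frame[5:10, 5:10] = 200
labels, n = find_blobs(frame, background)
print(n)  # → 1 (a single blob)
```

Note that two overlapping squares would be labeled as a single blob – exactly the occlusion problem the slide mentions.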
Developing the Individual’s Interpretation
• Pixel-space blobs have measurable properties: color, shape, and location.
• Geometric properties (location, size) are relative to the camera’s perspective.
• A single robot can provide a list of objects with a bearing to each, but has no depth perception.
• Each robot’s interpretation may be combined to form a much stronger interpretation for the entire swarm.
Developing the Swarm’s Interpretation
• Each robot has a list of bearings to potential objects. This information can be visualized as rays originating from the robot’s location.
• The intersections of these rays represent potential objects in 3D space.
• Many intersections are spurious – two rays can cross at a point where no object actually exists.
• Many intersections conflict with others – each ray can only correspond to one object.
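Intersecting two bearing rays can be posed as a small least-squares problem. The sketch below works in the 2D ground plane for simplicity (in 3D the rays are generally skew, and the midpoint of closest approach plays the same role); the function name and robot positions are illustrative.

```python
import numpy as np

def ray_intersection(p1, d1, p2, d2):
    """Point nearest to two bearing rays: robot positions p1, p2 and
    unit bearing directions d1, d2. Solves p1 + t1*d1 ≈ p2 + t2*d2."""
    A = np.column_stack([d1, -d2])
    b = p2 - p1
    (t1, t2), *_ = np.linalg.lstsq(A, b, rcond=None)
    # Midpoint of the closest-approach points on the two rays
    return 0.5 * ((p1 + t1 * d1) + (p2 + t2 * d2))

# Two robots sight the same object from different positions.
p = ray_intersection(np.array([0.0, 0.0]), np.array([1.0, 1.0]) / np.sqrt(2),
                     np.array([4.0, 0.0]), np.array([-1.0, 1.0]) / np.sqrt(2))
print(p)  # → [2. 2.]
```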
Culling False Objects
• Algorithm: group compatible locations together. This yields disjoint sets of intersections that can coexist.
• The set with the most supporting evidence wins.
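One simple way to realize “each ray can only correspond to one object” is a greedy approximation of the grouping above: accept the best-supported candidates first and discard any candidate that reuses an already-explained ray. The set-of-ray-ids representation here is a hypothetical simplification of the full algorithm.

```python
def cull_intersections(candidates):
    """Greedy culling: each candidate intersection is the set of ray ids
    that support it. A ray can explain only one object, so keep the
    best-supported candidates whose rays don't conflict."""
    used_rays = set()
    accepted = []
    for rays in sorted(candidates, key=len, reverse=True):
        if not rays & used_rays:   # compatible with the choices so far
            accepted.append(rays)
            used_rays |= rays
    return accepted

# {1,2,3} has the most support; {1,2} and {3,4} conflict with it.
result = cull_intersections([{1, 2}, {1, 2, 3}, {3, 4}, {5, 6}])
print(result)  # → [{1, 2, 3}, {5, 6}]
```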
Combining the Individual and Swarm Interpretations
• At this point we have a set of objects with 3D locations.
• Individual robots can provide silhouettes of the objects.
• This information may be combined to create a 3D shape.
• Incorporating past history can strengthen our conclusions.
3D Hulls
• Each camera contributes a silhouette of an object, and a ray on which the silhouette lies.
• Projecting the silhouette along the ray forms a “cone”.
• The intersection of these cones carves a 3D solid; imagine pushing cookie cutters through space.
• The solid is guaranteed to enclose the true shape, but will be convex, i.e., it ignores indentations.
• Such an upper bound is termed a hull.
• Hulls are typically represented as a mesh of triangles.
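The cookie-cutter intuition can be sketched as voxel carving: keep only the 3D points whose projection into every camera lands inside that camera’s silhouette. The orthographic `project` functions and the grid below are toy stand-ins for real camera calibration, not the system’s geometry.

```python
import numpy as np

def carve_hull(grid_pts, cameras):
    """Visual-hull sketch by voxel carving. `cameras` is a list of
    (project_fn, silhouette_mask) pairs; a point survives only if it
    projects inside every silhouette."""
    keep = np.ones(len(grid_pts), dtype=bool)
    for project, silhouette in cameras:
        for i, pt in enumerate(grid_pts):
            u, v = project(pt)
            inside = (0 <= u < silhouette.shape[1] and
                      0 <= v < silhouette.shape[0] and
                      silhouette[v, u])
            keep[i] &= inside
    return grid_pts[keep]

# A 5x5x5 grid of candidate voxels and a 3x3 square silhouette.
grid = np.array([(x, y, z) for x in range(5)
                           for y in range(5)
                           for z in range(5)])
sil = np.zeros((5, 5), dtype=bool)
sil[1:4, 1:4] = True
cams = [(lambda p: (p[0], p[1]), sil),   # camera looking down the z axis
        (lambda p: (p[1], p[2]), sil)]   # camera looking down the x axis
hull = carve_hull(grid, cams)
print(len(hull))  # → 27 voxels survive the two cookie-cutter cuts
```

Real systems carve a much finer grid (or intersect the cones analytically) and then extract the triangle mesh from the surviving voxels.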
Applications of 3D Hulls
• The 2D silhouettes can be “painted” on the mesh.
• The solid can be rendered from any angle.
• 3D shape may be used to classify objects – threat assessment, for example.
• Meshes may be recorded for future use…
Establishing Object Tracks
• Object records from successive frames can be combined to establish a log of known objects.
• These “object tracks” can aid future processing, establishing a positive feedback loop:
– Distinguishing between one large object and two objects close to one another.
– Past motion can predict where objects will be located, minimizing the occlusion problem mentioned earlier.
– Past hulls can predict how an object’s silhouette will appear in each camera.
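The motion-prediction idea above can be sketched with a deliberately simple constant-velocity model plus nearest-neighbor association; the function names, distance threshold, and track format are illustrative assumptions, not the system’s tracker.

```python
import numpy as np

def predict(track):
    """Constant-velocity guess for a track's next position,
    extrapolated from its last two observed positions."""
    prev, last = track[-2], track[-1]
    return 2 * last - prev

def associate(tracks, detections, max_dist=2.0):
    """Nearest-neighbor association: attach each detection to the
    track whose predicted position it falls closest to."""
    assignments = {}
    for j, det in enumerate(detections):
        dists = [np.linalg.norm(predict(t) - det) for t in tracks]
        i = int(np.argmin(dists))
        if dists[i] <= max_dist:
            assignments[j] = i
    return assignments

# One track moving +1 in x per frame, one stationary track.
tracks = [[np.array([0.0, 0.0]), np.array([1.0, 0.0])],
          [np.array([5.0, 5.0]), np.array([5.0, 5.0])]]
detections = [np.array([5.1, 5.0]), np.array([2.0, 0.1])]
matches = associate(tracks, detections)
print(matches)  # → {0: 1, 1: 0}
```

Prediction also helps with occlusion: even when a blob is hidden in one camera, its expected bearing can still contribute to the swarm’s interpretation.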
Summary
• Swarms of camera-equipped robots can collaborate to track and model objects in space.
• The swarm’s results are more concrete than any individual’s observations.
• Observation is passive and uses relatively few resources (weight, energy, money).