Robot Vision: Multi-sensor Reconnaissance
Overview
• An individual robot can develop an interpretation about its environment.
• Groups of robots can combine their resources to develop an improved, collective interpretation.
• Such a “swarm” relies on centralized processing to refine the collective interpretation.
• The swarm has more insight than solitary individuals.
Single-sensor Vision
• Our setup involves numerous robots, each with a single video camera.
• What can a single-camera robot see?
• A background image may be used to establish a baseline of irrelevant imagery.
• Anything that differs from the background is considered relevant – background subtraction.
(Figure: camera frame minus background image equals foreground.)
Making Sense of Pixel Data
• Background subtraction and classical image processing algorithms can mark every pixel in every frame as foreground (interesting) or background (uninteresting).
• Conventional techniques can clean noise.
• Regions of contiguous foreground pixels are likely to constitute a single object: a “blob.”
• This assumption is false when one object occludes another.
• Later we will see a way around this.
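The pipeline above – subtract the background, clean the noise, group contiguous foreground pixels into blobs – can be sketched as follows. This is a minimal illustration using NumPy and SciPy; the threshold value and the `find_blobs` helper are hypothetical choices, not the system’s actual parameters.

```python
import numpy as np
from scipy import ndimage

def find_blobs(frame, background, threshold=30):
    """Mark pixels that differ from the background, clean noise,
    then group contiguous foreground pixels into labeled blobs."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    foreground = diff > threshold                    # per-pixel fg/bg mask
    foreground = ndimage.binary_opening(foreground)  # remove speckle noise
    labels, n_blobs = ndimage.label(foreground)      # connected components
    return labels, n_blobs

# Toy 8-bit grayscale frames: one bright square appears against a dark scene.
background = np.zeros((20, 20), dtype=np.uint8)
frame = background.copy()
frame[5:10, 5:10] = 200
labels, n = find_blobs(frame, background)
print(n)  # → 1 (a single blob)
```

Note that two overlapping squares would be labeled as a single blob – exactly the occlusion problem the slide mentions.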
Developing the Individual’s Interpretation
• Pixel-space blobs have measurable properties: color, shape, and location.
• Geometric properties (location, size) are relative to the camera’s perspective.
• A single robot can provide a list of objects with a bearing to each, but has no depth perception.
• Each robot’s interpretation may be combined to form a much stronger interpretation for the entire swarm.
Developing the Swarm’s Interpretation
• Each robot has a list of bearings to potential objects. This information can be visualized as rays originating from the robot’s location.
• The intersections of these rays represent potential objects in 3D space.
• Many intersections are spurious – two rays can cross at a point where no object actually exists.
• Many intersections conflict with others – each ray can only correspond to one object.
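Intersecting two bearing rays can be posed as a small least-squares problem. The sketch below works in the 2D ground plane for simplicity (in 3D the rays are generally skew, and the midpoint of closest approach plays the same role); the function name and robot positions are illustrative.

```python
import numpy as np

def ray_intersection(p1, d1, p2, d2):
    """Point nearest to two bearing rays: robot positions p1, p2 and
    unit bearing directions d1, d2. Solves p1 + t1*d1 ≈ p2 + t2*d2."""
    A = np.column_stack([d1, -d2])
    b = p2 - p1
    (t1, t2), *_ = np.linalg.lstsq(A, b, rcond=None)
    # Midpoint of the closest-approach points on the two rays
    return 0.5 * ((p1 + t1 * d1) + (p2 + t2 * d2))

# Two robots sight the same object from different positions.
p = ray_intersection(np.array([0.0, 0.0]), np.array([1.0, 1.0]) / np.sqrt(2),
                     np.array([4.0, 0.0]), np.array([-1.0, 1.0]) / np.sqrt(2))
print(p)  # → [2. 2.]
```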
Culling False Objects
• Algorithm: group compatible locations together. This yields disjoint sets of intersections that can coexist.
• The set with the most supporting evidence wins.
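One simple way to realize “each ray can only correspond to one object” is a greedy approximation of the grouping above: accept the best-supported candidates first and discard any candidate that reuses an already-explained ray. The set-of-ray-ids representation here is a hypothetical simplification of the full algorithm.

```python
def cull_intersections(candidates):
    """Greedy culling: each candidate intersection is the set of ray ids
    that support it. A ray can explain only one object, so keep the
    best-supported candidates whose rays don't conflict."""
    used_rays = set()
    accepted = []
    for rays in sorted(candidates, key=len, reverse=True):
        if not rays & used_rays:   # compatible with the choices so far
            accepted.append(rays)
            used_rays |= rays
    return accepted

# {1,2,3} has the most support; {1,2} and {3,4} conflict with it.
result = cull_intersections([{1, 2}, {1, 2, 3}, {3, 4}, {5, 6}])
print(result)  # → [{1, 2, 3}, {5, 6}]
```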
Combining the Individual and Swarm Interpretations
• At this point we have a set of objects with 3D locations.
• Individual robots can provide silhouettes of the objects.
• This information may be combined to create a 3D shape.
• Incorporating past history can strengthen our conclusions.
3D Hulls
• Each camera contributes a silhouette of an object, and a ray on which the silhouette lies.
• Projecting the silhouette along the ray forms a “cone”.
• The intersection of these cones carves a 3D solid; imagine pushing cookie cutters through space.
• The solid is guaranteed to enclose the true shape, but will be convex, i.e., it ignores indentations.
• Such an upper bound is termed a hull.
• Hulls are typically represented as a mesh of triangles.
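The cookie-cutter intuition can be sketched as voxel carving: keep only the 3D points whose projection into every camera lands inside that camera’s silhouette. The orthographic `project` functions and the grid below are toy stand-ins for real camera calibration, not the system’s geometry.

```python
import numpy as np

def carve_hull(grid_pts, cameras):
    """Visual-hull sketch by voxel carving. `cameras` is a list of
    (project_fn, silhouette_mask) pairs; a point survives only if it
    projects inside every silhouette."""
    keep = np.ones(len(grid_pts), dtype=bool)
    for project, silhouette in cameras:
        for i, pt in enumerate(grid_pts):
            u, v = project(pt)
            inside = (0 <= u < silhouette.shape[1] and
                      0 <= v < silhouette.shape[0] and
                      silhouette[v, u])
            keep[i] &= inside
    return grid_pts[keep]

# A 5x5x5 grid of candidate voxels and a 3x3 square silhouette.
grid = np.array([(x, y, z) for x in range(5)
                           for y in range(5)
                           for z in range(5)])
sil = np.zeros((5, 5), dtype=bool)
sil[1:4, 1:4] = True
cams = [(lambda p: (p[0], p[1]), sil),   # camera looking down the z axis
        (lambda p: (p[1], p[2]), sil)]   # camera looking down the x axis
hull = carve_hull(grid, cams)
print(len(hull))  # → 27 voxels survive the two cookie-cutter cuts
```

Real systems carve a much finer grid (or intersect the cones analytically) and then extract the triangle mesh from the surviving voxels.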
Applications of 3D Hulls
• The 2D silhouettes can be “painted” on the mesh.
• The solid can be rendered from any angle.
• 3D shape may be used to classify objects – threat assessment, for example.
• Meshes may be recorded for future use…
Establishing Object Tracks
• Object records from successive frames can be combined to establish a log of known objects.
• These “object tracks” can aid future processing, establishing a positive feedback loop:
– Distinguishing between one large object and two objects close to one another.
– Past motion can predict where objects will be located, minimizing the occlusion problem mentioned earlier.
– Past hulls can predict how an object’s silhouette will appear in each camera.
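The motion-prediction idea above can be sketched with a deliberately simple constant-velocity model plus nearest-neighbor association; the function names, distance threshold, and track format are illustrative assumptions, not the system’s tracker.

```python
import numpy as np

def predict(track):
    """Constant-velocity guess for a track's next position,
    extrapolated from its last two observed positions."""
    prev, last = track[-2], track[-1]
    return 2 * last - prev

def associate(tracks, detections, max_dist=2.0):
    """Nearest-neighbor association: attach each detection to the
    track whose predicted position it falls closest to."""
    assignments = {}
    for j, det in enumerate(detections):
        dists = [np.linalg.norm(predict(t) - det) for t in tracks]
        i = int(np.argmin(dists))
        if dists[i] <= max_dist:
            assignments[j] = i
    return assignments

# One track moving +1 in x per frame, one stationary track.
tracks = [[np.array([0.0, 0.0]), np.array([1.0, 0.0])],
          [np.array([5.0, 5.0]), np.array([5.0, 5.0])]]
detections = [np.array([5.1, 5.0]), np.array([2.0, 0.1])]
matches = associate(tracks, detections)
print(matches)  # → {0: 1, 1: 0}
```

Prediction also helps with occlusion: even when a blob is hidden in one camera, its expected bearing can still contribute to the swarm’s interpretation.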
Summary
• Swarms of camera-equipped robots can collaborate to track and model objects in space.
• The swarm’s results are more concrete than any individual’s observations.
• Observation is passive and uses relatively few resources (weight, energy, money).