spie conference v3.0
TRANSCRIPT
Computation and Design of Autonomous Intelligent Systems
Robert L. Fry
Presentation to the SPIE Defense and Security Conference
Orlando, FL
March 17, 2008
This work was supported through AFOSR contract FA9550-06-1-0297 under Dr. Robert Bonneau
Outline
• Computational Theory of Autonomous Intelligent Systems
• Engineering Intelligent Systems
• Neural Computation and other Sample Applications
• Discussion and Next Steps
Computational Theory of Autonomous Intelligent Systems
Basic Idea
To engineer an intelligent system, one must have a working definition for what intelligence is. The following is suggested:
“An intelligent system acquires information and uses it to make decisions in service to its computational goal”
If we can computationally quantify each of the
above highlighted terms, then we might have a
formal basis for engineering intelligent systems.
Definition Flowdown
To acquire information, a system must pose a question to its environment
To make decisions, a system must answer a question of what to do.
What is a Question?
If we can come up with a working
definition of questions, then we are in
business.
“What Is a Question?”
“A question is defined by the set of
all subjectively possible answers”
Richard Cox, Physicist
Johns Hopkins University
(1898-1991)
Dr. Richard Cox¹ of the Johns Hopkins University Physics Department developed a joint algebra of questions and assertions
¹ “Of Inference and Inquiry,” Proc. First Maximum Entropy Workshop, MIT, 1978
His extension of logic mathematically captures how
computation is performed within the subjective
frame of a system.
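Boolean Algebra of Questions and Assertions
Algebra of Assertions (Logical Basis of Probability Theory)
$\sim\sim a = a$
$a \wedge a = a$;  $a \vee a = a$
$a \wedge b = b \wedge a$;  $a \vee b = b \vee a$
$\sim(a \wedge b) = \sim a \vee \sim b$;  $\sim(a \vee b) = \sim a \wedge \sim b$
$a \wedge b \wedge c = a \wedge (b \wedge c) = (a \wedge b) \wedge c$;  $a \vee b \vee c = a \vee (b \vee c) = (a \vee b) \vee c$
$(a \wedge b) \vee c = (a \vee c) \wedge (b \vee c)$;  $(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c)$
$(a \wedge b) \vee b = b$;  $(a \vee b) \wedge b = b$
$(a \wedge \sim a) \vee b = b$;  $(a \vee \sim a) \wedge b = b$
$a \wedge \sim a \wedge b = a \wedge \sim a$;  $a \vee \sim a \vee b = a \vee \sim a$
Algebra of Questions (Logical Basis of Information/Control Theories)
$\sim\sim A = A$
$A \wedge A = A$;  $A \vee A = A$
$A \wedge B = B \wedge A$;  $A \vee B = B \vee A$
$\sim(A \wedge B) = \sim A \vee \sim B$;  $\sim(A \vee B) = \sim A \wedge \sim B$
$A \wedge B \wedge C = A \wedge (B \wedge C) = (A \wedge B) \wedge C$;  $A \vee B \vee C = A \vee (B \vee C) = (A \vee B) \vee C$
$(A \wedge B) \vee C = (A \vee C) \wedge (B \vee C)$;  $(A \vee B) \wedge C = (A \wedge C) \vee (B \wedge C)$
$(A \wedge B) \vee B = B$;  $(A \vee B) \wedge B = B$
$(A \wedge \sim A) \vee B = B$;  $(A \vee \sim A) \wedge B = B$
$A \wedge \sim A \wedge B = A \wedge \sim A$;  $A \vee \sim A \vee B = A \vee \sim A$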
• Logical questions quantify the computational rules of intelligence and autonomy
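The assertion identities on this slide are ordinary Boolean identities, so they can be checked by brute force over truth assignments. The sketch below does this for the lower-case (assertion) algebra only. The helper name `holds` and the dictionary layout are illustrative choices, and the code does not model questions, which the slide treats as a second algebra of the same form.

```python
from itertools import product

def holds(identity, n_vars):
    """True when `identity` is satisfied by every truth assignment."""
    return all(identity(*values) for values in product((False, True), repeat=n_vars))

# Each entry is (identity, number of variables); "^" is AND, "v" is OR, "~" is NOT.
IDENTITIES = {
    "~~a = a": (lambda a: (not (not a)) == a, 1),
    "a^a = a": (lambda a: (a and a) == a, 1),
    "ava = a": (lambda a: (a or a) == a, 1),
    "a^b = b^a": (lambda a, b: (a and b) == (b and a), 2),
    "avb = bva": (lambda a, b: (a or b) == (b or a), 2),
    "~(a^b) = ~a v ~b": (lambda a, b: (not (a and b)) == ((not a) or (not b)), 2),
    "~(avb) = ~a ^ ~b": (lambda a, b: (not (a or b)) == ((not a) and (not b)), 2),
    "(a^b)^c = a^(b^c)": (lambda a, b, c: ((a and b) and c) == (a and (b and c)), 3),
    "(avb)vc = av(bvc)": (lambda a, b, c: ((a or b) or c) == (a or (b or c)), 3),
    "(a^b)vc = (avc)^(bvc)": (lambda a, b, c: ((a and b) or c) == ((a or c) and (b or c)), 3),
    "(avb)^c = (a^c)v(b^c)": (lambda a, b, c: ((a or b) and c) == ((a and c) or (b and c)), 3),
    "(a^b)vb = b": (lambda a, b: ((a and b) or b) == b, 2),
    "(avb)^b = b": (lambda a, b: ((a or b) and b) == b, 2),
    "(a^~a)vb = b": (lambda a, b: ((a and (not a)) or b) == b, 2),
    "(av~a)^b = b": (lambda a, b: ((a or (not a)) and b) == b, 2),
    "a^~a^b = a^~a": (lambda a, b: (a and (not a) and b) == (a and (not a)), 2),
    "av~avb = av~a": (lambda a, b: (a or (not a) or b) == (a or (not a)), 2),
}

results = {name: holds(identity, n) for name, (identity, n) in IDENTITIES.items()}
all_hold = all(results.values())
print(all_hold)
```

Each identity is paired with its dual, obtained by exchanging AND and OR, which is the same pairing the two columns of the slide appear to show.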
Dual Boolean Algebras
Conventional Logic and Logic of Questions
Let upper case italicized letters like A denote questions, e.g., $A \equiv$ “Is it an apple or not?” $\equiv \{a, \sim a\}$.
Probability and Entropy
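Property | Probability Theory | Bearing Theory (Entropy)
Conjunctive Rule | $p(a \vee b \mid c) = p(a \mid c) + p(b \mid c) - p(a \wedge b \mid c)$ | $b(A \vee B \mid C) = b(A \mid C) + b(B \mid C) - b(A \wedge B \mid C)$
Normalization | $p(a \mid c) + p(\sim a \mid c) = 1$ | $b(A \mid C) + b(\sim A \mid C) = 1$
Marginalization Rule | $p(a \wedge b \mid c) + p(a \wedge \sim b \mid c) = p(a \mid c)$ | $b(A \wedge B \mid C) + b(A \wedge \sim B \mid C) = b(A \mid C)$
Bayes’ Theorem | $p(a \wedge b \mid c) = p(b \mid c)\,p(a \mid c \wedge b)$ | $b(A \wedge B \mid C) = b(B \mid C)\,b(A \mid C \wedge B)$
• Probability and Entropy (called Bearing) are derivable from logic and consistency
• Respectively, they are the unique measures of subjective knowledge and uncertainty as computed within a local system frame
Sample identities in probability and information theory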
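The probability column of these identities can be checked numerically on any small joint distribution. The sketch below uses a hypothetical joint distribution over two binary assertions a and b (the numbers are made up for illustration). It also evaluates the familiar Shannon identity I(A;B) = H(A) + H(B) − H(A,B), which has the same additive form as the first row. Reading bearing as Shannon entropy in this way is an assumption of the sketch, not something the slide spells out.

```python
from math import isclose, log2

# Hypothetical joint distribution p(a, b | c) over two binary assertions.
joint = {(True, True): 0.3, (True, False): 0.2,
         (False, True): 0.1, (False, False): 0.4}

def prob(event):
    """Probability of the set of outcomes (a, b) for which event(a, b) is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_a_and_b = prob(lambda a, b: a and b)
p_a_or_b = prob(lambda a, b: a or b)
p_a_given_b = p_a_and_b / p_b

checks = {
    "sum rule": isclose(p_a_or_b, p_a + p_b - p_a_and_b),
    "normalization": isclose(p_a + prob(lambda a, b: not a), 1.0),
    "marginalization": isclose(p_a_and_b + prob(lambda a, b: a and not b), p_a),
    "product rule": isclose(p_a_and_b, p_b * p_a_given_b),
}

def entropy(probabilities):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Same additive form as the sum rule, written for entropies.
mutual_information = (entropy([p_a, 1 - p_a]) + entropy([p_b, 1 - p_b])
                      - entropy(joint.values()))
print(checks, mutual_information)
```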
A Simple “Intelligent” System
[Figure: protozoan-like creatures near a light source that marks possible food. Each creature has a detector with an optical field of view and cilia that produce linear motion; currents randomly perturb the creature's orientation.]
Information Space: Ask X: Detector is On or Off
Decision Space: Answer Y: Cilia turned On or Off
A protozoan-like system asks X and answers Y
There are two kinds of questions – those that can be asked and those that can be answered by a system.
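Logical Questions: Card-Guessing Game Example
Define the questions S and C:
$S \equiv$ “What Suit is the Card?” $\equiv \{\heartsuit, \diamondsuit, \clubsuit, \spadesuit\}$
$C \equiv$ “What Color is the Card?” $\equiv \{\text{red}, \text{black}\}$
Disjunction (Logical OR) provides the information asked by either:
$S \vee C \equiv \{\heartsuit, \diamondsuit, \clubsuit, \spadesuit, \text{red}, \text{black}\} = \{\text{red}, \text{black}\} \equiv C$
Conjunction (Logical AND) provides the information asked by both:
$S \wedge C \equiv \{\heartsuit \wedge \text{red}, \heartsuit \wedge \text{black}, \diamondsuit \wedge \text{red}, \diamondsuit \wedge \text{black}, \clubsuit \wedge \text{red}, \clubsuit \wedge \text{black}, \spadesuit \wedge \text{red}, \spadesuit \wedge \text{black}\} = \{\heartsuit, \diamondsuit, \clubsuit, \spadesuit\} \equiv S$
For systems acquiring information X and making decisions Y, $X \vee Y$ is the actionable information of the system.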
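One way to make the card example concrete is to model each answer as the set of suits consistent with it, and each question as the collection of all assertions that would answer it. That "down-set" representation is an assumption borrowed from order-theoretic treatments of Cox's algebra; it is not stated on the slide. Under it, disjunction of questions is a set union and conjunction is a set intersection, and the two reductions on the slide follow directly.

```python
from itertools import combinations

SUITS = ("hearts", "diamonds", "clubs", "spades")

def down_set(question):
    """All non-empty assertions that imply some answer of the question."""
    closed = set()
    for answer in question:
        items = sorted(answer)
        for size in range(1, len(items) + 1):
            closed.update(frozenset(c) for c in combinations(items, size))
    return closed

# S: "What suit is the card?"  C: "What color is the card?"
S = [frozenset({suit}) for suit in SUITS]
C = [frozenset({"hearts", "diamonds"}),   # red
     frozenset({"clubs", "spades"})]      # black

def disjunction(q1, q2):
    """Question answered by anything that answers either q1 or q2."""
    return down_set(q1) | down_set(q2)

def conjunction(q1, q2):
    """Question answered only by what answers both q1 and q2."""
    return down_set(q1) & down_set(q2)

s_or_c_is_c = disjunction(S, C) == down_set(C)
s_and_c_is_s = conjunction(S, C) == down_set(S)
print(s_or_c_is_c, s_and_c_is_s)
```

Because knowing the suit always settles the color, every assertion that answers S also answers C, which is why the union collapses to C and the intersection to S.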
Intelligence and Autonomy
Information Space X: Detector is On or Off
Decision Space Y: Cilia turned On or Off
Possible Behaviors
Mapping | Behavior
x=0 → y=0, x=1 → y=0 | Never Do Anything
x=0 → y=1, x=1 → y=1 | Always Do Something
x=0 → y=1, x=1 → y=0 | Always Do Wrong Thing
x=0 → y=0, x=1 → y=1 | Always Do Right Thing ($X \vee Y = X$)
Four computational mappings comprise the possible system behaviors. The last matches decisions to available information, and so $X \vee Y = X = Y$.
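The four behaviors can be compared by the mutual information I(X;Y) each mapping carries from detector state to cilia state. The sketch below assumes the detector is On or Off with equal probability, which the slide does not specify, and the function name is illustrative.

```python
from math import log2

def mutual_information(mapping, p_x=(0.5, 0.5)):
    """I(X;Y) in bits for a deterministic mapping y = mapping[x]."""
    joint = {}
    for x, px in enumerate(p_x):
        key = (x, mapping[x])
        joint[key] = joint.get(key, 0.0) + px
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(p * log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)

# mapping[x] is the decision y taken when the detector reads x.
behaviors = {
    "never do anything": (0, 0),
    "always do something": (1, 1),
    "always do wrong thing": (1, 0),
    "always do right thing": (0, 1),
}

info = {name: mutual_information(mapping) for name, mapping in behaviors.items()}
for name, bits in info.items():
    print(f"{name}: {bits:.1f} bit")
```

The two constant behaviors carry no information from X to Y. The last two both carry one full bit, so mutual information alone does not separate them; in this sketch it is the system's goal that singles out the matched mapping as the right one.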
Engineering Intelligent Systems
Intelligence and Computation
[Figure: three overlapping domains: Thermodynamics, Information Theory, and Intelligence (Computation and Questions).]
The theory exploits and extends concepts and tools from thermodynamics and information theory and in turn enriches them.
Information Theory
• Entropy
• Source and channel coding
• Shannon’s dual problems
• *Dual-matching concept
Intelligence (Computation and Questions)
• Entropy
• Maximum entropy principle
• Shannon’s dual problems
• Carnot cycle
• Dual-matching
Thermodynamics
• *Carnot cycle
• Maximum entropy principle
• Entropy
Critical Domain Concepts
[Figure: the same three domains, with the Carnot Cycle and Dual-Matching highlighted.]
The Carnot Cycle from thermodynamics and Dual-Matching from information theory are especially important to intelligent systems.
Information Theory and Intelligence
Information Theory: Source X, Receiver Y
• Source X: answers the question of what to transmit
• Receiver Y: question answered as to what is received
Intelligence Theory: Decisions Y, Acquire X
• Acquire X: question posed to the environment is answered
• Decisions Y: question answered as to what to do
• Information theory and intelligence theory are functional duals
• The latter describes how to make decisions under uncertainty
• Methods and constructs in one domain apply to the other
Dual-Matching Concept
• Just as the Carnot cycle from thermodynamics applies to intelligent systems, so does the recent concept² of Dual-Matching from information theory
• Dual-matching provides the quantitative basis for the design and operation of efficient intelligent systems (their Carnot cycles)
• Dual-matching requires simultaneously solving Shannon’s dual problems:
– Minimize information required
– Maximize what can be done with acquired information
² M. Gastpar, To Code or Not to Code, Ph.D. dissertation, Thèse EPFL no. 2687, Ecole Polytechnique Fédérale de Lausanne, 2002. This concept is revolutionizing information theory.
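In standard information-theoretic notation, the two problems usually attributed to Shannon can be sketched as below. Mapping the two bullets above onto the rate-distortion function and the channel capacity is an interpretation, and the distortion measure $d$ and distortion bound $D$ are symbols introduced here only for illustration.

```latex
% Source coding (rate-distortion): least information needed for a given fidelity.
R(D) = \min_{p(y \mid x)\,:\; E[d(x,y)] \le D} I(X;Y)

% Channel coding (capacity): most information a given channel can carry.
C = \max_{p(x)} I(X;Y)
```

Both problems optimize the same quantity, the mutual information $I(X;Y)$, in opposite directions and over different distributions, which is what makes solving them simultaneously a matching condition.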
Computation and Carnot Cycles
• Logic dictates that there are two kinds of computation a system can perform
• These correspond to the two Carnot cycles of thermodynamics
Cycle with area = useful work produced (Carnot engine, internal combustion engine, communication systems):
1. Make Decision
2. Erase Information
3. Acquire Information
4. Store Information
Cycle with area = energy required to operate (Carnot refrigerator, computer control, intelligent systems):
1. Acquire Information
2. Store Information
3. Make Decision
4. Erase Information
Sample Applications
Neural Computation
Using the described theory and methods, one
can perform a top-down design of pyramidal
neurons as found in brains. This is a simple,
elegant, and informative example.
What Do Neurons Ask and Answer?
[Figure: a neuron with $10^4$ synaptic inputs $X_1, X_2, \ldots, X_{10000}$, each $X_i = \{0, 1\}$; the soma integrates the inputs and makes decisions; the axon comprises the single output $Y = \{0, 1\}$.]
• There are $2^{10000}$ possible answers (microstates)!
• The neuron poses the question $X = X_1 \wedge X_2 \wedge \cdots \wedge X_{10000}$
• The neuron simply answers $Y$
Assume that matching actionable information to decisions is its goal: $X \vee Y = Y$.
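Principle of Maximum Entropy
• The Principle of Maximum Entropy, basic to thermodynamics, dictates the neural probability distribution:
$$J = -\sum_{y \in Y}\sum_{\mathbf{x} \in X} p(\mathbf{x}, y \mid a)\,\ln p(\mathbf{x}, y \mid a) + \boldsymbol{\lambda}^{T}\sum_{y \in Y}\sum_{\mathbf{x} \in X} p(\mathbf{x}, y \mid a)\,[\mathbf{x}y - \langle \mathbf{x}y \rangle] - \mu\sum_{y \in Y}\sum_{\mathbf{x} \in X} p(\mathbf{x}, y \mid a)\,[y - \langle y \rangle] - \lambda_0\Big[\sum_{y \in Y}\sum_{\mathbf{x} \in X} p(\mathbf{x}, y \mid a) - 1\Big]$$
Resulting Max-Ent Distribution
$$p(\mathbf{x}, y \mid a) = \frac{\exp(\boldsymbol{\lambda}^{T}\mathbf{x}\,y - \mu y)}{Z}$$
Partition (normalization) Function
$$Z = \sum_{y \in Y}\sum_{\mathbf{x} \in X} \exp(\boldsymbol{\lambda}^{T}\mathbf{x}\,y - \mu y)$$
• The Lagrange multiplier $\boldsymbol{\lambda}$ gives the coupling strengths and the scalar $\mu$ is the somatic decision threshold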
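For a handful of inputs the maximum-entropy distribution can be enumerated exactly. The sketch below assumes the exponential form exp(λᵀx·y − μy)/Z with binary inputs and output; the gain and threshold values are hypothetical. It checks that the implied firing probability p(y=1|x) is a logistic function of λᵀx − μ, and that Z equals 2^(n+1) when all multipliers are zero.

```python
from itertools import product
from math import exp, isclose

def maxent_neuron(lam, mu):
    """Return (p, Z) for p(x, y) = exp((lam.x - mu) * y) / Z over binary x, y."""
    def weight(x, y):
        return exp((sum(l * xi for l, xi in zip(lam, x)) - mu) * y)
    states = [(x, y) for x in product((0, 1), repeat=len(lam)) for y in (0, 1)]
    Z = sum(weight(x, y) for x, y in states)
    return {(x, y): weight(x, y) / Z for x, y in states}, Z

def firing_probability(p, x):
    """p(y = 1 | x) obtained from the joint distribution."""
    return p[(x, 1)] / (p[(x, 0)] + p[(x, 1)])

def logistic(u):
    return 1.0 / (1.0 + exp(-u))

# Hypothetical coupling strengths and decision threshold, for illustration only.
lam = [0.8, -0.3, 0.5]
mu = 0.4
p, Z = maxent_neuron(lam, mu)

matches_logistic = all(
    isclose(firing_probability(p, x),
            logistic(sum(l * xi for l, xi in zip(lam, x)) - mu))
    for x in product((0, 1), repeat=len(lam))
)

# With every multiplier at zero, all 2**(n+1) microstates are equally weighted.
_, Z_flat = maxent_neuron([0.0] * 3, 0.0)
print(matches_logistic, Z_flat)
```

The logistic form means the neuron is more likely to fire than not exactly when λᵀx exceeds μ, which is the sense in which μ acts as a decision threshold here.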
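Dual-Matching Adaptation
[Figure: neuron with inputs $X_1, X_2, \ldots, X_n$, per-input gains and delays, and output $Y$; Hebbian gating for (1) – (3).]
Three Hebbian learning equilibria result that can be realized using simple, biologically plausible algorithms:
1) Threshold Adaptation ($\mu$): the optimal decision threshold is the average somatic potential.
[Equation: update of the threshold $\mu(t)$ toward the somatic potential $\boldsymbol{\lambda}^{T}\mathbf{x}$.]
2) Gain Adaptation ($\boldsymbol{\lambda}$): the optimal gain vector is the largest eigenvector of the input covariance matrix $\mathbf{R}$:
$$\mathbf{R} = E\{[\mathbf{x} - \langle \mathbf{x} \mid y = 1 \rangle][\mathbf{x} - \langle \mathbf{x} \mid y = 1 \rangle]^{T}\}$$
[Equation: Hebbian update of the gain vector $\boldsymbol{\lambda}(t)$ driven by the input $\mathbf{x}(t)$.]
3) Delay Equalization ($\boldsymbol{\tau}$): elements of the optimal time delay vector must satisfy “momentum” equalization:
$$\frac{d\tau_i}{dt} = \lambda_i\,y(t)\,\frac{dx_i(t - \tau_i)}{dt} = 0$$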
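The gain equilibrium can be illustrated by computing the largest eigenvector of a covariance matrix directly. The sketch below uses plain power iteration, which is a stand-in chosen for brevity and not the biologically plausible adaptation rule referred to on the slide. The six input vectors are hypothetical samples taken to be those observed when the neuron fires (y = 1); their third bit never changes, so it carries no information.

```python
def covariance(samples):
    """Population covariance matrix of a list of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
             for j in range(d)] for i in range(d)]

def largest_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    v = [1.0] * len(matrix)  # starting guess, assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Hypothetical inputs seen while firing: bits 1 and 2 co-vary, bit 3 is constant.
firing_inputs = [(1, 1, 1), (0, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 1), (0, 1, 1)]

R = covariance(firing_inputs)
gain = largest_eigenvector(R)
print([round(g, 4) for g in gain])
```

In this toy data set the gain on the constant third input comes out as zero, while the two co-varying inputs share equal gains.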
Neural Carnot Cycle
[Figure: temperature versus entropy diagram of the neural cycle, with temperature levels T=1 and T=0.2, entropies $H(X) = b(X \mid A)$ and $I(X;Y) = b(X \vee Y \mid A)$, the partition function $Z$ changing between $2^{n}$ and $2^{n+1}$, and 0.9 bit/decision.]
1. Acquire information through synapses
2. Information stored in soma
3. Decision made by soma
4. Erase information during the refractory period
A single neuron operates as a Carnot refrigerator, as do all intelligent systems. It has ~90% Carnot efficiency.
Simulation of Neural
Computational Model
Modeling and Simulation Training Set
Training Vector Bit
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
2 0 1 0 0 0 1 0 0 0 0 0 1 1 1 0 1 1 1 0 1
3 1 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1
4 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
5 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
6 1 1 1 1 1 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1
7 0 1 0 1 0 1 1 0 1 0 0 0 0 1 0 1 1 0 0 1
8 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
9 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
10 1 0 0 1 0 0 0 1 1 0 1 1 1 1 0 1 1 0 0 1
11 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 1
12 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
13 1 1 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 0 1
14 1 0 0 0 1 1 1 0 1 0 1 1 0 0 1 1 1 1 0 1
15 0 0 0 0 1 1 1 1 0 0 1 0 1 0 1 0 0 1 1 1
16 1 1 1 1 0 1 1 0 1 0 0 0 1 0 0 1 1 0 0 1
17 0 1 1 0 1 1 1 0 0 0 1 1 1 1 0 1 0 0 1 1
18 1 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 1 1
19 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1
20 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 1
Hamming Distance Between
Vectors Conveys Structure
Code Number
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
2 7 0 11 8 7 12 6 8 7 8 7 8 7 9 13 8 7 9 9 10
3 10 11 0 9 10 7 9 9 10 9 8 9 10 12 10 9 14 10 6 5
4 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
5 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
6 7 12 7 6 7 0 8 6 7 10 7 6 7 9 13 6 9 11 9 6
7 3 6 9 4 3 8 0 4 3 8 5 4 3 9 17 4 9 11 11 6
8 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
9 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
10 7 8 9 8 7 10 8 8 7 0 11 8 7 9 13 8 11 9 7 8
11 6 7 8 7 6 7 5 7 6 11 0 7 6 10 14 7 8 8 8 7
12 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
13 0 7 10 1 0 7 3 1 0 7 6 1 0 10 20 1 8 10 12 7
14 10 9 12 9 10 9 9 9 10 9 10 9 10 0 10 9 10 8 10 9
15 20 13 10 19 20 13 17 19 20 13 14 19 20 10 0 19 12 10 8 13
16 1 8 9 0 1 6 4 0 1 8 7 0 1 9 19 0 9 11 11 6
17 8 7 14 9 8 9 9 9 8 11 8 9 8 10 12 9 0 8 14 11
18 10 9 10 11 10 11 11 11 10 9 8 11 10 8 10 11 8 0 10 11
19 12 9 6 11 12 9 11 11 12 7 8 11 12 10 8 11 14 10 0 9
20 7 10 5 6 7 6 6 6 7 8 7 6 7 9 13 6 11 11 9 0
Figure 5.4: Hamming distances between the 20 codes listed in Table 5.1.
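The entries of the distance table can be reproduced from the training set by counting the bit positions in which two vectors differ. The sketch below transcribes training vectors 1, 2, and 4 from the training-set table above as bit strings; the function name is illustrative.

```python
def hamming(u, v):
    """Number of positions at which two equal-length bit strings differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))

# Training vectors 1, 2, and 4, transcribed from the training-set table.
v1 = "11110110100011011001"
v2 = "01000100000111011101"
v4 = "11110110100010011001"

d12 = hamming(v1, v2)
d14 = hamming(v1, v4)
print(d12, d14)  # 7 1
```

These two values agree with the entries for code pairs (1, 2) and (1, 4) in the distance table.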
Simulation Output
[Figure: Synaptic Gains (−0.4 to 0.4) versus training vector bit (1 to 20).]
Gains on non-informative inputs are driven to zero.
[Figure: Vectors Inducing Firing (0 or 1) versus training vector index (1 to 20).]
The neuron learns to fire on almost exactly half of the training vectors.
A Geometric View of Neural Computation
• Input information space contains >$2^{10000}$ codes
• Neuron defines a hyperplane decision surface $\boldsymbol{\lambda}^{T}\mathbf{x} - \mu = 0$
• Hyperplane separates input space into two equally probable regions: H(Y) = 1 bit
– Fire! (y = 1)
– Do Not Fire! (y = 0)
Theory provides a detailed
explanation of how
pyramidal neurons compute
Weapon System Applications
Weapon Systems and Dual-Matching
• Efficient system operation requires continual matching of weapon system information and control spaces
• Fire Control Loop operates as an engine governor
– Minimize fuel consumption under varying loading conditions
• All weapon systems acquire information and make decisions with uncertainty
• The fire control loop of a missile system is a good example:
Dual-Matching Process
X = Where can the target be? (Informational uncertainty)
Y = Where can the weapon go? (System control capacity)
Weapon System Applications
• Algorithms Proposed or Under Development*
– *Object correlation
– Sensor management
– *Discrimination and track fusion
– *Weapon-Target assignment
– *Guidance Laws¹
• Weapon System Architectures
– Distributed fire control for swarming intelligence
– C2BMC
– Networked weapon systems have unique Boolean expressions
¹ An example of the derivation of a guidance law with uncertainties in tracking, discrimination, and guidance is given in the backup slides, and a paper is available on request.
Discussion and Next Steps
• Continue modeling biological computation, emphasizing the development of cortical systems
• Begin wider formulations of weapon systems problems with a focus on ballistic missile defense
– Decision-making under uncertainty is the hallmark of the BMD problem
• Military and data networks have simple formulations
– Networks have natural functional decompositions lending themselves to global optimization
Main issues in going forward?
• Documenting work done so far - perhaps in book form
• Bridging the multidisciplinary boundaries of thermodynamics, information theory, and intelligence
– Only a modest understanding of each area seems sufficient
Acknowledgements
• This work was supported by AFOSR Project IONS (Information-Theoretic Optimization of Networked Systems)
• Project IONS is a 3-year joint SEG/Princeton project to develop an engineering framework for distributed intelligent systems
• My co-PI is Dr. Mung Chiang of Princeton, who has developed highly efficient information optimization methods based on Geometric Programming
Backup Slides
A Neuron is an Intelligent System
• Synapses X and axon Y are elementary questions
• Laboratory confirmation of predicted energy allocations
• Significant numbers of biological results and predictions
• Partition function and critical temperature determined
[Figure: neuron with ~$10^4$ dendritic inputs and 1 output; the soma decides and the axon Y is the output; annotated with H(Y) and I(X;Y).]
[Equation: objective $J$ combining the information term $\sum_{X,Y} p(\mathbf{x}, y)\,\log\frac{p(y \mid \mathbf{x})}{p(y)}$ with Lagrange terms in $E\{x_i y\}$ for $i = 1, \ldots, n$, in $E\{y\}$, and in $\lambda_0$.]
Midcourse BMD Example
• Perform a distributed computation to achieve global optimization
• Dynamically match actionable information X and decision space Y
[Figure: distributed C2BMC dual-matching algorithms linking sensing elements and kinematic elements.]
Question X “Where is the target?”
• Sensing elements: radars, seekers
• Sensor measurements eliminate possible target locations
Question Y “Where should I go?”
• Kinematic elements: boosters, KVs
• Guidance decisions eliminate possible places the missile can go
BMD System Architectures
• Weapon architectures can be quantified
• Basis for comparing architectures and making trades
• Decomposition allows representation of entire network
• Formulation applies to any kind of weapon or warfare:
Architectural Class | Basic Objective Function | BMD Architecture
I. 1-on-1 | $I(X;Y)$ | Single KV vs. Single RV (EKV)
II. M-on-1 | $I(X;Y_1,Y_2,\ldots,Y_M)$ | Multiple KVs vs. Single RV (MKV)
III. 1-on-N | $I(X_1,X_2,\ldots,X_N;Y)$ | Single KV vs. MIRVed Threat
IV. M-on-N | $I(X_1,X_2,\ldots,X_N;Y_1,Y_2,\ldots,Y_M)$ | Multiple KVs vs. MIRVed Threat
Bullets = KVs = Infantry = Missiles
Guidance with Targeting and Guidance Uncertainties
We will now determine the exact analytical solution to a simple 1-dimensional problem to demonstrate some of the described concepts.
Tracks will never be perfect, but this does not change how the problem is solved.
[Figure: Target Localization Space X with candidate target positions x0 and x1. This is where the target can be.]
Target localization with perfect tracks at x0 and x1 and respective discrimination probabilities p0 and p1 (p0+p1=1):
$$p(x) = p_0\,\delta(x - x_0) + p_1\,\delta(x - x_1)$$
MKV Architectures
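[Figure: three architectures connecting information spaces $X_1, X_2, \ldots, X_m$ to decision spaces $Y_1, Y_2, \ldots, Y_m$.]
a) Federated architecture
b) Swarming architecture (only nearest-neighbor communication), with links $Z_1, Z_2, \ldots, Z_m$ between neighbors
c) Centralized control architecture, with links $Z_1, Z_2, \ldots, Z_m$ to a central node $X_C$
[Figure: Weapon System Capacity versus Cost ($) for architectures A1, A2, and A3, with capacities $I_1(X;Y)$, $I_2(X;Y)$, and $I_3(X;Y)$.]
Potential to trade system architecture vs. weapon system capacity vs. cost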
Guidance with Targeting and Guidance Uncertainties
Optimal guidance solutions can be determined in the presence of tracking, discrimination, and guidance uncertainties.
[Figure: Target Localization Space X with candidate target positions x0 and x1.]
Target localization with perfect tracks at x0 and x1 and respective discrimination probabilities p0 and p1 (p0+p1=1):
$$p(x) = p_0\,\delta(x - x_0) + p_1\,\delta(x - x_1)$$
Tracks will never be perfect, but this does not change how the problem is solved.
Guidance with Targeting and Guidance Uncertainties
The kinematic space of the system is the reachable space of the interceptor after it has selected some point as its next guide point. The missile always has to be going somewhere!
[Figure: Missile Kinematic Space Y (after maneuver) relative to x0 and x1.]
The guidance algorithm objective is to determine the optimal $\mu$ as its next guide point given guidance and informational uncertainties of p(y|x) and p(x).
Varying $\mu$ changes the kinematic space p(y|x) once the command is executed.
$$p(y \mid x) = \delta(y - x)\,e^{-\frac{(y - \mu)^2}{2\sigma^2}}$$
Guidance with Targeting and Guidance Uncertainties
The idea is to minimize the mutual information over $\mu$ subject to a probability-of-miss constraint (a “miss” is like a transmission bit error):
[Equation: $J(\mu, \lambda)$ combines the mutual information $I(X;Y \mid \mu)$ with a Lagrange term in $1 - p(y)$ that enforces the probability-of-miss constraint.]
where
$$I(X;Y) = \int_{x \in X} dx \int_{y \in Y} dy\; p(x)\,p(y \mid x)\,\log\frac{p(y \mid x)}{p(y)}$$
This guidance law statistically minimizes the fuel expenditure over the ensemble of such engagements with these uncertainties.
If there is no constraint, then $J \to 0$ and the missile guides halfway between the objects if both will remain in its kinematic space:
$$\mu^{*} = \frac{x_0 + x_1}{2}$$
Otherwise the missile should navigate to the probability centroid between the objects:
$$\mu^{*} = p_0 x_0 + p_1 x_1$$
Guidance with Targeting and Guidance Uncertainties
[Figure: Missile Kinematic Space Y (after maneuver) relative to x0 and x1, with the guide point at (x0+x1)/2.]
Initially assume that the missile is guiding halfway between the objects, (x0+x1)/2, implying that discrimination information is not used.
$$p(y \mid x) = \delta(y - x)\,e^{-\frac{(y - \mu)^2}{2\sigma^2}}$$
(Probability missile can get to y when target is at x)
Guidance with Targeting and Guidance Uncertainties
If there is no constraint, then $J \to 0$ and the missile continues guiding halfway between the objects:
$$\mu^{*} = \frac{x_0 + x_1}{2}$$
Example: $x_0 = 0$, $x_1 = 1$, $p_0 = 0.25$, $p_1 = 0.75$, $\sigma^2 = 1$
[Figure: Control Rate vs. Guide Point. Transacted Control (bits, 0 to 0.03) versus guide point mu (0 to 1). Annotations: “Targeting information ignored”; “Guide to most probable target. This minimizes probability of error but not fuel.”; “PIP”.]
$x_0 = 0$, $x_1 = 1$, $p_0 = p_1 = 0.5$
[Figure: six panels of Probability of Hit versus Guide Point mu (0 to 1), for $\sigma$ = 2.0, 1.0, 0.5, 0.4, 0.2, and 0.1.]
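The trend across the probability-of-hit panels can be explored with a few lines of code. The sketch below assumes that the probability of hit for guide point mu is the sum over the two candidate targets of p_i · exp(−(x_i − mu)² / (2σ²)), which is one reading of the target and kinematic models given earlier, and that the parameter labeling each panel is σ. Both are assumptions, as are the function names and the 0.01 grid spacing.

```python
from math import exp

def probability_of_hit(mu, sigma, targets):
    """Sum of p_i * exp(-(x_i - mu)^2 / (2 sigma^2)) over (x_i, p_i) pairs."""
    return sum(p * exp(-(x - mu) ** 2 / (2 * sigma ** 2)) for x, p in targets)

def best_guide_point(sigma, targets, steps=100):
    """Grid search over guide points in [0, 1] for the highest hit probability."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda mu: probability_of_hit(mu, sigma, targets))

# Equal discrimination probabilities, as in the probability-of-hit panels.
equal_targets = [(0.0, 0.5), (1.0, 0.5)]
wide = best_guide_point(2.0, equal_targets)    # broad kinematic reach
narrow = best_guide_point(0.2, equal_targets)  # narrow kinematic reach

# Midpoint and probability centroid for the p0 = 0.25, p1 = 0.75 example.
x0, x1, p0, p1 = 0.0, 1.0, 0.25, 0.75
midpoint = (x0 + x1) / 2
centroid = p0 * x0 + p1 * x1
print(wide, narrow, midpoint, centroid)
```

Under these assumptions, a broad reach favors guiding halfway between the objects, since both stay reachable, while a narrow reach favors committing to one object.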
Evolution of PIP During Homing for Minimized Probability of Error
[Figure: sequence of PIP states during homing: Centered PIP, Broadened PIP, Begin Object Commit, Object Committed To.]
Cortical Model
[Figure: network of pyramidal neuron (Gibbs sampler) elements with inputs $X_1, X_2, \ldots, X_n$ and outputs $Y_1, Y_2, Y_3, \ldots, Y_m$.]
Cortices consist of a collection of pyramidal neurons whose properties are now well established
• Contains m pyramidal neurons and has n inputs
• Full connectivity is unnecessary, and connectivity in general is sparse
Cortical Hamiltonian
Since energy is an extensive property of a system, the cortical Hamiltonian is just the sum of the constituent single-neuron Hamiltonians:
$$H = \sum_{i=1}^{m} H_i, \qquad H_i = -\Big(\sum_{j=1}^{n} \lambda_{ij} x_j + \sum_{k=1}^{m} \nu_{ik} y_k - \mu_i\Big)\,y_i$$
The network must then have a well-defined Boltzmann distribution:
$$p(\mathbf{x}, \mathbf{y}) = \frac{\exp[-H(\mathbf{x}, \mathbf{y})]}{Z}$$
By way of analogy with statistical physics, all network macroscopic properties are defined, including free energies, entropies such as H(X,Y), and most importantly, mutual information.
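For a very small network, the sum-of-neurons energy and its Boltzmann distribution can be enumerated exactly. The sketch below assumes each neuron contributes an energy of −(feed-forward drive + lateral drive − threshold) · y_i and that probabilities are proportional to exp(−H). The sign convention, the unit temperature, the symbol names, and all numerical values are assumptions made for illustration.

```python
from itertools import product
from math import exp

# Hypothetical two-input, two-neuron network.
lam = [[0.6, -0.2], [0.1, 0.9]]    # lam[i][j]: gain from input j to neuron i
nu = [[0.0, -0.5], [-0.5, 0.0]]    # nu[i][k]: lateral coupling from neuron k to i
mu = [0.3, 0.4]                    # mu[i]: threshold of neuron i

def neuron_energy(i, x, y):
    """Energy contributed by neuron i; it is zero whenever the neuron is silent."""
    drive = (sum(lam[i][j] * x[j] for j in range(len(x)))
             + sum(nu[i][k] * y[k] for k in range(len(y)))
             - mu[i])
    return -drive * y[i]

def network_energy(x, y):
    """Extensive energy: the sum of the single-neuron energies."""
    return sum(neuron_energy(i, x, y) for i in range(len(y)))

states = [(x, y) for x in product((0, 1), repeat=2) for y in product((0, 1), repeat=2)]
Z = sum(exp(-network_energy(x, y)) for x, y in states)
p = {state: exp(-network_energy(*state)) / Z for state in states}

mean_energy = sum(p[state] * network_energy(*state) for state in states)
print(len(states), round(sum(p.values()), 12))
```

Exact enumeration is only feasible for a few units, since the number of states doubles with every added input or neuron; larger networks would need sampling.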
Cortical Circuit WTA Computation
• K credible targets with N = 1 true but unknown target
• M bullets (KVs)
• Neurons or Gibbs sampling elements
[Figure: network with inputs $X_1, X_2, \ldots, X_K$ carrying discrimination information $p(x_k)$ and outputs $Y_1, Y_2, Y_3, \ldots, Y_M$.]
• Network determines optimal p(y|x) to minimize I(X;Y) subject to constraints
• Connectivity requirements are driven by the need to be able to stabilize to an asymptotic distribution
• Output is a probabilistic assignment, which is desirable from a guidance perspective
• Local nodes are identical IT-neurons where local = global objective function optimization
• Network has a partition function and likely critical temperatures
• Gibbs and Helmholtz free energies (saddle points)
• Network Hamiltonian:
[Equation: network Hamiltonian $H$, a sum over targets $k = 1, \ldots, K$ and bullets $i = 1, \ldots, M$ of terms in $\boldsymbol{\lambda}_{ik}^{T}\mathbf{x}$, $\boldsymbol{\nu}_{ik}^{T}\mathbf{y}$, and $y_{ik}$.]
A grasshopper stealthily evades the photographer over a 30-second sequence of frames
Example of the Implication of Assertions
Define the subjective inquiry $B \equiv$ “Is it a Boy?”
Then let $b \equiv$ “It is a Boy!” and $s \equiv$ “It Is My Son!”
$s \to b$
[Figure: “If Asserted …” and “Then Known …” diagrams for $s$ and $b$; the relation holds only relative to question B.]
• If “It is My Son!” is asserted, then this additional information is erased by B
• Implication means that if b answers the question B, then so does s
Theory Summary
• Distinguishability gives rise to Boolean algebras
– Exhaustion and mutual exclusion together yield complementarity
• Logical implication is the natural relational operator for questions and assertions
• Probability and Entropy are the corresponding natural subjective measures of degrees of implication
[Figure: starting from “Nothing,” a system distinguishes; Boolean algebra and logical implication over assertions and questions induce computational rules and the natural measures of probability and entropy, leading to probability theory, information theory, and intelligent systems.]