A Theory of Universal AI
Literature: Marcus Hutter; Kirchherr, Li, and Vitanyi
Presenter: Prashant J. Doshi
CS594: Optimal Decision Making
A Theory of Universal Artificial Intelligence – p.1/18
Roadmap
Claim
Background
Key Concepts
The AIμ Model in Functional Form
The AIμ Model in Recursive Form
The Universal AIξ Model
Example Constants and Limits
Application: Sequential Decision Theory
Conclusions
Claim
Development of a universally optimal AI model
Universal: parameterless, unbiased, model-free
Optimal: no other program can learn or solve the task faster
Background
Decision Theory
Solves the problem of rational agent behaviour in uncertain worlds, given an environment
Known prior probability distribution $\mu$ over the environment
Solomonoff's Universal Induction
Solves the problem of sequence prediction for an unknown prior distribution
Predict the continuation $x_n$ of a given binary sequence $x_1 \dots x_{n-1}$
$$\xi(x_n \mid x_{<n}) = \frac{\xi(x_{1:n})}{\xi(x_{<n})}, \qquad x_{<n} := x_1 \dots x_{n-1}, \quad x_{1:n} := x_1 \dots x_n$$
Background
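Solomonoff's result: the expected squared Euclidean distance between $\xi$ and $\mu$ is finite
$$\sum_{k=1}^{\infty} \sum_{x_{1:k}} \mu(x_{1:k}) \left( \xi(x_k \mid x_{<k}) - \mu(x_k \mid x_{<k}) \right)^2 \stackrel{+}{<} \tfrac{1}{2} \ln 2 \cdot K(\mu)$$
where $\stackrel{+}{<}$ means "less than, up to an additive constant"
Convergence: $\xi(x_n \mid x_{<n}) \to \mu(x_n \mid x_{<n})$ as $n \to \infty$
$\xi$ is the Universal Probability Distribution
$$\xi(x) := \sum_{p \,:\, U(p) = x\ast} 2^{-\ell(p)}, \qquad \xi(x) \approx 2^{-K(x)}$$
The sum runs over all programs $p$ of length $\ell(p)$ whose output on the universal machine $U$ starts with $x$
$K(x)$ is the Kolmogorov Complexity of $x$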
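The convergence statement can be tried out numerically. The following is a minimal Python sketch, assuming a tiny model class of three Bernoulli sources in place of all computable distributions, with made-up code lengths `CODE_LEN` standing in for $K$. All names (`THETAS`, `CODE_LEN`, `xi`, `xi_next`, `expected_sq_error`) are illustrative and do not come from the talk.

```python
import math
from itertools import product

# Assumption: a tiny model class instead of all computable distributions.
THETAS = [0.1, 0.5, 0.9]                 # Bernoulli parameters P(bit = 1)
CODE_LEN = [2, 1, 2]                     # made-up description lengths in bits
WEIGHTS = [2.0 ** -n for n in CODE_LEN]  # prior weights 2^-length


def seq_prob(theta, bits):
    """Probability of a bit tuple under Bernoulli(theta)."""
    ones = sum(bits)
    return theta ** ones * (1.0 - theta) ** (len(bits) - ones)


def xi(bits):
    """Mixture probability of a bit tuple."""
    return sum(w * seq_prob(t, bits) for w, t in zip(WEIGHTS, THETAS))


def xi_next(bits, b):
    """Mixture prediction xi(b | bits) = xi(bits b) / xi(bits)."""
    return xi(bits + (b,)) / xi(bits)


def expected_sq_error(true_idx, n):
    """Sum over k <= n of the mu-expected squared prediction error."""
    theta = THETAS[true_idx]
    total = 0.0
    for k in range(1, n + 1):
        for hist in product((0, 1), repeat=k - 1):
            # In the binary case both symbols have the same squared error.
            diff = xi_next(hist, 1) - theta
            total += seq_prob(theta, hist) * diff ** 2
    return total


if __name__ == "__main__":
    bound = 0.5 * math.log(1.0 / WEIGHTS[0])
    print(expected_sq_error(0, 10), "<=", bound)
```

With the true source at index 0, the accumulated error stays below $\tfrac{1}{2}\ln(1/w_\mu)$, the finite-class analogue of $\tfrac{1}{2}\ln 2 \cdot K(\mu)$, where $w_\mu$ is the prior weight of the true source.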
Key Concepts
The Cybernetic or Agent Model
Agent: $p : X^* \to Y^*$, with $y_{1:k} = p(x_{<k})$
$p$ is a partial function / chronological Turing machine
Environment: $q : Y^* \to X^*$, with $x_{1:k} = q(y_{1:k})$
$q$ is a partial function / chronological Turing machine
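The interaction of the two chronological machines can be sketched as a simple loop. This is a minimal Python illustration, not the talk's formalism: for brevity `p` and `q` return only the newest symbol ($y_k$ or $x_k$) instead of the whole string, and the example agent and environment are hypothetical.

```python
def interact(p, q, cycles):
    """Run the agent p against the environment q for a number of cycles."""
    xs, ys = (), ()
    for _ in range(cycles):
        ys += (p(xs),)  # y_k is computed from x_{<k} only
        xs += (q(ys),)  # x_k is computed from y_{1:k} only
    return list(zip(ys, xs))


def alternating_agent(xs):
    """Hypothetical policy: output 0, 1, 0, 1, ... regardless of the inputs."""
    return len(xs) % 2


def change_detector(ys):
    """Hypothetical environment: input is 1 iff the last two outputs differ."""
    return int(len(ys) >= 2 and ys[-1] != ys[-2])


if __name__ == "__main__":
    print(interact(alternating_agent, change_detector, 4))
```

The order of the two updates inside the loop is what "chronological" means here: neither machine can look at a symbol that has not been produced yet.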
Key Concepts
History: $yx_{<k} := y_1 x_1 \, y_2 x_2 \, y_3 x_3 \dots y_{k-1} x_{k-1}$
Probability of input given the history
$$\rho(x_k \mid yx_{<k}\, y_k) = \frac{\rho(yx_{1:k})}{\rho(yx_{<k})}$$
$$\rho(yx_{1:k}) = \rho(x_k \mid yx_{<k}\, y_k)\; \rho(x_{k-1} \mid yx_{<k-1}\, y_{k-1}) \cdots \rho(x_2 \mid yx_{<2}\, y_2)\; \rho(x_1 \mid y_1)$$
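The chain rule above can be checked on a toy example. The sketch below assumes a made-up conditional `rho_cond` in which the input copies the current output with probability 0.8; the function names are illustrative only.

```python
def rho_cond(x, history, y):
    """Assumed toy conditional rho(x_k | yx_{<k} y_k): x copies y w.p. 0.8."""
    return 0.8 if x == y else 0.2


def rho_joint(ys, xs):
    """rho(yx_{1:k}) as the product of the per-cycle conditionals."""
    prob = 1.0
    history = ()
    for y, x in zip(ys, xs):
        prob *= rho_cond(x, history, y)
        history += ((y, x),)
    return prob


if __name__ == "__main__":
    ys, xs = (1, 0, 1), (1, 0, 0)
    print(rho_joint(ys, xs))
    # The ratio of two joints recovers the conditional of the last cycle.
    print(rho_joint(ys, xs) / rho_joint(ys[:-1], xs[:-1]))
```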
The AIμ Model in Functional Form
Task: Derive the policy $p^*$ which maximizes the total credit over a predefined lifetime $T$
For a known deterministic environment $q$
$$C_{km}(p, q) := \sum_{i=k}^{m} c(x_i)$$
where $x_i$ is the $i$-th input produced when $p$ interacts with $q$
Optimal policy
$$p^* := \arg\max_p C_{1T}(p, q), \qquad C_{1T}(p^*, q) \ge C_{1T}(p, q) \quad \forall p$$
For a prior distribution $\mu(q)$ over environments
Let $Q_k := \{ q : q(\dot{y}_{<k}) = \dot{x}_{<k} \}$ be the set of all environments that produce the history $\dot{y}\dot{x}_{<k} := \dot{y}_1 \dot{x}_1 \dots \dot{y}_{k-1} \dot{x}_{k-1}$
$$C^{\mu}_{km}(p \mid \dot{y}\dot{x}_{<k}) := \sum_{q \in Q_k} \mu(q)\, C_{km}(p, q)$$
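The AIμ Model in Functional Form
AIμ maximizes expected future reward over the next $h_k := m_k - k + 1$ (horizon) cycles
Optimal policy
$$p^*_k := \arg\max_{p \in P_k} C^{\mu}_{k m_k}(p \mid \dot{y}\dot{x}_{<k})$$
where $P_k$ is the set of policies consistent with the past outputs, i.e. $p(\dot{x}_{<k}) = \dot{y}_{<k}\, y_k$ for some $y_k$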
$$p^*(\dot{x}_{<k}) := p^*_k(\dot{x}_{<k}) = \dot{y}_{<k}\, \dot{y}_k$$
Best output
$$\dot{y}_k := \arg\max_{y_k} \; \max_{p \,:\, p(\dot{x}_{<k}) = \dot{y}_{<k} y_k} C^{\mu}_{k m_k}(p \mid \dot{y}\dot{x}_{<k})$$
$p^*$ is computable if $X$, $Y$ and $m_k$ are finite
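For finite $X$, $Y$ and lifetime, the functional form can be evaluated by brute force. The sketch below is a minimal illustration under loudly stated assumptions: two hypothetical deterministic environments (`q_same`, `q_flip`) with a made-up prior of 0.7 and 0.3, credit $c(x) = x$, a lifetime of two cycles starting from the empty history (so every environment is still consistent), and policies that return only the newest output $y_k$.

```python
from itertools import product

Y = (0, 1)  # outputs
X = (0, 1)  # inputs; assumption: the credit is c(x) = x


def q_same(ys):
    """Hypothetical environment: rewards output 1."""
    return ys[-1]


def q_flip(ys):
    """Hypothetical environment: rewards output 0."""
    return 1 - ys[-1]


MU = [(q_same, 0.7), (q_flip, 0.3)]  # made-up prior mu(q)


def credit(p, q, m):
    """C_{1m}(p, q): total credit when policy p interacts with environment q."""
    xs, ys = (), ()
    for _ in range(m):
        ys += (p(xs),)
        xs += (q(ys),)
    return sum(xs)


def expected_credit(p, mu, m):
    """Sum over environments of mu(q) * C_{1m}(p, q)."""
    return sum(weight * credit(p, q, m) for q, weight in mu)


def all_policies(m):
    """Every deterministic policy as a lookup table from x_{<k} to y_k."""
    histories = [h for k in range(m) for h in product(X, repeat=k)]
    for outputs in product(Y, repeat=len(histories)):
        yield dict(zip(histories, outputs)).__getitem__


def best_policy(mu, m):
    """Brute-force argmax over all deterministic policies."""
    return max(all_policies(m), key=lambda p: expected_credit(p, mu, m))


if __name__ == "__main__":
    p_star = best_policy(MU, 2)
    print(p_star(()), p_star((0,)), p_star((1,)), expected_credit(p_star, MU, 2))
```

In this toy setting the best policy first bets on the more probable environment and then uses the first input to identify which environment it is facing. The number of policies grows doubly exponentially with the lifetime, which is why this enumeration only serves as an illustration.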
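The AIμ Model in Recursive Form
Task: Derive the expected reward sum in cycles $k$ to $m$ using the expected reward sum in cycles $k+1$ to $m$
$$C^{\mu}_{km}(\dot{y}\dot{x}_{<k}\, y_k) := \sum_{x_k} \left[ c(x_k) + C^{*\mu}_{k+1,m}(\dot{y}\dot{x}_{<k}\, y_k x_k) \right] \mu(x_k \mid \dot{y}\dot{x}_{<k}\, y_k)$$
Optimal expected reward
$$C^{*\mu}_{km}(\dot{y}\dot{x}_{<k}) := \max_{y_k} C^{\mu}_{km}(\dot{y}\dot{x}_{<k}\, y_k)$$
$$C^{*\mu}_{km}(\dot{y}\dot{x}_{<k}) = \max_{y_k} \sum_{x_k} \left[ c(x_k) + C^{*\mu}_{k+1,m}(\dot{y}\dot{x}_{<k}\, y_k x_k) \right] \mu(x_k \mid \dot{y}\dot{x}_{<k}\, y_k)$$
The recursion ends with $C^{*\mu}_{m+1,m} := 0$
Output
$$\dot{y}_k := \arg\max_{y_k} C^{\mu}_{k m_k}(\dot{y}\dot{x}_{<k}\, y_k)$$
Expectimax sequence
$$\dot{y}_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \left( c(x_k) + \dots + c(x_{m_k}) \right) \cdot \mu(x_{k:m_k} \mid \dot{y}\dot{x}_{<k}\, y_{k:m_k})$$
where $\mu(x_{k:m_k} \mid \dot{y}\dot{x}_{<k}\, y_{k:m_k})$ is the chain-rule product of the per-cycle conditionals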
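The recursion translates almost line by line into code. The following minimal Python sketch assumes a made-up environment `mu` with an "invest, then harvest" structure and credit $c(x) = x$; the names `mu`, `action_value`, `value` and `best_output` are illustrative, not from the talk.

```python
Y = (0, 1)  # outputs
X = (0, 1)  # inputs; assumption: the credit is c(x) = x


def mu(history, y, x):
    """Assumed toy environment: P(x_k = x | history, y_k = y)."""
    prev_y = history[-1][0] if history else 0
    if y == 1:
        p1 = 0.0      # "investing" pays nothing in the current cycle
    elif prev_y == 1:
        p1 = 1.0      # "harvesting" right after an investment always pays
    else:
        p1 = 0.4      # harvesting without investment pays sometimes
    return p1 if x == 1 else 1.0 - p1


def action_value(history, y, k, m):
    """C_{km}(history y_k): expected reward sum in cycles k..m after output y."""
    return sum(
        mu(history, y, x) * (x + value(history + ((y, x),), k + 1, m))
        for x in X
    )


def value(history, k, m):
    """C*_{km}(history): optimal expected reward sum, with C*_{m+1,m} = 0."""
    if k > m:
        return 0.0
    return max(action_value(history, y, k, m) for y in Y)


def best_output(history, k, m):
    """The output that maximizes the expected reward sum in cycles k..m."""
    return max(Y, key=lambda y: action_value(history, y, k, m))


if __name__ == "__main__":
    print(best_output((), 1, 1), best_output((), 1, 2), value((), 1, 2))
```

In this toy environment the best first output depends on the horizon: with one cycle left the agent harvests immediately, with two cycles it invests first.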
The AIμ Model in Recursive Form
Functional AIμ $\Leftrightarrow$ Recursive AIμ
$$\mu(yx_{1:k}) = \sum_{q \,:\, q(y_{1:k}) = x_{1:k}} \mu(q)$$
The Universal AIξ Model
Task: Replace the true but unknown prior probability $\mu$ with $\xi$
In the Functional AIμ model
$$\dot{y}_k := \arg\max_{y_k} \; \max_{p \,:\, p(\dot{x}_{<k}) = \dot{y}_{<k} y_k} \; \sum_{q \,:\, q(\dot{y}_{<k}) = \dot{x}_{<k}} \mu(q)\, C_{k m_k}(p, q)$$
$\Downarrow$
$$\dot{y}_k := \arg\max_{y_k} \; \max_{p \,:\, p(\dot{x}_{<k}) = \dot{y}_{<k} y_k} \; \sum_{q \,:\, q(\dot{y}_{<k}) = \dot{x}_{<k}} 2^{-\ell(q)}\, C_{k m_k}(p, q)$$
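The Universal AIξ Model
Task: Replace the true but unknown prior probability $\mu$ with $\xi$
In the Recursive AIμ model
$$\dot{y}_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \left( c(x_k) + \dots + c(x_{m_k}) \right) \cdot \mu(x_{k:m_k} \mid \dot{y}\dot{x}_{<k}\, y_{k:m_k})$$
$\Downarrow$
$$\dot{y}_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \left( c(x_k) + \dots + c(x_{m_k}) \right) \cdot \xi(x_{k:m_k} \mid \dot{y}\dot{x}_{<k}\, y_{k:m_k})$$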
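A finite stand-in for the substitution of $\xi$ for $\mu$ can be sketched in Python. The assumptions are substantial and worth stating: instead of all programs on a universal machine, the sketch uses three hypothetical environment "programs" with made-up lengths in bits, weights them by $2^{-\ell(q)}$, takes $c(x) = x$, and evaluates the expectimax sequence with the unnormalised weight of all programs consistent with a complete history. Normalising by the weight of the history seen so far would not change the argmax.

```python
Y = (0, 1)  # outputs
X = (0, 1)  # inputs; assumption: the credit is c(x) = x


def q_same(ys):
    return ys[-1]


def q_flip(ys):
    return 1 - ys[-1]


def q_zero(ys):
    return 0


# Assumption: made-up program lengths in bits, standing in for l(q).
PROGRAMS = [(q_same, 1), (q_flip, 2), (q_zero, 2)]


def xi(ys, xs):
    """Total weight 2^-l(q) of the programs that reproduce xs from ys."""
    return sum(
        2.0 ** -length
        for q, length in PROGRAMS
        if all(q(ys[: i + 1]) == x for i, x in enumerate(xs))
    )


def expectimax(ys, xs, reward, m):
    """Alternate max over outputs and sum over inputs until cycle m."""
    if len(xs) == m:
        return reward * xi(ys, xs)
    return max(
        sum(expectimax(ys + (y,), xs + (x,), reward + x, m) for x in X)
        for y in Y
    )


def best_output(ys, xs, m):
    """Output for the next cycle, given the history (ys, xs) and lifetime m."""
    return max(
        Y,
        key=lambda y: sum(expectimax(ys + (y,), xs + (x,), x, m) for x in X),
    )


if __name__ == "__main__":
    print(best_output((), (), 2))       # first output, nothing observed yet
    print(best_output((1,), (0,), 2))   # after output 1 was answered with 0
```

The agent needs no knowledge of the true environment: short programs dominate at first, and every observed input removes the programs that are inconsistent with it.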
The Universal AIξ Model
Task: Show the convergence of AIξ to AIμ
Utilize Solomonoff's result generalized from $|X| = 2$ to an arbitrary alphabet
The $y$'s are pure spectators
$$\sum_{k=1}^{\infty} \sum_{x_{1:k}} \mu(yx_{1:k}) \left( \xi(x_k \mid yx_{<k}\, y_k) - \mu(x_k \mid yx_{<k}\, y_k) \right)^2 \stackrel{+}{<} \tfrac{1}{2} \ln 2 \cdot K(\mu)$$
Convergence: $\xi(x_k \mid yx_{<k}\, y_k) \to \mu(x_k \mid yx_{<k}\, y_k)$ as $k \to \infty$
Outputs $\dot{y}_k$ of the AIξ model converge to the outputs $\dot{y}_k$ of the AIμ model, at least for a bounded horizon $h_k$
Example Constants and Limits
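$$1 \;\overset{(a)}{\ll}\; \langle \ell(y_k x_k) \rangle \;\overset{(b)}{\ll}\; k \;\overset{(c)}{\ll}\; T \;\overset{(d)}{\ll}\; |Y \times X|$$
$$1 \;\ll\; 2^{16} \;\ll\; 2^{24} \;\ll\; 2^{32} \;\ll\; 2^{65536}$$
(a) The agent's interface is wide
(b) The interface can be sufficiently explored
(c) The death is far away
(d) Most input/output combinations do not occur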
These limits are never used in proofs, but we are only interested in theorems which do not degenerate under the above limits
Application
Sequential Decision Theory (MDP)
Bellman equation for optimal policy
$$p^*(i) = \arg\max_a \sum_j M^a_{ij}\, U^*(j), \qquad U^*(i) = R(i) + \max_a \sum_j M^a_{ij}\, U^*(j)$$
Apply the AIμ model
$$S = (Y \times X)^*, \quad A = Y, \quad a = y_k, \quad M^a_{ij} = \mu(x_k \mid yx_{<k}\, y_k)$$
$$i = yx_{<k}, \quad R(i) = c(x_{k-1}), \quad U^*(i) = C^{*\mu}_{k-1,m}(yx_{<k}) = c(x_{k-1}) + C^{*\mu}_{km}(yx_{<k})$$
$$j = yx_{1:k}, \quad R(j) = c(x_k), \quad U^*(j) = C^{*\mu}_{km}(yx_{1:k}) = c(x_k) + C^{*\mu}_{k+1,m}(yx_{1:k})$$
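The Bellman equation can be solved by backward induction when the number of remaining steps is finite. The sketch below is a minimal Python illustration, assuming a hypothetical two-state MDP with made-up rewards `R` and transition matrices `M`, and a finite-horizon variant of the equation (the utility with zero steps left is just the reward of the state). None of these numbers come from the talk.

```python
# Assumption: a made-up two-state MDP; state 1 is the rewarding state.
R = [0.0, 1.0]
M = {  # M[a][i][j]: probability of moving from state i to j under action a
    "stay": [[1.0, 0.0], [0.0, 1.0]],
    "move": [[0.1, 0.9], [0.9, 0.1]],
}


def expected_next(i, a, steps):
    """Sum over successor states j of M[a][i][j] * U(j, steps - 1)."""
    return sum(M[a][i][j] * utility(j, steps - 1) for j in range(len(R)))


def utility(i, steps):
    """Finite-horizon utility: U(i) = R(i) + max_a sum_j M[a][i][j] U(j)."""
    if steps == 0:
        return R[i]
    return R[i] + max(expected_next(i, a, steps) for a in M)


def policy(i, steps):
    """Action that attains the maximum in the Bellman equation."""
    return max(M, key=lambda a: expected_next(i, a, steps))


if __name__ == "__main__":
    print(policy(0, 2), policy(1, 1), utility(0, 2))
```

Here the state is a single integer because the toy problem is Markovian. In the AIμ mapping above the state is the whole history $yx_{<k}$, so the same recursion runs over a state space that grows with every cycle.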
Application
Observations
We use the complete history as the environment state
The AIμ model does not assume: the Markovian property, a stationary environment, an accessible environment
Other applications
Game Playing
Function Minimization
Supervised Learning
Bold Claim: AIμ is the most general model
Conclusion
A parameterless model of AI based on Decision Theory and Algorithmic Probability is presented
Makes minimal assumptions about the environment
Is the AIξ model computable?
Future Work
Derive value and reward bounds for the AIξ model
Apply the AIξ model to more problem classes