
A Theory of Universal AI

Literature: Marcus Hutter; Kirchherr, Li, and Vitanyi

Presenter: Prashant J. Doshi

CS594: Optimal Decision Making

A Theory of Universal Artificial Intelligence – p.1/18

Roadmap

Claim

Background

Key Concepts

The AIμ Model in Functional Form

The AIμ Model in Recursive Form

The Universal AIξ Model

Example Constants and Limits

Application: Sequential Decision Theory

Conclusions

Claim

Development of a universally optimal AI model

Universal: parameterless, unbiased, model-free

Optimal: no other program can learn or solve the task faster

Background

Decision Theory

Solves the problem of rational agent behaviour in uncertain worlds, given an environment

Assumes a known prior probability distribution $\mu$ over the environment

Solomonoff's Universal Induction

Solves the problem of sequence prediction for an unknown prior distribution

Predict the continuation $x_k$ of a given binary sequence $x_1 x_2 \ldots x_{k-1}$

$$\mu(x_k \mid x_{<k}) = \frac{\mu(x_{1:k})}{\mu(x_{<k})}$$

Background

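Solomonoff's result: the expected Euclidean distance between $\xi$ and $\mu$ is finite

$$\sum_{k=1}^{\infty} \sum_{x_{1:k}} \mu(x_{<k}) \bigl( \xi(x_k \mid x_{<k}) - \mu(x_k \mid x_{<k}) \bigr)^2 \stackrel{+}{<} \tfrac{1}{2} \ln 2 \cdot K(\mu)$$

Convergence: $\xi(x_k \mid x_{<k}) \to \mu(x_k \mid x_{<k})$ for $k \to \infty$

$\xi$ is the Universal Probability Distribution

$$\xi(x) := \sum_{p \,:\, U(p) = x*} 2^{-l(p)}$$

where $U$ is a universal Turing machine, $l(p)$ is the length of program $p$, and $x*$ is any string that starts with $x$

$K(\mu)$ is the Kolmogorov Complexity of $\mu$: the length of the shortest program that computes $\mu$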
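The convergence claim can be made concrete with a small numerical sketch. The universal $\xi$ sums over all programs and cannot be computed, so the code swaps in a finite Bayes mixture over a few Bernoulli models as a stand-in. The model list `thetas`, the made-up description lengths `lengths`, and both function names are assumptions for illustration only. The error is summed along one fixed sequence rather than in expectation.

```python
def mixture_predictions(bits, thetas, lengths):
    """Return the mixture's probability of a 1 before each observed bit.

    thetas  -- hypothetical Bernoulli models, P(next bit = 1) for each model
    lengths -- made-up description lengths; model i gets prior weight 2^-lengths[i]
    """
    weights = [2.0 ** -l for l in lengths]
    predictions = []
    for bit in bits:
        total = sum(weights)
        predictions.append(sum(w * t for w, t in zip(weights, thetas)) / total)
        # Bayes update: scale each weight by the likelihood of the observed bit.
        weights = [w * (t if bit == 1 else 1.0 - t) for w, t in zip(weights, thetas)]
    return predictions


def squared_error_sum(predictions, true_theta):
    """Sum of squared differences between mixture and true predictions."""
    return sum((p - true_theta) ** 2 for p in predictions)


if __name__ == "__main__":
    preds = mixture_predictions([1] * 50, thetas=[0.1, 0.5, 0.9], lengths=[1, 2, 3])
    print(preds[0], preds[-1], squared_error_sum(preds, true_theta=0.9))
```

In this toy run the mixture's prediction moves from roughly 0.33 toward 0.9 and the summed squared error stays small, which mirrors the finite-sum statement without proving it.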

Key Concepts

The Cybernetic or Agent Model

Agent $p : X^* \to Y^*$ with $y_{1:k} = p(x_{<k})$

$p$ is a partial function / chronological Turing machine

Environment $q : Y^* \to X^*$ with $x_{1:k} = q(y_{1:k})$

$q$ is a partial function / chronological Turing machine
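The chronological coupling of agent and environment can be sketched as a simple loop. This is a minimal illustration: `interact`, `toggle_agent`, and `echo_env` are hypothetical names, and for brevity each function returns only the newest symbol ($y_k$ or $x_k$) instead of the whole string $y_{1:k}$ or $x_{1:k}$.

```python
def interact(p, q, cycles):
    """Run agent p against environment q and return [(y_1, x_1), ..., (y_k, x_k)]."""
    ys, xs = [], []
    for _ in range(cycles):
        ys.append(p(tuple(xs)))  # y_k may depend only on the earlier inputs x_<k
        xs.append(q(tuple(ys)))  # x_k may depend on all outputs so far, y_1:k
    return list(zip(ys, xs))


def toggle_agent(xs):
    """Hypothetical agent: output 0 first, then the opposite of the last input."""
    return 1 - xs[-1] if xs else 0


def echo_env(ys):
    """Hypothetical environment: echo the agent's latest output."""
    return ys[-1]


if __name__ == "__main__":
    print(interact(toggle_agent, echo_env, 3))  # [(0, 0), (1, 1), (0, 0)]
```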

Key Concepts

History:

$$yx_{<k} := y_1 x_1 \, y_2 x_2 \, y_3 x_3 \ldots y_{k-1} x_{k-1}$$

Probability of input given the history

$$\mu(x_k \mid yx_{<k}\, y_k) = \frac{\mu(yx_{1:k})}{\mu(yx_{<k})}$$

$$\mu(yx_{1:k}) = \mu(x_k \mid yx_{<k}\, y_k) \cdot \mu(x_{k-1} \mid yx_{<k-1}\, y_{k-1}) \cdots \mu(x_2 \mid yx_{<2}\, y_2) \cdot \mu(x_1 \mid y_1)$$
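The chain rule turns per-cycle conditionals into the probability of a whole history. A minimal sketch, assuming a hypothetical conditional `noisy_echo` in which the environment repeats the agent's output with probability 0.8 (both function names are illustrative, not from the slides):

```python
def history_probability(cond, history):
    """Multiply mu(x_i | yx_<i, y_i) over all cycles i of the history."""
    prob = 1.0
    for i, (y_i, x_i) in enumerate(history):
        prob *= cond(tuple(history[:i]), y_i, x_i)
    return prob


def noisy_echo(past, y, x):
    """Hypothetical environment: x equals y with probability 0.8, ignoring the past."""
    return 0.8 if x == y else 0.2


if __name__ == "__main__":
    # Two echoed cycles and one mismatch: 0.8 * 0.8 * 0.2
    print(history_probability(noisy_echo, [(0, 0), (1, 1), (1, 0)]))
```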

The AIμ Model in Functional Form

Task: Derive the policy $p^*$ which maximizes the total credit over a predefined lifetime $T$

For a known deterministic environment $q$

$$C_{km}(p, q) := \sum_{i=k}^{m} c(x_i)$$

where $x_i$ is the input in cycle $i$ when $p$ interacts with $q$

Optimal policy

$$p^* := \arg\max_p C_{1T}(p, q), \qquad C_{1T}(p^*, q) \ge C_{1T}(p, q) \;\; \forall p$$

For a prior distribution $\mu(q)$ over environments

Let $\dot Q_k := \{ q : q(\dot y_{<k}) = \dot x_{<k} \}$ be the set of all environments that produce the history (dots mark symbols already fixed by the actual history)

$$\dot y \dot x_{<k} := \dot y_1 \dot x_1 \, \dot y_2 \dot x_2 \ldots \dot y_{k-1} \dot x_{k-1}$$

$$C^{\mu}_{km}(p \mid \dot y \dot x_{<k}) := \sum_{q \in \dot Q_k} \mu(q)\, C_{km}(p, q)$$

The AIμ Model in Functional Form

AIμ maximizes expected future reward over the next $h_k = m_k - k + 1$ (horizon) cycles

Optimal policy

$$p^*_k := \arg\max_{p \,:\, p(\dot x_{<k}) = \dot y_{<k} y_k} C^{\mu}_{k m_k}(p \mid \dot y \dot x_{<k})$$

The policies of successive cycles combine into $p^*$:

$$p^*(\dot x_{<k}) := p^*_k(\dot x_{<k})$$

Best output

$$\dot y_k := \arg\max_{y_k} \max_{p \,:\, p(\dot x_{<k}) = \dot y_{<k} y_k} C^{\mu}_{k m_k}(p \mid \dot y \dot x_{<k})$$

$p^*$ is computable if $X$, $Y$ and $m_k$ are finite
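The functional form can be tried out by brute force on a tiny example. The sketch below is only an illustration under strong assumptions: the policy and environment sets are small finite lists, and `run`, `expected_credit`, `best_policy`, and the toy policies and environments are hypothetical names. The real AIμ maximizes over all policies.

```python
def run(p, q, m):
    """Let policy p and deterministic environment q interact for m cycles."""
    ys, xs = [], []
    for _ in range(m):
        ys.append(p(tuple(xs)))
        xs.append(q(tuple(ys)))
    return ys, xs


def expected_credit(p, envs, history, m, reward):
    """Sum of mu(q) * C_km(p, q) over the pairs (p, q) that reproduce the history.

    envs is a list of (q, mu_q) pairs; k is the first cycle after the history.
    """
    k = len(history) + 1
    past_y = [y for y, _ in history]
    past_x = [x for _, x in history]
    total = 0.0
    for q, mu_q in envs:
        ys, xs = run(p, q, m)
        if ys[:k - 1] != past_y or xs[:k - 1] != past_x:
            continue  # this p or q contradicts what was actually observed
        total += mu_q * sum(reward(x) for x in xs[k - 1:])
    return total


def best_policy(policies, envs, history, m, reward):
    """Brute-force argmax over a finite list of candidate policies."""
    return max(policies, key=lambda p: expected_credit(p, envs, history, m, reward))


# Hypothetical toy setup: binary symbols, the reward is the input bit itself.
def copy_env(ys):
    return ys[-1]

def flip_env(ys):
    return 1 - ys[-1]

def always_one(xs):
    return 1

def one_then_zero(xs):
    return 1 if not xs else 0

def bit_reward(x):
    return x

ENVS = [(copy_env, 0.75), (flip_env, 0.25)]
POLICIES = [always_one, one_then_zero]
```

With an empty history, `best_policy(POLICIES, ENVS, [], 2, bit_reward)` picks `always_one`. After observing the history `[(1, 0)]`, only `flip_env` remains consistent and `one_then_zero` becomes the better choice.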

The AIμ Model in Recursive Form

Task: Derive the expected reward sum in cycles $k$ to $m$ using the expected reward sum in cycles $k+1$ to $m$

$$C^{\mu}_{km}(\dot y \dot x_{<k}\, y_k) := \sum_{x_k} \bigl[ c(x_k) + C^{*}_{k+1,m}(\dot y \dot x_{<k}\, y_k x_k) \bigr] \cdot \mu(x_k \mid \dot y \dot x_{<k}\, y_k)$$

Optimal expected reward

$$C^{*}_{km}(\dot y \dot x_{<k}) := \max_{y_k} C^{\mu}_{km}(\dot y \dot x_{<k}\, y_k)$$

$$C^{*}_{km}(\dot y \dot x_{<k}) = \max_{y_k} \sum_{x_k} \bigl[ c(x_k) + C^{*}_{k+1,m}(\dot y \dot x_{<k}\, y_k x_k) \bigr] \cdot \mu(x_k \mid \dot y \dot x_{<k}\, y_k)$$

The recursion ends with $C^{*}_{m+1,m} := 0$

Output

$$\dot y_k := \arg\max_{y_k} C^{\mu}_{k m_k}(\dot y \dot x_{<k}\, y_k)$$

Expectimax sequence

$$\dot y_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \bigl( c(x_k) + \cdots + c(x_{m_k}) \bigr) \cdot \prod_{i=k}^{m_k} \mu(x_i \mid yx_{<i}\, y_i)$$

where $yx_{<k} = \dot y \dot x_{<k}$ is the actual history
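The recursion maps directly onto a short expectimax routine. This is a sketch under the assumption of small finite action and percept sets; `expectimax`, `noisy_echo`, and `bit_reward` are hypothetical names, and the conditional `mu(history, y, x)` plays the role of $\mu(x_k \mid yx_{<k}\, y_k)$.

```python
def expectimax(history, m, actions, percepts, mu, reward):
    """Return (best output y_k, optimal expected reward C*_km) for the given history."""
    k = len(history) + 1
    if k > m:
        return None, 0.0  # no cycles left, so the remaining reward is zero
    best_y, best_value = None, float("-inf")
    for y in actions:
        value = 0.0
        for x in percepts:
            prob = mu(history, y, x)
            if prob == 0.0:
                continue  # impossible percept, contributes nothing
            _, future = expectimax(history + ((y, x),), m, actions, percepts, mu, reward)
            value += prob * (reward(x) + future)
        if value > best_value:
            best_y, best_value = y, value
    return best_y, best_value


def noisy_echo(history, y, x):
    """Hypothetical environment: x equals y with probability 0.8."""
    return 0.8 if x == y else 0.2


def bit_reward(x):
    """Hypothetical reward: the input bit itself."""
    return x


if __name__ == "__main__":
    print(expectimax((), 3, (0, 1), (0, 1), noisy_echo, bit_reward))
```

The run time grows as $(|Y| \cdot |X|)^{h_k}$, which is why this only works for very short horizons.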

The AIμ Model in Recursive Form

Functional AIμ $\Leftrightarrow$ Recursive AIμ

$$\mu(yx_{1:k}) = \sum_{q \,:\, q(y_{1:k}) = x_{1:k}} \mu(q)$$

The Universal AIξ Model

Task: Replace the true but unknown prior probability $\mu$ with $\xi$

In the Functional AIμ model

$$\dot y_k := \arg\max_{y_k} \max_{p \,:\, p(\dot x_{<k}) = \dot y_{<k} y_k} \sum_{q \,:\, q(\dot y_{<k}) = \dot x_{<k}} \mu(q)\, C_{k m_k}(p, q)$$

$\Downarrow$

$$\dot y_k := \arg\max_{y_k} \max_{p \,:\, p(\dot x_{<k}) = \dot y_{<k} y_k} \sum_{q \,:\, q(\dot y_{<k}) = \dot x_{<k}} 2^{-l(q)}\, C_{k m_k}(p, q)$$

The Universal AIξ Model

Task: Replace the true but unknown prior probability $\mu$ with $\xi$

In the Recursive AIξ model

$$\dot y_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \bigl( c(x_k) + \cdots + c(x_{m_k}) \bigr) \cdot \prod_{i=k}^{m_k} \mu(x_i \mid yx_{<i}\, y_i)$$

$\Downarrow$

$$\dot y_k := \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots \max_{y_{m_k}} \sum_{x_{m_k}} \bigl( c(x_k) + \cdots + c(x_{m_k}) \bigr) \cdot \prod_{i=k}^{m_k} \xi(x_i \mid yx_{<i}\, y_i)$$
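The substitution of $2^{-l(q)}$ for $\mu(q)$ can be illustrated with a finite stand-in. The true $\xi$ ranges over all programs of a universal machine and is not computable, so this sketch assumes a short list of deterministic environments with made-up description lengths. `xi_joint`, `xi_conditional`, `copy_env`, `flip_env`, and `ENVS` are hypothetical names.

```python
def xi_joint(envs, history):
    """Sum of 2^-l(q) over the environments q that reproduce every input in the history.

    envs is a list of (q, length) pairs; history is a tuple of (y, x) pairs.
    """
    ys = tuple(y for y, _ in history)
    xs = [x for _, x in history]
    total = 0.0
    for q, length in envs:
        if all(q(ys[:i + 1]) == xs[i] for i in range(len(xs))):
            total += 2.0 ** -length
    return total


def xi_conditional(envs, history, y, x):
    """Probability of input x after output y, given the history, as a ratio of joints."""
    denom = xi_joint(envs, history)
    if denom == 0.0:
        return 0.0  # no environment in the list explains the history
    return xi_joint(envs, history + ((y, x),)) / denom


def copy_env(ys):
    """Hypothetical environment with assumed description length 1."""
    return ys[-1]


def flip_env(ys):
    """Hypothetical environment with assumed description length 2."""
    return 1 - ys[-1]


ENVS = [(copy_env, 1), (flip_env, 2)]
```

Before any observation, `xi_conditional(ENVS, (), 1, 1)` is 2/3 because the shorter `copy_env` carries twice the weight. After the history `((1, 1),)` only `copy_env` survives and its predictions become certain.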

The Universal AIξ Model

Task: Show the convergence of AIξ to AIμ

Utilize Solomonoff's result generalized from $|X| = 2$ to an arbitrary alphabet; the $y$'s are pure spectators

$$\sum_{k=1}^{\infty} \sum_{x_{1:k}} \mu(yx_{<k}) \bigl( \xi(x_k \mid yx_{<k}\, y_k) - \mu(x_k \mid yx_{<k}\, y_k) \bigr)^2 \stackrel{+}{<} \tfrac{1}{2} \ln 2 \cdot K(\mu)$$

Convergence: $\xi(x_k \mid yx_{<k}\, y_k) \to \mu(x_k \mid yx_{<k}\, y_k)$ for $k \to \infty$

Outputs $\dot y_k$ of the AIξ model converge to the outputs $\dot y_k$ of the AIμ model, at least for a bounded horizon $h_k$

Example Constants and Limits

$$\begin{array}{ccccccccc}
1 & \ll & \langle l(y_k x_k) \rangle & \ll & k & \ll & T & \ll & |Y \times X| \\
1 & \ll & 2^{16} & \ll & 2^{24} & \ll & 2^{32} & \ll & 2^{65536} \\
 & (a) & & (b) & & (c) & & (d) &
\end{array}$$

(a) The agent's interface is wide

(b) The interface can be sufficiently explored

(c) The death is far away

(d) Most input/output combinations do not occur

These limits are never used in proofs, but we are only interested in theorems which do not degenerate under the above limits

Application

Sequential Decision Theory (MDP)

Bellman equation for optimal policy

$$p^*(s) = \arg\max_a \sum_{s'} M^a_{ss'} V^*(s'), \qquad V^*(s) = R(s) + \max_a \sum_{s'} M^a_{ss'} V^*(s')$$

Apply the AIμ model

$$\mathcal{S} = (Y \times X)^*, \quad \mathcal{A} = Y, \quad a = y_k, \quad M^a_{ss'} = \mu(x_k \mid yx_{<k}\, y_k)$$

$$s = yx_{<k}, \quad R(s) = c(x_{k-1}), \quad V^*(s) = C^{*}_{k-1,m}(yx_{<k}) = c(x_{k-1}) + C^{*}_{km}(yx_{<k})$$

$$s' = yx_{1:k}, \quad R(s') = c(x_k), \quad V^*(s') = C^{*}_{km}(yx_{1:k}) = c(x_k) + C^{*}_{k+1,m}(yx_{1:k})$$
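For a small, fully known MDP the Bellman equation with state rewards $R(s)$ can be solved by backward induction. The sketch below assumes a finite horizon so that the undiscounted sums stay finite. The function name, the nested-dictionary layout `M[s][a][s2]`, and the two-state toy problem are assumptions made for illustration.

```python
def finite_horizon_values(states, actions, M, R, horizon):
    """Apply V(s) = R(s) + max_a sum_s2 M[s][a][s2] * V(s2) for `horizon` sweeps.

    Returns the value table and the greedy action per state from the last sweep.
    """
    V = {s: R[s] for s in states}  # with no steps left, only R(s) is collected
    policy = {}
    for _ in range(horizon):
        new_V = {}
        for s in states:
            def expected_next(a):
                return sum(M[s][a][s2] * V[s2] for s2 in states)
            best_a = max(actions, key=expected_next)
            policy[s] = best_a
            new_V[s] = R[s] + expected_next(best_a)
        V = new_V
    return V, policy


# Hypothetical toy MDP: two states, reward 1 for being in state "b".
STATES = ["a", "b"]
ACTIONS = ["stay", "move"]
M = {
    "a": {"stay": {"a": 1.0, "b": 0.0}, "move": {"a": 0.0, "b": 1.0}},
    "b": {"stay": {"a": 0.0, "b": 1.0}, "move": {"a": 1.0, "b": 0.0}},
}
R = {"a": 0.0, "b": 1.0}

if __name__ == "__main__":
    print(finite_horizon_values(STATES, ACTIONS, M, R, horizon=2))
```

In the AIμ mapping the state is the whole history, so the state space grows with every cycle and such a table cannot be enumerated in general; the toy MDP only shows the shape of the recursion.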

Application

Observations

We use the complete history as the environment state

The AIμ model does not assume the Markovian property, a stationary environment, or an accessible environment

Other applications

Game Playing

Function Minimization

Supervised Learning

Bold Claim: AIμ is the most general model

Conclusion

A parameterless model of AI based on Decision Theory and Algorithmic Probability is presented

Makes minimal assumptions about the environment

Is the AIξ model computable?

Future Work

Derive value and reward bounds for the AIξ model

Apply the AIξ model to more problem classes