online learning in monkeys · very different from monkey behaviors. errors persv red 4 - triangle 4...

1
Abstract Online learning assumes that an adversary provides data sequentially to a learner, and can change the target concept at any time (called a concept drift) unbeknownst to the learner. This study is the first to address the connection between online learning in primates and learning algorithms. We examine online learning in the context of the Wisconsin Card Sorting Task (WCST), a task for which the concept acquisition strategies for human and other primates are well documented. We describe a new WCST experiment in rhesus monkeys, comparing the monkeys’ behaviors to that of online learning algorithms. Our expectation is that insights gained from this work and future research can lead to improved artificial learning systems. It appears that the monkeys are bad learners for this task. We suspect the primary cause for the “inefficiency” in the monkeys’ responses is due to the fact that they must first acquire the hypothesis space for the game, whereas the computer algorithm’s search space is predefined. The monkeys may not yet be actually playing the WCST as we understand it. Acknowledgement We thank Timothy Rogers for WCST discussions. Research supported in part by the Wisconsin Alumni Research Foundation, NIA grant P01 AG11915, NCRR grant P51 RR000167, and the University of Wisconsin School of Medicine and Public Health. Online Learning in Monkeys Xiaojin Zhu, Michael Coen, Shelley Prudom, Ricki Colman, and Joseph Kemnitz (University of Wisconsin-Madison) Contact: [email protected] The Wisconsin Card Sorting Task (WCST) In each trial, three shapes appear simultaneously on screen, each randomly combined with a different color. The target concept is initially Red. The monkey is rewarded with a food pellet for touching the red object regardless of its shape. Once the monkey has had 10 consecutive correct trials, a concept drift happens where the target concept is changed to Triangle without warning. The concept later shifts to Blue and then Star in a similar way. Monkeys (Macaca mulatta) Computers 0 200 400 600 800 1000 1200 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 1 level 2 level 3 level 4 trials WCST81010 0 200 400 600 800 1000 1200 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 1 level 2 level 3 level 4 level 3 level 4 trials WCST80088 0 100 200 300 400 500 600 700 800 900 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 1 level 2 level 1 level 2 level 3 level 4 trials WCST80160 0 500 1000 1500 2000 2500 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 1 level 2 level 1 level 2 level 3 level 4 trials WCST81092 0 200 400 600 800 1000 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 2 level 3 level 2 level 3 level 4 trials WCST82057 0 100 200 300 400 500 600 700 800 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 2 level 1 level 2 level 3 level 4 trials WCST84076 0 200 400 600 800 1000 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 level 1 level 2 level 3 level 4 trials WCST85014 accuracy (30-trial) reaction time (10 seconds) session 1. Monkeys adapt to concept drifts slowly on average around 300 trials. 2. Perseverative error dominates at 75%: A perseverative error is a mistake which would have been correct under the immediately preceding concept. 3. The monkeys do not slow down after a concept drift: Do they realize the change? trials errors persv Red 425 242 - Triangle 249 113 89 Blue 437 247 186 Star 279 132 94 Online disjunction learning algorithm Each object x has d=6 Boolean features (R,G,B,C,S,T). Adversary presents d/2 objects, learner picks one, adversary says yes/no Adversary can change the concept, i.e., concept drift. Want the number of mistakes not too larger than the number of concept drifts. Theorem: For any input sequence with m concept drifts, the algorithm makes at most (2m + 1)(d 1) mistakes. Specifically, the bound is 35 (m=3, d=6). Very different from monkey behaviors. errors persv Red 4 - Triangle 4 3 Blue 3 2 Star 2 1 An algorithm closer to monkey behaviors “slow”: skip learning (steps 3, 4) with probability a. “stubborn”: when h=0 in step 4, retain the old incorrect h with probability b. When a=0.93, b=0.96, this algorithm makes on average 563 errors, with 312 perservative errors (67%) out of 469 in T,B,S.

Upload: others

Post on 16-Oct-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Online Learning in Monkeys · Very different from monkey behaviors. errors persv Red 4 - Triangle 4 3 Blue 3 2 Star 2 1 An algorithm closer to monkey behaviors •“slow”: skip

AbstractOnline learning assumes that anadversary provides data sequentiallyto a learner, and can change the targetconcept at any time (called a conceptdrift) unbeknownst to the learner. Thisstudy is the first to address theconnection between online learning inprimates and learning algorithms.

We examine online learning in thecontext of the Wisconsin Card SortingTask (WCST), a task for which theconcept acquisition strategies forhuman and other primates are welldocumented. We describe a newWCST experiment in rhesus monkeys,comparing the monkeys’ behaviors tothat of online learning algorithms. Ourexpectation is that insights gainedfrom this work and future research canlead to improved artificial learningsystems.

It appears that the monkeys are badlearners for this task. We suspect theprimary cause for the “inefficiency” inthe monkeys’ responses is due to thefact that they must first acquire thehypothesis space for the game,whereas the computer algorithm’ssearch space is predefined. Themonkeys may not yet be actuallyplaying the WCST as we understand it.

Acknowledgement We thank Timothy

Rogers for WCST discussions. Research

supported in part by the Wisconsin Alumni

Research Foundation, NIA grant P01 AG11915,

NCRR grant P51 RR000167, and the

University of Wisconsin School of Medicine

and Public Health.

Online Learning in Monkeys

Xiaojin Zhu, Michael Coen, Shelley Prudom, Ricki Colman, and Joseph Kemnitz (University of Wisconsin-Madison)

Contact: [email protected]

The Wisconsin Card Sorting Task (WCST)

In each trial, three shapes appear simultaneously on screen, each randomly combined with a differentcolor. The target concept is initially Red. The monkey is rewarded with a food pellet for touching the redobject regardless of its shape. Once the monkey has had 10 consecutive correct trials, a concept drifthappens where the target concept is changed to Triangle without warning. The concept later shifts to Blueand then Star in a similar way.

Monkeys (Macaca mulatta) Computers

0 200 400 600 800 1000 12000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 1 level 2 level 3 level 4

trials

WCST81010

0 200 400 600 800 1000 12000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 1 level 2 level 3 level 4level 3 level 4

trials

WCST80088

0 100 200 300 400 500 600 700 800 9000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 1 level 2level 1level 2 level 3 level 4

trials

WCST80160

0 500 1000 1500 2000 25000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 1level 2level 1 level 2level 3 level 4

trials

WCST81092

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 2 level 3level 2level 3 level 4

trials

WCST82057

0 100 200 300 400 500 600 700 8000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 2level 1level 2 level 3 level 4

trials

WCST84076

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

level 1 level 2 level 3 level 4

trials

WCST85014

accuracy (30-trial) reaction time (10 seconds) session

1. Monkeys adapt to concept driftsslowly – on average around 300 trials.

2. Perseverative error dominates at 75%:A perseverative error is a mistake which

would have been correct under the

immediately preceding concept.

3. The monkeys do not slow down after a

concept drift: Do they realize the change?

trials errors persv

Red 425 242 -

Triangle 249 113 89

Blue 437 247 186

Star 279 132 94

Online disjunction learning algorithm• Each object x has d=6 Boolean features (R,G,B,C,S,T).• Adversary presents d/2 objects, learner picks one, adversary says yes/no• Adversary can change the concept, i.e., concept drift.• Want the number of mistakes not too larger than the number of concept drifts.

Theorem: For any input sequence withm concept drifts, the algorithm makes atmost (2m + 1)(d − 1) mistakes.

Specifically, the bound is 35 (m=3, d=6).Very different from monkey behaviors.

errors persv

Red 4 -

Triangle 4 3

Blue 3 2

Star 2 1

An algorithm closer to monkey behaviors• “slow”: skip learning (steps 3, 4) with probability a.• “stubborn”: when h=0 in step 4, retain the old incorrect h with probability b.When a=0.93, b=0.96, this algorithm makes on average 563 errors, with 312 perservative errors (67%) out of 469 in T,B,S.