Chapter 6: Operant Learning (cankaya.edu.tr)


11/12/15


Chapter 6 – Operant Conditioning (Introduction)

Operant Conditioning

response → reinforcer
R → SR

Behavior → Consequences
B → C

1

• Do we always learn by associating neutral stimuli with other stimuli in the environment?
  ◦ Importance of controlling learning, particularly complex, voluntary, goal-directed behavior.
• Also called instrumental conditioning because the response is instrumental in producing the consequence.
• Examples...

2

Examples

The dog goes through the circle. (Behavior) → FOOD (Consequence)

The frequency of the behavior increases!

3

Examples

Using the vending machine (Behavior) → Drinking coffee (Consequence)

The behavior increases!

4


OC - Thorndike

http://www.youtube.com/watch?v=BDujDOLre-8

5

Learning Curve for Cats in Box

6

Thorndike’s Laws

• Also called S-R learning.
• Law of effect – A chance act becomes a learned behavior when a connection is formed between a stimulus (S) and a response (R) that is rewarded.
  ◦ If a response is followed by a satisfying state of affairs, the strength of the connection is increased. If a response is followed by an annoying state of affairs, the strength of the connection is decreased.

7

Thorndike’s Laws

• Law of exercise – the S-R connection is strengthened by use and weakened with disuse.
• Law of readiness – motivation is needed to develop an association or display changed behavior.
  1. When someone is ready to perform some act, to do so is satisfying.
  2. When someone is ready to perform some act, not to do so is annoying.
  3. When someone is not ready to perform some act and is forced to do so, it is annoying.

8


Thorndike’s Laws

• Learning Is Incremental, Not Insightful.
• In other words, learning occurs in very small systematic steps rather than in huge jumps.

9

Thorndike’s Laws 10

Thorndike’s Laws

• Thorndike’s cats learned to solve the puzzle box problem (gradually/suddenly) ________.
• According to Thorndike, behaviors that worked were st____ i__, while behaviors that did not work were st___ o__.

11

Burrhus Frederic Skinner

• Operant Conditioning:
• Learning that relies on associating behavior with its results or consequences.
• Defined as “operant” – the animal is operating on the environment – not passive as in CC.
• Highlights the importance of reinforcement & punishment in learning.

12


Burrhus Frederic Skinner

• Type S and Type R Conditioning
  ◦ Type S: (Also called respondent conditioning) is equivalent to classical conditioning. Stimulus is known.
  ◦ Type R: (Also called operant conditioning). Behavior is controlled by its consequences.

13

Burrhus Frederic Skinner

• Skinner on Reinforcement
  1. Any response that is followed by a reinforcing stimulus tends to be repeated.
  2. A reinforcing stimulus is anything that increases the rate with which an operant response occurs.

14

Burrhus Frederic Skinner

• Skinner on Reinforcement
  ◦ The emphasis is on behavior and its consequences.
  ◦ This process exemplifies contingent reinforcement, because getting the reinforcer is contingent (dependent) on the organism emitting a certain response.

15

Skinner

• To study this type of learning, Skinner needed to design a controlled environment.
  ◦ Skinner Box

http://www.youtube.com/watch?v=wLLIoNsXgC0

16


Skinner 17

• The Cumulative Recording
  ◦ Time is recorded on the x-axis and total number of responses is recorded on the y-axis.
  ◦ The cumulative recording never goes down.
  ◦ The rate with which the line ascends indicates the rate of responding.
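The properties of the cumulative record listed above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the response counts and the 10-second interval are hypothetical, and the function names are not from the lecture. The code builds a running total of responses (which cannot decrease, because counts are never negative) and reads the response rate off the slope of each segment.

```python
from itertools import accumulate


def cumulative_record(responses_per_interval):
    """Running total of responses, one value per time interval (the y-axis)."""
    return list(accumulate(responses_per_interval))


def response_rates(record, interval_seconds):
    """Slope of each segment of the record, in responses per second."""
    previous = [0] + record[:-1]
    return [(now - before) / interval_seconds
            for now, before in zip(record, previous)]


# Hypothetical lever-press counts in five successive 10-second intervals.
presses = [0, 2, 5, 5, 0]
record = cumulative_record(presses)
rates = response_rates(record, interval_seconds=10)

for interval, (total, rate) in enumerate(zip(record, rates), start=1):
    print(f"interval {interval}: total={total}, rate={rate} responses/s")
```

A flat segment (rate 0) means the animal stopped responding; a steeper segment means faster responding.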

18

Cumulative Recording 19

• In the original version of the Skinner box, rats earn food by p______ a _________; in a later version, pigeons earn a few seconds of access to food by p_________ at an illuminated plastic disc known as a _______ ___________.

20


Differences between OC and CC 21

Chapter 6 – Operant Conditioning (continued)

Operant Conditioning

22

Operant Consequences: Reinforcers and Punishers

• A stimulus is a reinforcer if (1) it follows a behavior, and (2) the future probability of that behavior increases. Conversely, a stimulus is a punisher if (1) it follows a behavior, and (2) the future probability of that behavior decreases.

23

Operant Consequences: Reinforcers and Punishers

• The terms reinforcement and punishment usually refer to the process or procedure by which a certain consequence changes the strength of a behavior.
• Thus, the use of food to increase the strength of lever pressing is an example of reinforcement, while the food itself is a reinforcer.
• We can use the terms reward and reinforcer interchangeably.

24


Operant Consequences: Reinforcers and Punishers

• More specifically, a reinforcer is a consequence that (precedes/follows) ___ a behavior and (increases/decreases) ___ the probability of that behavior. A punisher is a consequence that (precedes/follows) ___ a behavior and (increases/decreases) ___ the probability of that behavior.
• Strengthening a roommate’s tendency toward cleanliness by thanking her when she cleans the bathroom is an example of ______, while the thanks itself is a _____.
• When labeling an operant conditioning procedure, punishing consequences (punishers) are given the symbol ___ (which stands for ___ ___), while reinforcing consequences (reinforcers) are given the symbol ____ (which stands for ___ _____). The operant response is given the symbol _____.

25

Discriminative Stimuli

• When a behavior is consistently reinforced or punished in the presence of certain stimuli, those stimuli will begin to influence the occurrence of the behavior.
• If lever pressing produces food only when a tone is sounding, the rat soon learns to press the lever only when it hears the tone.
• In the presence of the tone, if the rat presses the lever, it will receive food.
• In the presence of a discriminative stimulus, responses are reinforced; in its absence, they are not reinforced.

26

Discriminative Stimuli

• A discriminative stimulus is a signal that indicates that a response will be followed by a reinforcer.
• Discriminative stimuli do not elicit behavior in the manner of a CS or US in classical conditioning. For example, the tone does not automatically elicit a lever press; it merely increases the probability that a lever press will occur.

27

Operant Conditioning

Discriminative Stimulus → Behavior → Consequence

Discriminative Stimuli

• Three-term contingency.
• The three-term contingency can be viewed as consisting of an antecedent event (an antecedent event is a preceding event), a behavior, and a consequence (which can be remembered by the initials ABC).

28

you notice something (tone), do something (press a lever), and get something (food).
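The tone, lever, and food example can be written down as a small data structure. The sketch below is an illustration only: the class name, field names, and the `outcome` method are hypothetical and not from the lecture. It assumes the simplest case described on the slides, where the response is reinforced only while the discriminative stimulus is present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThreeTermContingency:
    antecedent: str   # A: the discriminative stimulus
    behavior: str     # B: the operant response
    consequence: str  # C: the reinforcer

    def outcome(self, stimulus_present: bool, response: str) -> Optional[str]:
        """Return the consequence only if the response occurs while the
        discriminative stimulus is present; otherwise nothing is delivered."""
        if stimulus_present and response == self.behavior:
            return self.consequence
        return None


rat_in_box = ThreeTermContingency("tone", "press lever", "food")

print(rat_in_box.outcome(stimulus_present=True, response="press lever"))
print(rat_in_box.outcome(stimulus_present=False, response="press lever"))
```

Note that the stimulus does not produce the response here; it only determines whether the response pays off, which matches the point that a discriminative stimulus does not elicit behavior the way a CS does.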


Discriminative Stimuli

discriminative stimulus → response → reinforcer
SD → R → SR

Antecedent → Behavior → Consequences
A → B → C

• The discriminative stimulus can signal not only reinforcement but also punishment.
• Q: A discriminative stimulus (does/does not) _________ elicit behavior in the same manner as a CS.

29

Four Types of Contingencies 30

Four Types of Contingencies 31

Learn the terminology:
“Reinforcement” always means strengthening behavior.
“Punishment” always means decreasing behavior.
“Positive” always means adding a stimulus.
“Negative” always means removing a stimulus.


Four Types of Contingencies 38

Procedure                                      After behavior occurs:      Result:
Positive Reinforcement                         Present pleasant stimulus   Behavior increases
Negative Reinforcement (escape or avoidance)   Remove aversive stimulus    Behavior increases
Positive Punishment (or just “punishment”)     Present aversive stimulus   Behavior decreases
Negative Punishment (omission)                 Remove pleasant stimulus    Behavior decreases
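The four contingencies reduce to two yes/no questions: was a stimulus added or removed after the behavior, and did the behavior then increase or decrease? The sketch below encodes that lookup. The function name and the string labels ("added", "removed", "increases", "decreases") are hypothetical choices made for this illustration, not terminology from the lecture.

```python
def classify_contingency(stimulus_change: str, behavior_change: str) -> str:
    """Name the contingency from what happened after the behavior.

    stimulus_change: "added" or "removed"
    behavior_change: "increases" or "decreases"
    """
    sign = {"added": "Positive", "removed": "Negative"}
    kind = {"increases": "Reinforcement", "decreases": "Punishment"}
    return f"{sign[stimulus_change]} {kind[behavior_change]}"


# Food is presented after a lever press, and lever pressing becomes more frequent.
print(classify_contingency("added", "increases"))

# A shock is presented after a lever press, and lever pressing stops.
print(classify_contingency("added", "decreases"))
```

The lookup never asks whether the stimulus seems pleasant or aversive; the label follows from the observed change in behavior, in line with the definitions of reinforcer and punisher given earlier.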

Positive reinforcement 39

Negative reinforcement

• Negative means only that the behavior has resulted in something being removed or subtracted.
• This is an example of reinforcement because the behavior increases in strength; it is negative reinforcement because the consequence consists of taking something away.

40


Negative reinforcement

• Karen cries while saying to her boyfriend, “John, I don’t feel as though you love me.” John gives Karen a big hug saying, “That’s not true, dear, I love you very much.” If John’s hug is a reinforcer, Karen is (more/less) ___________ likely to cry the next time she feels insecure about her relationship. More specifically, this is an example of _________ reinforcement of Karen’s crying behavior.

41

Positive punishment

• Positive means only that the behavior has resulted in something being presented or added.
• For example, if a rat received a shock when it pressed a lever, it would stop pressing the lever.

42

Negative punishment 43

It is punishment because the behavior decreases in strength, and it is negative punishment because the consequence consists of the removal of something.

Negative punishment 44

Jonathan’s behavior of talking to other women at parties has been negatively punished. What about the girlfriend’s behavior?

Negative punishment

Negative reinforcement


Disadvantages of Using Punishment 45

• Emotional effects
• Suppression of other behaviors
• Need for continual monitoring
• Attempts to escape situation
• Aggression against punisher or others

Examples…

• When Sasha was teasing the dog, it bit her. As a result, she no longer teases the dog. The consequence for Sasha’s behavior of teasing the dog was the (presentation/removal) ___________ of a stimulus, and the teasing behavior subsequently (increased/decreased) _______ in frequency; therefore, this is an example of _____ _____.
• When Alex held the car door open for Stephanie, she made a big fuss over what a gentleman he was becoming. Alex no longer holds the car door open for her. The consequence for holding open the door was the ______________ of a stimulus, and the behavior of holding open the door subsequently ______________ in frequency; therefore, this is an example of ______ __________.

46

Examples…

• When Tenzing shared his toys with his brother, his mother stopped criticizing him. Tenzing now shares his toys with his brother quite often. The consequence for sharing the toys was the ________ of a stimulus, and the behavior of sharing the toys subsequently _______ in frequency; therefore, this is an example of ______ _______.

47

Primary and Secondary Reinforcers

• A primary reinforcer (also called an unconditioned reinforcer) is an event that is innately reinforcing.
• E.g. food, water, proper temperature (neither too hot nor too cold), and sexual contact.
• Associated with basic physiological needs.
• Satisfies some biological need and works naturally, regardless of a person’s prior experience.
• Secondary Reinforcer
  ◦ A stimulus that becomes reinforcing because of its association with a primary reinforcer.
  ◦ E.g. money, status, ‘good morning’, ‘well done’, ‘bravo’.

48


Shaping: Reinforcing What Doesn’t Come Naturally

• The process of teaching a complex behavior by rewarding closer and closer approximations of the desired behavior.
• E.g. teaching a child how to wear glasses.
  ◦ Putting the glasses on a table.
  ◦ Touching the glasses is followed by a reward.
  ◦ Playing with the glasses is followed by a reward.
  ◦ Wearing the glasses is followed by a reward.
  ◦ Continuing to wear the glasses is followed by a reward.
  ◦ http://www.youtube.com/watch?v=_7kIV6zvAQY
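The glasses example can be sketched as a loop over successive approximations. This is a deliberately simplified illustration, not a description of how shaping is carried out in practice. It assumes (hypothetically) that a single rewarded occurrence is enough to move on to the next approximation and that earlier approximations are no longer rewarded once the criterion has advanced; the slides state neither rule. The function and variable names are likewise made up for this sketch.

```python
# Successive approximations, ordered from easiest to the target behavior.
STEPS = [
    "touching the glasses",
    "playing with the glasses",
    "wearing the glasses",
    "continuing to wear the glasses",
]


def shape(observed_behaviors, steps=STEPS):
    """Return (behavior, rewarded) pairs for a sequence of observed behaviors.

    Only the behavior matching the current criterion is rewarded; each
    reward moves the criterion one step closer to the target behavior.
    """
    criterion = 0
    log = []
    for behavior in observed_behaviors:
        rewarded = criterion < len(steps) and behavior == steps[criterion]
        log.append((behavior, rewarded))
        if rewarded:
            criterion += 1  # assumed rule: one reward advances the criterion
    return log


session = shape([
    "touching the glasses",
    "touching the glasses",
    "playing with the glasses",
    "wearing the glasses",
])
for behavior, rewarded in session:
    print(f"{behavior}: {'reward' if rewarded else 'no reward'}")
```

In this run the second touch goes unrewarded because the criterion has already moved on to playing with the glasses.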

49