Psychology BHP315111

Upload: ryu

Post on 23-Feb-2016


Page 1: Psychology

Psychology BHP315111

Page 2: Psychology

MODULE 2 – Operant Conditioning

Page 3: Psychology

Operant Conditioning

• While Classical Conditioning is important and useful, it can’t explain every learned behaviour. It can’t explain voluntary behaviour; behaviour that we can control.

• Behaviours can occur without a stimulus – reading a book, playing tennis, robbing a bank.

• Much of our learning occurs by trial and error – if we do something and people like it, we usually do it again (e.g. an outfit earns a compliment, so we wear it again).

Page 4: Psychology

Operant Conditioning

• We all make adjustments to our behaviour according to the outcomes or consequences it produces. Operant Conditioning is the learning that takes place as a result of these consequences.

• Trial and error learning is a part of operant conditioning and works as its name suggests: we attempt to learn, or to solve a problem, by trying alternative possibilities until a correct one is achieved. Once learned, the behaviour will usually be performed quickly and with fewer errors.

Page 5: Psychology

Trial and Error learning

• Trial and error learning is also known as instrumental learning, as the individual is instrumental in learning the correct response. More recently, both instrumental learning and trial and error learning have been referred to as operant conditioning.

• The individual operates on the environment to solve a problem (example: an iPhone).

Page 6: Psychology

Operant conditioning

• Operant conditioning involves:

– Motivation (a desire to attain a goal)

– Exploration (an increase in activity)

– Incorrect and correct responses

– Reward (the correct response is made and rewarded)

• Receiving a reward of some kind leads to the repeated performance of the correct response, strengthening the association between the behaviour and the outcome.

Page 7: Psychology

Thorndike

• Edward Thorndike (1874 – 1949) carried out the first studies of operant conditioning. He studied animal intelligence at the same time Pavlov studied his dogs.

• Thorndike put a hungry cat in a ‘puzzle box’ and placed a piece of fish outside the box where it could be seen/smelt, but it was just outside of reach. The cat had to learn to escape from the box by operating a latch to release a door on the side of the box. It had to push down on a paddle inside the box. Thorndike measured the time it took the cat to escape.

Page 8: Psychology

Thorndike’s cats

• Firstly, the cat tried numerous ineffective strategies (trial and error). It tried to squeeze through the bars or stretch its paws to reach the food. It clawed and bit the bars for ten minutes.

• Eventually it accidentally pushed the lever and the door opened. The cat was rewarded with both its release and the fish treat.

• When the cat was put back in the box it went through another series of incorrect responses before eventually pushing the lever again, and again being rewarded with the food.

Page 9: Psychology

Thorndike’s cats

• The cat became progressively quicker at escaping from the box and after about seven trials, it would go directly to the lever, push it, and get out immediately.

• Pushing the lever was no longer a random pattern of behaviour – it was a deliberate response that the cat had learnt due to the consequences of making the response. When the correct response was followed by a reward (escape and food), the cat demonstrated this behaviour with increasing frequency.

• This was the first experiment that led Thorndike to call this process ‘trial and error learning’.

Page 10: Psychology

Law of effect

This experiment led Thorndike to develop the law of effect. This states that a behaviour that is followed by ‘satisfying’ consequences is more likely to occur again, and a behaviour that is followed by ‘annoying’ consequences is less likely to occur again.
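One way to picture the law of effect is as a running "response strength" that moves up after a satisfying consequence and down after an annoying one. The sketch below is only an illustration: the function names, the 0-to-1 strength scale and the step size of 0.2 are assumptions made for this example, not something taken from Thorndike's work.

```python
def update_strength(strength, consequence, step=0.2):
    """Return the new response strength (0.0 to 1.0) after one trial.

    Hypothetical rule: a 'satisfying' consequence moves the strength
    part of the way towards 1.0, an 'annoying' one part of the way
    towards 0.0.
    """
    if consequence == "satisfying":
        return strength + step * (1.0 - strength)
    if consequence == "annoying":
        return strength - step * strength
    raise ValueError("consequence must be 'satisfying' or 'annoying'")


def run_trials(start, consequences, step=0.2):
    """Apply update_strength once per consequence and keep the history."""
    history = [start]
    for consequence in consequences:
        history.append(update_strength(history[-1], consequence, step))
    return history


# Seven rewarded trials, echoing the roughly seven trials the cat needed.
rewarded = run_trials(0.1, ["satisfying"] * 7)
# Seven trials that are each followed by an annoying consequence.
punished = run_trials(0.9, ["annoying"] * 7)

print([round(s, 2) for s in rewarded])
print([round(s, 2) for s in punished])
```

The rewarded response climbs on every trial and the punished one falls, which is all the law of effect claims; the exact numbers carry no meaning.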

http://www.youtube.com/watch?v=Vk6H7Ukp6To

Learning Activity 1 – pg 465 (Grivas)

Page 11: Psychology

Operant Conditioning

• Thorndike definitely started this line of thinking, but it didn’t become ‘operant conditioning’ until Burrhus Skinner came along. He referred to the responses observed in trial and error learning as operants. An operant is a response (or set of responses) that occurs and acts on the environment to produce some kind of effect. An operant is therefore a response or behaviour that generates consequences.

Page 12: Psychology

Operant conditioning

• This type of conditioning is therefore based on the principle that an organism will tend to repeat behaviours (operants) that have desirable consequences (treat), or that will enable it to avoid undesirable consequences (detention).

• Organisms will tend not to repeat behaviours that have undesirable consequences.

Page 13: Psychology

Skinner

• Burrhus Frederic Skinner (B.F. Skinner) was inspired by Thorndike. In the 1930s he began his own experiments and coined the term operant conditioning. He did this to show that organisms learn to operate on the environment to produce desired consequences.

• The behaviour of Thorndike’s cat had an effect on its environment – it opened the door. The fact that the consequences of the cat’s action were positive increased the likelihood of the response happening again.

Page 14: Psychology

Respondent conditioning

• A student might behave cooperatively if this behaviour operates on the environment to produce a desired consequence (early dismissal).

– Being cooperative = operant response, conditioned by the early dismissal.

• Skinner coined the term respondent conditioning for what we know as classical conditioning, because the organism’s behaviour doesn’t have any environmental consequences; the food simply comes!

Page 15: Psychology

Skinner’s beliefs…

• Skinner believed that all behaviour could be explained by the relationships between the behaviour, its antecedents (events that come before), and its consequences. Any behaviour that is followed by a consequence will change in strength and frequency, depending on the nature of that consequence.

– Strength: become more, or less, established

– Frequency: Occur more, or less, often

– Consequence: Reward or punishment

Page 16: Psychology

A Skinner Box

• Skinner created an apparatus called a Skinner box. In this box, animals learn to make a particular response for which the consequences can be controlled by the researcher. The box has a lever that delivers food, along with lights and buzzers, and some boxes have floors that give an electric shock.

• This was connected to a recorder to indicate how often each response is made (frequency) and the rate of the response (speed).

Page 17: Psychology

A Skinner Box

• He mostly used rats, although later on he moved to pigeons. Rats were conditioned to press the lever; pigeons were conditioned to peck at a disk.

• The rat would scurry around randomly and accidentally press the lever. After many repetitions, the rat’s behaviour became less random and eventually it pressed the lever consistently. The rat was rewarded for the correct response.

• Skinner referred to different types of rewards as reinforcers.

• http://www.youtube.com/watch?v=jDLNHwquiAc

Page 18: Psychology

Reinforcement

• When you are training your dog to shake hands and you give it a biscuit, a pat on the head or say ‘good dog’ when it behaves the way you want, you are using reinforcement.

• If you are using an umbrella to stop yourself from getting wet, that is another kind of reinforcement.

• Reinforcement may mean receiving a pleasant stimulus (biscuit) or escaping an unpleasant stimulus (avoiding getting wet).

Page 19: Psychology

Reinforcers

• A reinforcer is an object or event that increases the probability that an operant behaviour will occur again. ‘Reinforcer’ is often used interchangeably with ‘reward’.

• Reinforcers are only called ‘reinforcers’ if they actually reinforce behaviour. Eating chocolate is a pleasurable experience, but it’s only a reinforcer if it promotes or strengthens a particular response.

Page 20: Psychology

Schedules of reinforcement

• Reinforcement can happen after every correct response (continuous schedule), or only happen on some occasions where there is a correct response (partial reinforcement schedule).

• In the early stages of conditioning, learning is usually most rapid if the correct response is reinforced every time it occurs – continuous reinforcement.

Page 21: Psychology

Schedules of reinforcement

• Once a correct response consistently occurs, a different reinforcement schedule can be used to maintain the response – reinforce only some correct responses – partial reinforcement.

• Skinner accidentally ran out of pellets during a Skinner box experiment, so he was forced to give pellets less often. He found that responses maintained by a partial reinforcement schedule are stronger, and less likely to weaken, than those maintained by continuous reinforcement.

Page 22: Psychology

Schedules of reinforcement

• Schedule of reinforcement refers to the frequency (how many times) and the manner in which a desired response is reinforced.

– Reinforcement after a certain number of correct responses (ratio), or reinforcement after a certain amount of time has elapsed (interval)

– Reinforcement on a regular basis – every 6th time, every 30 seconds (fixed), or reinforcement at an unpredictable rate (variable)

• Read pg 471-472 Grivas

Page 23: Psychology

Schedules of reinforcement

• Fixed-ratio schedule – 1:10 – a reinforcer is given after every 10 correct responses in succession.

• Variable-ratio schedule – 1:10, then 1:5, then 1:12 – the reinforcer is given after a different number of responses each time, but the numbers average out to a set mean (ratio).

• Fixed-interval schedule – fixed period of time. The reinforcer is given for the first correct response after a period of, say, 20 seconds.

• Variable-interval schedule – irregular periods of time, but they average out to a set mean period (e.g. 30 seconds).
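The four schedules can be sketched as small decision rules that answer one question: "is this correct response reinforced?" The code below is a rough illustration only. The class names, the `respond(t)` method and the choice of uniform random draws for the variable schedules are assumptions made for this example; the slides only say that the variable schedules average out to a mean.

```python
import random


class FixedRatio:
    """Reinforce every n-th correct response (e.g. 1:10)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.target = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean_n - 1, so the mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first correct response after `seconds` have elapsed."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.last = 0.0  # time of the last reinforcer (session starts at 0)

    def respond(self, t):
        if t - self.last >= self.seconds:
            self.last = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a varying wait averaging mean_seconds."""

    def __init__(self, mean_seconds, rng):
        self.mean_seconds = mean_seconds
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean_seconds, so the mean is mean_seconds.
        return self.rng.uniform(0.0, 2.0 * self.mean_seconds)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


# One correct response per second for 30 seconds on a 1:10 fixed ratio.
fr = FixedRatio(10)
fr_hits = [t for t in range(1, 31) if fr.respond(float(t))]

# Responses at irregular times (in seconds) on a 20-second fixed interval.
fi = FixedInterval(20)
fi_hits = [t for t in (5, 19, 20, 25, 41) if fi.respond(float(t))]

# 1000 responses on a variable ratio with a mean of 10.
vr = VariableRatio(10, random.Random(0))
vr_count = sum(vr.respond(float(t)) for t in range(1, 1001))

print(fr_hits, fi_hits, vr_count)
```

Ratio schedules count responses and ignore the clock, while interval schedules watch the clock and ignore how many responses were made in between.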

Page 24: Psychology

Positive reinforcement

• Some examples:

– The food pellet in Skinner’s box (a hungry rat)

– An A in an exam (someone who studies conscientiously)

– A favourite book to read (a girl on a potty)

– A prize (competing at something)

• Providing a satisfying consequence (reward) increases the likelihood of a desired response.

• Doug Seus and Bart the Bear (ForTheGrizzly) http://www.youtube.com/watch?v=Af3G8aGk62U

Page 25: Psychology

Negative reinforcement

• Some examples:

– Umbrella (avoid wet clothes); Panadol (avoid headache)

– Lever (avoid mild shock); turn off TV (avoid scary movie)

• A negative reinforcer is any unpleasant or aversive stimulus that, when removed, strengthens the likelihood of a desired response. It is the removal of the stimulus that does the reinforcing. If there’s a chance of removing the unwanted stimulus, the organism is more likely to make the correct response.

Page 26: Psychology

Positive or negative reinforcement

• The important distinction is:

– Positive reinforcers are given

– Negative reinforcers are removed or avoided

• Both procedures lead to desirable outcomes and each procedure strengthens or reinforces the behaviour that is desired.

Page 27: Psychology

Primary and Secondary

• Primary reinforcers are things such as food, water or sex that are satisfying and require no learning on the part of the subject to become pleasurable.

– Make yourself study for 2 hours before rewarding yourself with chocolate (brain scans)

• A secondary reinforcer is any stimulus that has acquired its reinforcing power through experience – these are learned, by being paired with primary reinforcers or other secondary reinforcers.

– Coupons, money, grades, praise.

Page 28: Psychology

Punishment - Positive

• If you go faster than the speed limit, you get a fine. This is intended to reduce your speeding behaviour in the future. If you continue to speed, you are disqualified.

• This is an example of punishment of the unwanted behaviour with the intention of reducing or eliminating the behaviour.

• Punishment is the delivery of an unpleasant stimulus following a response (smack, fine, growl, slap, shock).

Page 29: Psychology

Response Cost – negative punishment

• Response cost is the removal of a pleasant stimulus following a response (e.g. no iPhone). This weakens or decreases the likelihood of an undesirable response recurring, by removing something pleasant.

• The difference?

– Positive punishment: introduction of an unpleasant stimulus following an undesirable response.

– Response cost: withdrawal of a pleasant stimulus following an undesirable response.
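The four kinds of consequence covered so far differ on just two questions: is a stimulus added or removed, and is that stimulus pleasant or unpleasant? The lookup below is a minimal sketch of that grid. The names `CONSEQUENCES` and `classify` are made up for this example, and the sample situations are paraphrased from the slides.

```python
# (what happens to the stimulus, kind of stimulus) -> (procedure, effect on behaviour)
CONSEQUENCES = {
    ("added", "pleasant"): ("positive reinforcement", "more likely"),
    ("removed", "unpleasant"): ("negative reinforcement", "more likely"),
    ("added", "unpleasant"): ("positive punishment", "less likely"),
    ("removed", "pleasant"): ("response cost", "less likely"),
}


def classify(change, stimulus):
    """Name the procedure and its effect on the preceding behaviour.

    change:   'added' or 'removed'
    stimulus: 'pleasant' or 'unpleasant'
    """
    return CONSEQUENCES[(change, stimulus)]


examples = [
    ("food pellet after a lever press", "added", "pleasant"),
    ("mild shock stops after a lever press", "removed", "unpleasant"),
    ("smack after misbehaving", "added", "unpleasant"),
    ("iPhone taken away after misbehaving", "removed", "pleasant"),
]

for description, change, stimulus in examples:
    name, effect = classify(change, stimulus)
    print(f"{description}: {name} (behaviour becomes {effect})")
```

In this grid, "positive" and "negative" only say whether something is added or taken away; whether the behaviour goes up or down depends on the reinforcement/punishment half of the name.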

Page 30: Psychology

Reinforcement

• Reinforcement is intended to increase the likelihood of a behaviour being repeated and punishment is intended to decrease the likelihood of a behaviour being repeated.

• In O.C., what happens after the desired response is performed is very important in terms of the strength of learning and the rate at which it occurs. The time between the response and the consequence, as well as the appropriateness of the consequences used, are important in determining the effectiveness of learning.

Page 31: Psychology

Order of presentation

• To use reinforcement and punishment effectively, you must always reinforce after the desired response occurs, NEVER before.

• If you reinforce someone’s use of the word ‘I’ in conversation with a smile, they are more likely to use it. If you smile before they say ‘I’, they are less likely to use it. Once you’ve reinforced this, if you remove your reinforcement (the smile) they will make ‘I’ statements less often.

Page 32: Psychology

Timing

• Reinforcement and punishment are most effective when given immediately after the response has occurred. This helps to ensure that the organism associates the response with the reinforcer.

• It also influences the strength of the response. If there is a considerable delay, learning will generally be very slow to progress and may not even occur at all!

• Some reinforcers are delayed – there is a promise of the reinforcer to come.

Page 33: Psychology

Appropriateness

• For any stimulus to be a reinforcer, it must actually provide a pleasing or satisfying consequence (reward). A free spot in a University course is not going to be a good reward for a student who wants to be a mechanic!

• Sometimes you can’t tell if it will be an appropriate reward until you have given it. It also can’t be assumed that what works in one situation will work in another.

Page 34: Psychology

Appropriateness

• You also need to make sure the punishment is appropriate. It must provide a consequence that is unpleasant and likely to decrease the undesirable behaviour.

• An inappropriate punishment can have the opposite effect – an attention-starved Grade 8 student may talk in class and be reprimanded. This may make him act up more, as he has got the attention he wanted.

Page 35: Psychology

Key processes

• The same key processes are in both classical and operant conditioning:

– Acquisition

– Extinction

– Stimulus generalisation

– Stimulus discrimination

– Spontaneous recovery

• They do differ slightly in how they occur, however.

Page 36: Psychology

Acquisition

• This is where the overall learning process is established, where the specific response is established. The difference from classical conditioning is that behaviours are often more complex in operant conditioning, as classical conditioning often only deals with reflexive, involuntary responses.

• The speed at which the response is established is dependent on which schedule of reinforcement is used.

Page 37: Psychology

Shaping

• Shaping is a procedure where reinforcement is given for any response that successively approximates and ultimately leads to the final desired response, or target behaviour.

• It’s also known as the method of successive approximations.

• Read p215 Plotnik

• http://www.youtube.com/watch?feature=player_embedded&v=teLoNYvOf90

Page 38: Psychology

Shaping

• Skinner also did this with the pigeon to get it to do a 360 degree turn. First he reinforced with a food pellet every time the pigeon moved slightly to the left. Once this response had been conditioned, Skinner would only reinforce when the pigeon turned a little further left, and so on until it made a full turn.

• By limiting reinforcement only to those responses that gradually edged towards the target behaviour, Skinner could condition the pigeon to complete circles regularly.

• With this method, you never reinforce earlier behaviours once the requirement has moved on.
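Shaping can be sketched as a criterion that rises each time it is met, so that a response which earned a pellet earlier stops earning one later. The code below is a simplified illustration: the function name `shape`, the 30-degree starting criterion, the 30-degree step and the sequence of turns are all made-up values, not figures from Skinner's pigeon work.

```python
def shape(turns, start_criterion=30, step=30, target=360):
    """Log which responses are reinforced under a rising criterion.

    turns: observed turn sizes in degrees, one per response.
    Returns a list of (angle, reinforced, criterion_in_force) tuples.
    """
    criterion = start_criterion
    log = []
    for angle in turns:
        reinforced = angle >= criterion
        log.append((angle, reinforced, criterion))
        if reinforced:
            # Demand a slightly bigger turn next time, up to a full circle.
            criterion = min(criterion + step, target)
    return log


# Made-up sequence of turns. The second 35-degree turn comes after the
# criterion has moved on, so it is not reinforced.
session = shape([10, 35, 35, 70, 90, 360])
for angle, reinforced, criterion in session:
    print(angle, criterion, "pellet" if reinforced else "no pellet")
```

The repeated 35-degree turn is the key case: it was good enough once, but it no longer meets the current requirement, so it goes unreinforced.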

Page 39: Psychology

Extinction

• Extinction is the gradual decrease in the strength or rate of a conditioned response following consistent non-reinforcement of the response.

• Extinction is said to have occurred when a conditioned response is no longer present.

• Basically, you stop reinforcing the behaviour: stop giving pellets when the rat presses the lever.

Page 40: Psychology

Extinction

• This could be similar to partial reinforcement though?

• Extinction is less likely to occur when partial reinforcement is used. The uncertainty of the reinforcement leads to a greater tendency for the response to continue.

• This is why gambling is a hard addiction to break; the gambler is highly motivated to win, knows that there’s a chance of a big reward, and has an expectation that the reward will occur sooner or later.

Page 41: Psychology

Spontaneous Recovery

• As in C.C., extinction is often not permanent in O.C.

• After you think some behaviour has become extinct, spontaneous recovery can occur and the organism will once again show the response without any reinforcement. The response is likely to be weaker though, and won’t last long.

• Generally, the longer the time since extinction, the stronger the recovered response.

Page 42: Psychology

Stimulus Generalisation

• This occurs when the correct response is made to another stimulus that is similar, but it usually occurs at a reduced level.

• Skinner’s pigeon pecked at different coloured lights, even though the original light it pecked at was green. This response, however, was less frequent than the original behaviour.

• We do this in everyday life, e.g. reacting to a car backfire as if it were a starter’s gun.

Page 43: Psychology

Stimulus Discrimination

• Skinner also taught his pigeon to only peck at the green light, not any other light. The light would change colours, and every time the green light came on, it would peck it and be reinforced, but ONLY for the green light.

• Sniffer dogs have to have stimulus discrimination. They use highly specialised O.C. with animals that already have a highly developed sense of smell (olfaction).

• When we change our behaviour for others, we are using stimulus discrimination – underwear, shyness, outgoing.

Page 44: Psychology

Learned helplessness

• Martin Seligman carried out a study on learned helplessness. He harnessed dogs so that they couldn’t escape electric shocks.

• At first they whimpered, howled and tried to escape the shocks, but eventually they gave up and lay on the floor without struggling, showing signs similar to human depression (psychological stress responses).

Page 45: Psychology

Learned helplessness

• The next day he placed the dogs in a shuttlebox where they could easily escape, but they made no effort to escape and failed to learn even when they occasionally did escape.

• The dogs had come to expect that they could not get away; they had learned to be helpless.

• Learned helplessness consists of the expectancy that one cannot escape aversive events and the motivational and learning deficits that result from this belief.

Page 46: Psychology

Two-factor Learning

• An ice-cream truck approaches with its bell ringing. A boy named Justin hears the bell and thinks about ice-cream. As he does, his mouth waters. Justin runs to the truck, buys an ice-cream and eats it.

• What kind of learning is this? – Both!

• In the real world, classical and operant conditioning are often intertwined. This is called two-factor learning.

Page 47: Psychology

Two-factor Learning

• Justin’s behaviour reflects both kinds of learning.

– Classical conditioning – Justin salivates each time he hears the bell

– Operant conditioning – Justin runs to the truck to buy and eat ice-cream (positive reinforcement)

• The difference is that Justin’s involuntary responses are altered by Classical conditioning, and Justin’s voluntary responses are shaped by Operant conditioning.

Page 49: Psychology

REVIEW!

• Trial and error learning
• Instrumental learning
• Operant Conditioning
• Thorndike & Law of effect
• Skinner’s box
• Schedules of reinforcement
• Positive reinforcement
• Negative reinforcement
• Punishment
• Order of presentation
• Timing
• Appropriateness
• Shaping
• Acquisition & Extinction
• Spontaneous recovery
• Stimulus generalisation
• Stimulus discrimination
• Learned helplessness
• Two-factor learning

Page 50: Psychology

References

• Westen, D., Burton, L., Kowalski, R. (2006) Psychology. Queensland, Australia: John Wiley & Sons Australia, Ltd.

• Cribb, B., Gridley, H., McKersie, C., Kennedy, G., Anin, N., Rice, J. (2004) Essential VCE Psychology. Cambridge, UK: Cambridge University Press.

• Plotnik, R. (2002) Introduction to Psychology. (6th ed.) CA, USA: Wadsworth Group.