Educational Psychology

Partial Reinforcement

Posted by Mike Robinson

Last Updated on December 19, 2022 by Mike Robinson

If you want to influence operant learning, you will need to know how patterns of reinforcement affect behavior.

Imagine, for example, that a mother wants to reward her child for turning off the lights when he leaves the room. Contrary to what you might expect, it is better for her to reinforce only some of her son’s correct responses. Why? You’ll find the answer in the following discussion.

Until now, we have treated operant reinforcement as if it were continuous. Continuous reinforcement means that a reinforcer follows every correct response. During initial learning, continuous reinforcement helps in acquiring new responses. To teach your dog to come to you, it is best to reinforce its behavior every time it comes when called.

Curiously, once your dog has learned to come when called, it is best to shift to partial reinforcement, in which reinforcers do not follow every response. Responses acquired by partial reinforcement are highly resistant to extinction, a phenomenon known as the partial reinforcement effect (Domjan, 2010; Svartdal, 2003).

Definition of Partial Reinforcement

Unlike continuous reinforcement, where a reinforcer follows every correct response, partial reinforcement is a pattern in which you only reinforce a portion of all responses.

Partial Reinforcement Effect

Partial reinforcement is used for slot machines.

How does getting reinforced part of the time make a habit stronger? If you have ever visited a casino, you have probably seen row after row of people playing slot machines. To gain insight into the distinction between continuous and partial reinforcement, imagine putting a dollar in a slot machine and pulling the handle. $10 spills into the tray. Let’s say this continues for several minutes. A payoff follows every pull. Because you are being reinforced on a continuous schedule, you quickly get hooked.

But suddenly, each pull is followed by nothing. You might respond a few more times before giving up. Because you were reinforced continuously, the message quickly becomes clear: no more payoffs.

Contrast this with partial reinforcement. This time, imagine that you are just about to quit but decide to play once more. Bingo! The machine returns $20. After this, payoffs continue on a partial schedule. Some are large, and some are small. All are unpredictable. Sometimes you hit two in a row, and sometimes 20 or 30 pulls go unrewarded.

Now let’s say the payoff mechanism is off again. How many times would you respond before your handle-pulling behavior is extinguished? Because you have developed the expectation that any play might be “the one,” it will be hard to resist just one more play. Also, because partial reinforcement includes long periods of non-reward, it will be harder to distinguish between periods of reinforcement and extinction. It is no exaggeration to say that the partial reinforcement effect has left many people penniless.

Even psychologists who visit a casino may get cleaned out. To return to our previous example, after using continuous reinforcement to teach a child to turn off the lights or a dog to come when called, it is best to shift to partial reinforcement. That way, the new behavior will become more resistant to extinction.
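
To make this contrast concrete, here is a hypothetical toy simulation in Python. The model is an illustrative assumption (a random-ratio stand-in for a VR-10 slot machine), not data from the studies cited above; it shows why a dry spell is obvious after continuous reinforcement but blends into normal play after partial reinforcement.

```python
import random

def longest_dry_run(rewarded):
    """Return the longest run of consecutive unrewarded responses."""
    longest = current = 0
    for r in rewarded:
        current = 0 if r else current + 1
        longest = max(longest, current)
    return longest

rng = random.Random(0)
crf = [True] * 500                                # continuous: every pull pays
vr10 = [rng.random() < 0.1 for _ in range(500)]   # pays about 1 pull in 10

print(longest_dry_run(crf))   # 0 -> a single dry pull already signals extinction
print(longest_dry_run(vr10))  # often 30+ -> long dry spells look like normal play
```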

Schedules of Partial Reinforcement

You can give partial reinforcement in several patterns, or partial schedules of reinforcement. Let’s consider the four most basic schedules, which have some interesting effects on behavior.


Fixed Ratio (FR)

What would happen if a reinforcer followed only every other response? Or what if we reinforced every third, fourth, fifth, or some other number of responses?

Each of these patterns is a fixed ratio (FR) schedule: a set number of correct responses must occur to obtain a reinforcer. Notice that an FR schedule has a fixed ratio of responses to reinforcers: FR-2 means the subject gets a reward for every other response, FR-3 means that reinforcement occurs every third response, and FR-10 means that ten responses must occur to obtain a reinforcer.
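
To make the bookkeeping concrete, here is a minimal Python sketch of an FR schedule. The class and names are illustrative assumptions, not part of any standard psychology toolkit:

```python
class FixedRatioSchedule:
    """Reinforce every `ratio`-th correct response (e.g., ratio=10 for FR-10)."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0  # correct responses since the last reinforcer

    def respond(self):
        """Record one correct response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False

# FR-3: every third response is reinforced.
fr3 = FixedRatioSchedule(3)
print([fr3.respond() for _ in range(9)])
# [False, False, True, False, False, True, False, False, True]
```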

Fixed ratio schedules produce very high response rates. A hungry rat on an FR-10 schedule will quickly run off ten responses, pause to eat, and then run off ten more. A similar situation occurs when you pay factory or farm workers on a piece-rate basis. Production output is high when workers must produce a fixed number of items for a set amount of pay.


Variable Ratio (VR)

In a variable ratio (VR) schedule, a varied number of correct responses must occur to get a reinforcer. Instead of reinforcing every fourth response (FR-4), for example, a person or animal on a VR-4 schedule gets rewarded on average for every fourth response. Sometimes two responses must be made to obtain a reinforcer; sometimes five, sometimes four, and so on. The actual number varies but averages out to four in this example. Variable ratio schedules also produce high response rates.

VR schedules are less predictable than FR schedules. Does that have any effect on extinction? Yes. Because reinforcement is less predictable, VR schedules tend to produce greater resistance to extinction than fixed ratio schedules. Playing a slot machine is an example of behavior maintained by a variable ratio schedule. Another would be our plan to reward a child only occasionally for turning off the lights once he has learned to do so.
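
A VR schedule can be sketched the same way, with the requirement redrawn at random after each reinforcer so that it averages the nominal ratio. The uniform draw around the mean is an assumption made purely for illustration:

```python
import random

class VariableRatioSchedule:
    """Reinforce after a varying number of responses that averages `mean_ratio`."""

    def __init__(self, mean_ratio, spread=2, seed=None):
        self.rng = random.Random(seed)
        self.mean_ratio = mean_ratio
        self.spread = spread
        self.count = 0
        self._draw_requirement()

    def _draw_requirement(self):
        # Uniform draw centered on the mean, e.g., 2..6 responses for VR-4.
        self.requirement = self.rng.randint(max(1, self.mean_ratio - self.spread),
                                            self.mean_ratio + self.spread)

    def respond(self):
        self.count += 1
        if self.count >= self.requirement:
            self.count = 0
            self._draw_requirement()
            return True
        return False

# VR-4: rewarded on average every fourth response.
vr4 = VariableRatioSchedule(4, seed=1)
rewards = [vr4.respond() for _ in range(1000)]
print(1000 / sum(rewards))  # close to 4 responses per reinforcer
```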


Fixed Interval (FI)


In a fixed interval (FI) schedule, subjects get reinforcement only for the first correct response made after a fixed amount of time has passed. The time interval is measured from the last reinforced response. On an FI schedule, participants seem to develop a keen sense of the passage of time. Few responses occur just after reinforcement is delivered, and a spurt of activity occurs just before the next reinforcement is due. Is getting paid weekly at work an example of an FI schedule? Pure examples of fixed interval schedules are rare, but getting paid each week does come close. Notice, however, that most people do not work faster just before payday, as an FI schedule would predict.
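
Here is a minimal FI sketch using a simulated clock rather than real time; the names and numbers are illustrative:

```python
class FixedIntervalSchedule:
    """Reinforce the first correct response made after `interval` seconds elapse."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforced = 0.0  # time of the last reinforced response

    def respond(self, now):
        """A response at time `now` pays off only once the interval has elapsed."""
        if now - self.last_reinforced >= self.interval:
            self.last_reinforced = now
            return True
        return False

# FI-30: responses at 10 s and 20 s go unrewarded; the one at 35 s is
# reinforced, and the clock then restarts from 35 s.
fi30 = FixedIntervalSchedule(30)
print([fi30.respond(t) for t in (10, 20, 35, 40, 70)])
# [False, False, True, False, True]
```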


Partial Reinforcement Examples

When researchers study partial reinforcement in the laboratory, they typically use animals, such as rats and mice, and draw conclusions about human behavior from the results. Thousands of studies support these parallels between animal and human behavior, and partial reinforcement studies are no exception. Here are a few examples of partial reinforcement in human activities.


Examples of variable ratio reinforcement

Sports are one area where variable ratio reinforcement is commonplace.

Variable ratio reinforcement is present in golf, tennis, baseball, and many other sports. Even the best batters in baseball rarely average more than three hits for every ten times at bat.

Example of fixed interval (FI)

A close example of a fixed interval schedule is completing projects for work or school. In school, for instance, you might have a report due every two weeks for a class. Right after turning in a paper, your work would probably drop to zero for a week or more. Then you would ramp up and complete the task in the few days before it is due.


Variable Interval (VI)

Variable interval (VI) schedules are a variation on fixed intervals. Here, the subject gets reinforcement for the first correct response made after a varied amount of time. On a VI-30-second schedule, reinforcement is available after an interval that averages 30 seconds. VI schedules produce slow, steady response rates and tremendous resistance to extinction. When you dial a phone number and get a busy signal, redialing is rewarded on a VI schedule: you may have to wait 30 seconds or 30 minutes for the call to go through. If you are like most people, you will doggedly dial over and over again until you get a connection. Success in fishing is also on a VI schedule, which may explain the bulldog tenacity of many anglers.
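
Finally, a VI schedule can be sketched by redrawing the interval at random after each reinforced response; as with the VR sketch above, the uniform draw around the mean is an illustrative assumption:

```python
import random

class VariableIntervalSchedule:
    """Reinforce the first response after a varying interval averaging `mean_interval`."""

    def __init__(self, mean_interval, spread=10, seed=None):
        self.rng = random.Random(seed)
        self.mean_interval = mean_interval
        self.spread = spread
        self.last_reinforced = 0.0
        self._draw_interval()

    def _draw_interval(self):
        # Uniform draw centered on the mean, e.g., 20..40 s for VI-30 s.
        self.interval = self.rng.uniform(self.mean_interval - self.spread,
                                         self.mean_interval + self.spread)

    def respond(self, now):
        if now - self.last_reinforced >= self.interval:
            self.last_reinforced = now
            self._draw_interval()
            return True
        return False

# VI-30 s: like redialing a busy number every 5 seconds; only the first
# response after each unpredictable interval pays off.
vi30 = VariableIntervalSchedule(30, seed=2)
hits = [t for t in range(0, 300, 5) if vi30.respond(t)]
print(hits)  # irregular times, averaging about one reinforcer per 30 s
```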
