How Reinforcement Schedules Work

Learning Process Used to Strengthen Specific Behaviors

Operant conditioning is a learning process in which new behaviors are acquired and modified through their association with consequences. Reinforcing a behavior increases the likelihood it will occur again in the future, while punishing a behavior decreases the likelihood that it will be repeated.

In operant conditioning, schedules of reinforcement are an important component of the learning process. When and how often we reinforce a behavior can have a dramatic impact on the strength and rate of the response.

Schedule of Reinforcement

A schedule of reinforcement is basically a rule stating which instances of a behavior will be reinforced. In some cases, a behavior might be reinforced every time it occurs. Sometimes, a behavior might not be reinforced at all.

Either positive reinforcement or negative reinforcement may be used as part of operant conditioning. In both cases, the goal of reinforcement is to strengthen a behavior so that it will likely occur again.

Reinforcement schedules take place in both naturally occurring learning situations and more structured training situations.

In real-world settings, behaviors are probably not going to be reinforced each and every time they occur. In situations where you are intentionally trying to reinforce a specific action (such as in school, sports, or in animal training), you would follow a specific reinforcement schedule.

Some schedules are better suited to certain types of training situations. In some cases, training might call for one schedule and then switch to another once the desired behavior has been taught.

The two foundational forms of reinforcement schedules are referred to as continuous reinforcement and partial reinforcement.

Continuous Reinforcement

In continuous reinforcement, the desired behavior is reinforced every single time it occurs. This schedule is best used during the initial stages of learning to create a strong association between the behavior and the reinforcer.

Imagine, for example, that you are trying to teach a dog to shake your hand. During the initial stages of learning, you would stick to a continuous reinforcement schedule to teach and establish the behavior.

This might involve grabbing the dog's paw, shaking it, saying "shake," and then offering a reward each and every time you perform these steps. Eventually, the dog will start to perform the action on its own.

Continuous reinforcement schedules are most effective when trying to teach a new behavior. Continuous reinforcement denotes a pattern in which every narrowly defined response is followed by a narrowly defined consequence.

Partial Reinforcement

Once the response is firmly established, a continuous reinforcement schedule is usually switched to a partial reinforcement schedule.

In partial (or intermittent) reinforcement, the response is reinforced only part of the time. Learned behaviors are acquired more slowly with partial reinforcement, but the response is more resistant to extinction.

Think of the earlier example in which you were training a dog to shake. While you initially used continuous reinforcement, reinforcing the behavior every time is simply unrealistic. In time, you would switch to a partial schedule, providing additional reinforcement only occasionally once the behavior has been established or after considerable time has passed.

There are four schedules of partial reinforcement:

Fixed-Ratio Schedules

Fixed-ratio schedules are those in which a response is reinforced only after a specified number of responses. This schedule produces a high, steady rate of responding with only a brief pause after the delivery of the reinforcer.

An example of a fixed-ratio schedule would be delivering a food pellet to a rat after it presses a bar five times.
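
The fixed-ratio rule can be written out as a small sketch. This is only an illustration of the rule described above, not a model of real animal behavior; the function name fixed_ratio_schedule and its parameters are hypothetical, chosen for this example.

```python
def fixed_ratio_schedule(num_responses, ratio=5):
    """Return the response numbers that earn a reinforcer.

    Assumption for illustration: responses are counted from 1, and every
    `ratio`-th response is reinforced (ratio=5 mirrors the rat example).
    """
    return [n for n in range(1, num_responses + 1) if n % ratio == 0]


# Twelve bar presses on a "five presses per pellet" schedule.
print(fixed_ratio_schedule(12, ratio=5))  # [5, 10]
```

Setting ratio=1 reduces this to continuous reinforcement, since every response is then reinforced.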

Variable-Ratio Schedules

Variable-ratio schedules occur when a response is reinforced after an unpredictable number of responses. This schedule creates a high, steady rate of responding. Gambling and lottery games are good examples of a reward based on a variable-ratio schedule.

In a lab setting, this might involve delivering food pellets to a rat after one bar press, again after four bar presses, and then again after two bar presses.
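
As a rough sketch, the lab example above can be expressed as a list of required press counts that changes after each pellet. The names variable_ratio_presses and random_requirements are hypothetical, and drawing the requirements uniformly around a mean is an assumption made only for this illustration.

```python
import random


def variable_ratio_presses(requirements):
    """Return the cumulative press numbers at which a pellet is delivered.

    `requirements` lists how many presses are needed for each successive
    pellet, e.g. [1, 4, 2] for the example in the text.
    """
    reinforced = []
    total_presses = 0
    for required in requirements:
        total_presses += required
        reinforced.append(total_presses)
    return reinforced


def random_requirements(num_pellets, mean_ratio=3, seed=None):
    """Draw unpredictable requirements (assumed uniform on 1..2*mean_ratio-1)."""
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(num_pellets)]


# One press, then four more, then two more.
print(variable_ratio_presses([1, 4, 2]))  # [1, 5, 7]
```

From the learner's side the next requirement is unknown, which is what separates this from a fixed-ratio schedule with the same average.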

Fixed-Interval Schedules

Fixed-interval schedules are those where the first response is rewarded only after a specified amount of time has elapsed. This schedule causes high amounts of responding near the end of the interval but much slower responding immediately after the delivery of the reinforcer.

An example of this in a lab setting would be reinforcing a rat with a food pellet for the first bar press after a 30-second interval has elapsed.
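
A minimal sketch of the fixed-interval rule follows, assuming the interval timer restarts each time a reinforcer is delivered (a common convention, though not stated in the text). The function name fixed_interval_schedule and the sample press times are hypothetical.

```python
def fixed_interval_schedule(response_times, interval=30.0):
    """Return the response times (in seconds) that earn a reinforcer.

    Only the first response made at least `interval` seconds after the
    previous reinforcer (or the session start) is reinforced.
    Assumption: the timer restarts at the moment of reinforcement.
    """
    reinforced = []
    interval_start = 0.0
    for t in sorted(response_times):
        if t - interval_start >= interval:
            reinforced.append(t)
            interval_start = t
    return reinforced


# Bar presses at these times, measured in seconds from the session start.
presses = [5, 20, 31, 40, 58, 62, 95]
print(fixed_interval_schedule(presses, interval=30))  # [31, 62, 95]
```

Presses made before the interval has elapsed (here at 5, 20, 40, and 58 seconds) earn nothing, which is consistent with responding slowing down right after each reinforcer.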

Variable-Interval Schedules

Variable-interval schedules occur when a response is rewarded after an unpredictable amount of time has passed. This schedule produces a slow, steady rate of response.

An example of this would be delivering a food pellet to a rat after the first bar press following a one-minute interval; a second pellet for the first response following a five-minute interval; and a third pellet for the first response following a three-minute interval.
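
The same example can be sketched in code, with the intervals given in seconds (60, 300, and 180). This assumes each interval is timed from the previous reinforcer; the function name variable_interval_schedule and the sample press times are hypothetical.

```python
def variable_interval_schedule(response_times, intervals):
    """Return the response times (in seconds) that earn a reinforcer.

    `intervals` lists the waiting time required before each successive
    reinforcer becomes available. Assumption: each interval is timed from
    the previous reinforcer (or from the session start for the first one).
    """
    reinforced = []
    remaining = iter(intervals)
    current = next(remaining, None)
    interval_start = 0.0
    for t in sorted(response_times):
        if current is None:
            break  # no more reinforcers are scheduled
        if t - interval_start >= current:
            reinforced.append(t)
            interval_start = t
            current = next(remaining, None)
    return reinforced


# One-minute, five-minute, then three-minute intervals.
presses = [30, 65, 200, 370, 400, 560]
print(variable_interval_schedule(presses, [60, 300, 180]))  # [65, 370, 560]
```

Because the learner cannot tell which interval is in effect, pressing at a slow, steady rate is enough to collect each pellet shortly after it becomes available.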

Using the Appropriate Schedule

Deciding when to reinforce a behavior can depend on a number of factors. In cases where you are specifically trying to teach a new behavior, a continuous schedule is often a good choice. Once the behavior has been learned, switching to a partial schedule is often preferable.

In daily life, partial schedules of reinforcement occur much more frequently than do continuous ones. For example, imagine if you received a reward every time you showed up to work on time. Over time, the reward would come to be expected, and instead of the reward acting as positive reinforcement, its denial could be regarded as negative punishment.

Instead, rewards like these are usually doled out on a much less predictable partial reinforcement schedule. Not only are these much more realistic, but they also tend to produce higher response rates while being less susceptible to extinction.

Partial schedules reduce the risk of satiation once a behavior has been established. If a reward is given without end, the subject may stop performing the behavior if the reward is no longer wanted or needed.

For example, imagine that you are trying to teach a dog to sit. If you use food as a reward every time, the dog might stop performing once it is full. In such instances, something like praise or attention may be more effective in reinforcing an already-established behavior.

A Word From Verywell

Operant conditioning can be a powerful learning tool. The schedule of reinforcement used during the training and maintenance process can have a major influence on how quickly a behavior is acquired, the strength of the response, and how frequently the behavior is displayed.

In order to determine which schedule is preferable, you need to consider different aspects of the situation, including the type of behavior that is being taught and the type of response that is desired.
