Operant conditioning occurs when an association is made between a particular behavior and a consequence for that behavior. This association is built upon the use of reinforcement and/or punishment to encourage or discourage behavior. Operant conditioning was first defined and studied by behavioral psychologist B.F. Skinner, who conducted several well-known operant conditioning experiments with animal subjects.
Key Takeaways: Operant Conditioning
- Operant conditioning is the process of learning through reinforcement and punishment.
- In operant conditioning, behaviors are strengthened or weakened based on the consequences of that behavior.
- Operant conditioning was defined and studied by behavioral psychologist B.F. Skinner.
B.F. Skinner was a behaviorist, which means he believed that psychology should be limited to the study of observable behaviors. While other behaviorists, like John B. Watson, focused on classical conditioning, Skinner was more interested in the learning that happened through operant conditioning.
He observed that in classical conditioning, responses tend to be innate reflexes that are triggered automatically. He called this kind of behavior respondent and distinguished it from operant behavior. Operant behavior was the term Skinner used to describe a behavior that is reinforced by the consequences that follow it. Those consequences play an important role in whether or not a behavior is performed again.
Skinner’s ideas were based on Edward Thorndike’s law of effect, which stated that behavior that elicits positive consequences will probably be repeated, while behavior that elicits negative consequences will probably not be repeated. Skinner introduced the concept of reinforcement into Thorndike’s ideas, specifying that behavior that is reinforced will probably be repeated (or strengthened).
To study operant conditioning, Skinner conducted experiments using a “Skinner Box,” a small box that had a lever at one end that would provide food or water when pressed. An animal, like a pigeon or rat, was placed in the box where it was free to move around. Eventually the animal would press the lever and be rewarded. Skinner found that this process resulted in the animal pressing the lever more frequently. Skinner would measure learning by tracking the rate of the animal’s responses when those responses were reinforced.
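The measurement described above, tracking how often the animal responds, can be illustrated with a short Python sketch. The function name, the one-minute window, and the sample press times are assumptions made for illustration only; they are not Skinner's data or his recording method.

```python
def response_rates(press_times, session_length, window=60.0):
    """Return lever presses per minute for each consecutive time window.

    press_times: times (in seconds) at which the lever was pressed.
    session_length: total length of the session in seconds.
    window: size of each counting window in seconds (hypothetical default).
    """
    n_windows = int(session_length // window)
    counts = [0] * n_windows
    for t in press_times:
        index = int(t // window)
        # Ignore any press that falls outside the counted windows.
        if 0 <= index < n_windows:
            counts[index] += 1
    # Convert each count to a per-minute rate.
    return [count * 60.0 / window for count in counts]


# Made-up press times for a three-minute session.
example_presses = [50, 70, 110, 125, 140, 160, 175]
rates = response_rates(example_presses, session_length=180)
```

In this made-up session the rate rises from one press in the first minute to four in the third, which is the kind of increase that would suggest the lever-pressing behavior is being strengthened.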
Reinforcement and Punishment
Through his experiments, Skinner identified the different kinds of reinforcement and punishment that encourage or discourage behavior.
Reinforcement that closely follows a behavior will encourage and strengthen that behavior. There are two types of reinforcement:
- Positive reinforcement occurs when a behavior results in a favorable outcome, e.g. a dog receiving a treat after obeying a command, or a student receiving a compliment from the teacher after behaving well in class. These techniques increase the likelihood that the individual will repeat the desired behavior in order to receive the reward again.
- Negative reinforcement occurs when a behavior results in the removal of an unfavorable experience, e.g. an experimenter ceasing to give a monkey electric shocks when the monkey presses a certain lever. In this case, the lever-pressing behavior is reinforced because the monkey will want to remove the unfavorable electric shocks again.
In addition, Skinner identified two different kinds of reinforcers.
- Primary reinforcers naturally reinforce behavior because they are innately desirable, e.g. food.
- Conditioned reinforcers reinforce behavior not because they are innately desirable, but because we learn to associate them with primary reinforcers. For example, paper money is not innately desirable, but it can be used to acquire innately desirable goods, such as food and shelter.
Punishment is the opposite of reinforcement. When punishment follows a behavior, it discourages and weakens that behavior. There are two kinds of punishment.
- Positive punishment (or punishment by application) occurs when a behavior is followed by an unfavorable outcome, e.g. a parent spanking a child after the child uses a curse word.
- Negative punishment (or punishment by removal) occurs when a behavior leads to the removal of something favorable, e.g. a parent who denies a child their weekly allowance because the child has misbehaved.
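The two kinds of reinforcement and two kinds of punishment described above come down to two questions: is something added or removed after the behavior, and is the behavior strengthened or weakened? The following Python sketch shows one way to express that grid. The function name and the string labels are hypothetical and chosen only for this illustration.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the type of consequence in operant conditioning terms.

    stimulus_change: "added" if something is presented after the behavior,
                     "removed" if something is taken away.
    behavior_effect: "strengthened" if the behavior becomes more likely,
                     "weakened" if it becomes less likely.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_effect not in ("strengthened", "weakened"):
        raise ValueError("behavior_effect must be 'strengthened' or 'weakened'")

    # "Positive" means something is added; "negative" means something is removed.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement strengthens behavior; punishment weakens it.
    kind = "reinforcement" if behavior_effect == "strengthened" else "punishment"
    return f"{sign} {kind}"


# A dog receives a treat after obeying a command.
treat = classify_consequence("added", "strengthened")
# A child loses their weekly allowance after misbehaving.
allowance = classify_consequence("removed", "weakened")
```

Here the treat example is labeled positive reinforcement and the lost allowance is labeled negative punishment. In this sketch, "positive" and "negative" describe whether something is added or taken away, not whether the outcome is good or bad.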
Although punishment is still widely used, Skinner and many other researchers found that punishment is not always effective. Punishment can suppress a behavior for a time, but the undesired behavior tends to come back in the long run. Punishment can also have unwanted side effects. For example, a child who is punished by a teacher may become uncertain and fearful because they don’t know exactly what to do to avoid future punishments.
Instead of punishment, Skinner and others suggested reinforcing desired behaviors and ignoring unwanted behaviors. Reinforcement tells an individual what behavior is desired, while punishment only tells the individual what behavior isn’t desired.
Operant conditioning can lead to increasingly complex behaviors through shaping, also referred to as the “method of approximations.” Shaping happens in a step-by-step fashion as each part of a more intricate behavior is reinforced. Shaping starts by reinforcing the first part of the behavior. Once that piece of the behavior is mastered, reinforcement only happens when the second part of the behavior occurs. This pattern of reinforcement is continued until the entire behavior is mastered.
For example, when a child is taught to swim, she may initially be praised just for getting in the water. She is praised again when she learns to kick, and again when she learns specific arm strokes. Finally, she is praised for propelling herself through the water by performing a specific stroke and kicking at the same time. Through this process, an entire behavior has been shaped.
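The step-by-step logic of shaping can also be sketched in a few lines of Python. This is a simplified illustration, not a training protocol: the function name and step labels are hypothetical, and the sketch assumes a step counts as mastered after a single praised performance, which real learning rarely matches.

```python
def shape(steps, attempts):
    """Reinforce successive approximations of a target behavior.

    steps: ordered list of the parts of the behavior to be learned.
    attempts: the behaviors the learner actually performs, in order.
    Returns (log, mastered), where log records whether each attempt was praised.
    """
    current = 0  # index of the step that currently earns praise
    log = []
    for attempt in attempts:
        # Only the step currently being shaped is reinforced.
        praised = current < len(steps) and attempt == steps[current]
        if praised:
            # Simplifying assumption: one praised performance means mastery.
            current += 1
        log.append((attempt, praised))
    return log, current == len(steps)


swim_steps = ["get in water", "kick", "arm strokes", "stroke and kick together"]
swim_attempts = [
    "get in water",
    "get in water",  # no longer praised once this step is mastered
    "kick",
    "arm strokes",
    "kick",          # an earlier step, so no praise
    "stroke and kick together",
]
log, mastered = shape(swim_steps, swim_attempts)
```

In this run, getting in the water is praised only the first time, and the whole behavior counts as shaped once the final combined step is praised.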
Schedules of Reinforcement
In the real world, behavior is not constantly reinforced. Skinner found that the frequency of reinforcement can impact how quickly and how successfully one learns a new behavior. He specified several reinforcement schedules, each with different timing and frequencies.
- Continuous reinforcement occurs when reinforcement follows each and every performance of a given behavior. Learning happens rapidly with continuous reinforcement. However, if reinforcement is stopped, the behavior will quickly decline and ultimately stop altogether, which is referred to as extinction.
- Fixed-ratio schedules reward behavior after a specified number of responses. For example, a child may get a star after every fifth chore they complete. On this schedule, the response rate slows right after the reward is delivered.
- Variable-ratio schedules vary the number of behaviors required to get a reward. This schedule leads to a high rate of responses and is also hard to extinguish because its variability maintains the behavior. Slot machines use this kind of reinforcement schedule.
- Fixed-interval schedules provide a reward after a specific amount of time passes. Getting paid by the hour is one example of this kind of reinforcement schedule. Much like the fixed-ratio schedule, the response rate increases as the reward approaches but slows down right after the reward is received.
- Variable-interval schedules vary the amount of time between rewards. For example, a child who receives an allowance at various times during the week as long as they’ve exhibited some positive behaviors is on a variable-interval schedule. The child will continue to exhibit positive behavior in anticipation of eventually receiving their allowance.
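The schedules above differ only in the rule that decides when a response earns a reward, which makes them easy to compare in code. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the variable schedules draw their requirements from a simple uniform range around a mean, and the interval schedules reward the first response made after the waiting time has passed.

```python
import random


def make_ratio_schedule(mean_ratio, variable=False, rng=None):
    """Return a function to call once per response; True means rewarded.

    A fixed ratio of 1 behaves like continuous reinforcement.
    """
    rng = rng or random.Random()

    def next_requirement():
        if variable:
            # Assumption: requirements vary uniformly around the mean.
            return rng.randint(1, 2 * mean_ratio - 1)
        return mean_ratio

    state = {"count": 0, "required": next_requirement()}

    def respond():
        state["count"] += 1
        if state["count"] >= state["required"]:
            state["count"] = 0
            state["required"] = next_requirement()
            return True
        return False

    return respond


def make_interval_schedule(mean_interval, variable=False, rng=None):
    """Return a function taking the current time; True means rewarded."""
    rng = rng or random.Random()

    def next_wait():
        if variable:
            # Assumption: waiting times vary uniformly around the mean.
            return rng.uniform(0, 2 * mean_interval)
        return mean_interval

    state = {"last_reward": 0.0, "wait": next_wait()}

    def respond(now):
        # Reward the first response made once enough time has passed.
        if now - state["last_reward"] >= state["wait"]:
            state["last_reward"] = now
            state["wait"] = next_wait()
            return True
        return False

    return respond


# A star after every fifth chore: a fixed-ratio schedule of 5.
star_chart = make_ratio_schedule(5)
stars = [star_chart() for _ in range(10)]
```

With the fixed-ratio schedule of 5, only the fifth and tenth chores earn a star. Passing `variable=True` makes the required number of responses unpredictable, which is the property the slot machine example relies on.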
Examples of Operant Conditioning
If you’ve ever trained a pet or taught a child, you have likely used operant conditioning in your own life. Operant conditioning is still frequently used in various real-world circumstances, including in the classroom and in therapeutic settings.
For example, a teacher might reinforce students doing their homework regularly by periodically giving pop quizzes that ask questions similar to recent homework assignments. Also, if a child throws a temper tantrum to get attention, the parent can ignore the behavior and then acknowledge the child again once the tantrum has ended.
Operant conditioning is also used in behavior modification, an approach to the treatment of numerous issues in adults and children, including phobias, anxiety, bedwetting, and many others. One way behavior modification can be implemented is through a token economy, in which desired behaviors are reinforced by tokens in the form of digital badges, buttons, chips, stickers, or other objects. Eventually these tokens can be exchanged for real rewards.
While operant conditioning can explain many behaviors and is still widely used, there are several criticisms of the process. First, operant conditioning is accused of being an incomplete explanation for learning because it neglects the role of biological and cognitive elements.
In addition, operant conditioning is reliant upon an authority figure to reinforce behavior and ignores the role of curiosity and an individual’s ability to make his or her own discoveries. Critics object to operant conditioning’s emphasis on controlling and manipulating behavior, arguing that this emphasis can lead to authoritarian practices. Skinner believed, however, that environments naturally control behavior and that people can choose to use that knowledge for good or ill.
Finally, because Skinner’s observations about operant conditioning relied on experiments with animals, he is criticized for extrapolating from his animal studies to make predictions about human behavior. Some psychologists believe this kind of generalization is flawed because humans and non-human animals are physically and cognitively different.