Behaviorism

Behaviorism is not just a way to study learning; it is a particular approach to psychology. It promotes the scientific study of behavior while deliberately leaving aside what happens inside the “black box”, that is, the mental processes that produce the behavior.

It reached its peak of popularity during the first half of the twentieth century but is no longer a dominant research program. Yet it produced a large number of valuable experiments and principles that remain especially relevant to learning to this day.

To understand behaviorism, it helps to understand the key discovery in which it is almost entirely rooted. It came out of Ivan Pavlov’s study of salivation in dogs during the 1890s. He noticed that, after a while, the dogs he fed would salivate not only when food was presented to them but also when any event they associated with food occurred in their environment (e.g., the person who usually fed them entering the room).

Pavlov designed multiple experiments to observe this phenomenon, the most famous one involving a bell and a dog:

To understand it, we need to acknowledge that some responses are not learned but are already “hard-wired” in the organism. When a dog is presented with something it perceives as food, it never needs to learn to salivate. In this sense, food is an unconditioned stimulus and salivation is an unconditioned response:

Unconditioned Stimulus (Food) > Unconditioned Response (Salivation)

What Pavlov discovered is that if you take a Neutral Stimulus (a stimulus that does not provoke the unconditioned response in the subject), for example the ring of a bell, and present it to the dog at nearly the same time as its food, the dog starts to associate the two.

If you repeat this pairing multiple times, you will eventually be able to ring the bell alone and observe salivation in the dog. We then say that the dog has been conditioned to salivate when it hears the bell. The entire process is called classical conditioning (or Pavlovian conditioning).

Conditioned Stimulus (Ring of the bell) > Conditioned Response (Salivation)
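To make the effect of repeated pairings more concrete, here is a minimal toy sketch. It assumes that every bell-and-food pairing closes a fixed fraction of the gap between the current association strength and its maximum. The update rule, the learning rate and the salivation threshold are all illustrative assumptions of mine, not values taken from Pavlov's experiments.

```python
def condition(pairings, learning_rate=0.3, strength=0.0):
    """Return the association strength after each bell+food pairing.

    Assumption: each pairing closes a fixed fraction (`learning_rate`)
    of the gap between the current strength and the maximum of 1.0.
    """
    history = []
    for _ in range(pairings):
        strength += learning_rate * (1.0 - strength)
        history.append(strength)
    return history


def salivates(strength, threshold=0.5):
    """Hypothetical rule: the bell alone triggers salivation above a threshold."""
    return strength >= threshold


history = condition(pairings=5)
print([round(s, 2) for s in history])
print("before conditioning:", salivates(0.0))
print("after conditioning:", salivates(history[-1]))
```

In this sketch the bell starts as a neutral stimulus (strength 0.0) and only becomes a conditioned stimulus once enough pairings have pushed the strength over the threshold.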

Behaviorism also described a second form of learning called operant conditioning (sometimes called instrumental learning). It is a process that modifies behavior by means of rewards and punishments.

It was pioneered by Edward Thorndike (notably through his law of effect, 1898) and later popularized by B. F. Skinner, one of the most famous ambassadors of behaviorism.

Skinner usually conducted his experiments in what is now called a “Skinner Box” in which an animal, usually a rat, would be isolated with just a lever and a food dispenser.

As the rat moved around the box, it would eventually press the lever and receive food as a consequence. After a few repetitions, the rat learned to press the lever whenever it wanted more.

In this case, receiving food is a positive reinforcer (a reward), and the rat pressing the lever is the learned change of behavior, usually called an operant response.

We say the behavior “press the lever” has been strengthened.

Skinner designed many similar experiments, some of which made it possible to administer punishments in the form of electric shocks. They led to several other findings:

Negative reinforcement is another way to strengthen a behavior: an unpleasant experience is removed when the behavior occurs.

Positive punishment is the weakening of a behavior by associating an unpleasant experience with it.

Negative punishment can also be used to reduce the frequency of a behavior by associating the removal of a reward with it.

When a punished behavior is no longer punished, it returns to some extent (it was never entirely forgotten).
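The vocabulary is easy to mix up, so here is a small sketch that summarizes the four kinds of consequences described above. "Positive" and "negative" only say whether something is added or removed; whether the behavior gets stronger or weaker depends on whether that something is pleasant or unpleasant. The names `Change`, `Stimulus` and `classify` are my own, chosen for illustration.

```python
from enum import Enum


class Change(Enum):
    ADDED = "added"      # "positive": something is presented
    REMOVED = "removed"  # "negative": something is taken away


class Stimulus(Enum):
    PLEASANT = "pleasant"      # e.g. food
    UNPLEASANT = "unpleasant"  # e.g. electric shock


def classify(change, stimulus):
    """Return (name of the consequence, effect on the behavior)."""
    positive = change is Change.ADDED
    # A behavior is strengthened when something pleasant arrives
    # or when something unpleasant goes away; otherwise it is weakened.
    strengthens = positive == (stimulus is Stimulus.PLEASANT)
    name = ("positive " if positive else "negative ") + (
        "reinforcement" if strengthens else "punishment"
    )
    return name, "strengthened" if strengthens else "weakened"


for change in Change:
    for stimulus in Stimulus:
        print(change.value, stimulus.value, "->", classify(change, stimulus))
```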

Lastly, Skinner and Ferster designed even more diverse ways to deliver reinforcement. In particular, they found a way to observe an organism’s response rate (the rate at which the rat pressed the lever) under different reward schedules. Basically, a signal (often a light or a sound) would tell the subject that pressing the lever would now be rewarded with food.

The light would turn on either after a predictable amount of time (fixed interval) or, for other subjects, after an unpredictable one (variable interval).

In another similar experiment, the reward would be provided after a predictable number of actions, in other words every time the lever is pressed while the light is turned on (fixed ratio), or, for other subjects, after an unpredictable number of actions (variable ratio).
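The four schedules are easier to compare side by side. The sketch below is a simplified model under my own assumptions: it leaves out the light signal, the variable-ratio schedule rewards each press with a constant probability, and the variable-interval schedule draws its waiting time uniformly around a mean. The class names and parameters are hypothetical and are not meant to reproduce the original apparatus.

```python
import random


class FixedRatio:
    """Reward every `n`-th press."""

    def __init__(self, n):
        self.n = n
        self.presses = 0

    def press(self, now=None):
        self.presses += 1
        if self.presses >= self.n:
            self.presses = 0
            return True
        return False


class VariableRatio:
    """Reward after an unpredictable number of presses (`mean_n` on average)."""

    def __init__(self, mean_n, rng=None):
        self.p = 1.0 / mean_n
        self.rng = rng or random.Random()

    def press(self, now=None):
        return self.rng.random() < self.p


class FixedInterval:
    """Reward the first press made at least `seconds` after the last reward."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.last_reward = 0.0

    def press(self, now):
        if now - self.last_reward >= self.seconds:
            self.last_reward = now
            return True
        return False


class VariableInterval(FixedInterval):
    """Like FixedInterval, but the waiting time is redrawn after each reward."""

    def __init__(self, mean_seconds, rng=None):
        super().__init__(mean_seconds)
        self.mean_seconds = mean_seconds
        self.rng = rng or random.Random()
        self.seconds = self._draw()

    def _draw(self):
        # Assumption: waiting times are uniform between 50% and 150% of the mean.
        return self.rng.uniform(0.5 * self.mean_seconds, 1.5 * self.mean_seconds)

    def press(self, now):
        rewarded = super().press(now)
        if rewarded:
            self.seconds = self._draw()
        return rewarded


fixed_ratio = FixedRatio(3)
print([fixed_ratio.press() for _ in range(6)])
fixed_interval = FixedInterval(10)
print([fixed_interval.press(t) for t in (2, 9, 10, 15, 21)])
```

With the fixed schedules the subject can predict exactly which press pays off. With the variable ones, any press might be the rewarded one, which is the uncertainty the experiments were built to study.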

The results can be a bit counterintuitive, but they are nonetheless correct: the overall response rate of the subject is greater when the reward is perceived as uncertain. Examples of this phenomenon in human behavior are everywhere we look, and it partly explains why gambling is so attractive to us.

We are simply more likely to repeat a behavior when we have learned that a reward may or may not come.

Rewards and punishments can quickly lead a student to produce specific behavioral responses. This makes them particularly relevant for teaching facts and specific procedures, in short, whenever there are right and wrong answers.

Student practice can easily be motivated through the acquisition of rewards, which increases long-term retention of the skills.

Punishments are linked to an increase in anxiety and aggression in subjects, which can eventually have negative consequences on learning and health (Galea 2015).

Behaviorism does not take into account the limitations of students' memory, attention, and perception, as it is not interested in the study of cognition.

Concerning its application to humans, since most of the research was carried out on rodents, it has been argued that the physiological differences between species prevent accurate generalization. The responses observed in rats could differ to some degree in humans (but make no mistake, we have many more commonalities with rats than differences).

It cannot teach high-level skills.

I Classical conditioning

II Operant conditioning

III Pros

IV Cons

V Implementation examples

Do you remember being given points, badges or cards in class when you were young? Personally, when I was learning to read, my teacher would give us little cards that we could accumulate and exchange for books, 15 additional minutes of play outside or sometimes even candy. Now I can't help but think that I was naively manipulated. OK, I am exaggerating, but it is an obvious example of operant conditioning at play. When we showed the expected behavior (for example, reading a paragraph without making a mistake), we were given a reward. At that age we simply thought that we were getting what we wanted, but the teachers were actually carefully reinforcing the behaviors they wanted to see. Punishments were also a way to weaken other behaviors.

In games, behaviorist techniques are without a doubt the most widespread. I think there are multiple reasons for that.

First of all, classical conditioning is a very efficient way for the system to quickly convey important information to players. With time and repetition, we associate events in the game with one another. For example, it is almost intuitive to think that the death of our character is negative; we then learn that death is a consequence of our life bar going down, and that a "gray screen" is a marker of a low life bar. In the end, the game conditions you to associate "gray screens" with the imminent negative outcome of death. This allows us to quickly recognize danger and to switch to our safer strategies. In addition, we tend not to repeat the behaviors that led us to this situation. These sorts of tools, sometimes simply called feedback, are ways to accelerate learning toward optimal performance and the sense of accomplishment that comes with it.

All sorts of games now use this technique, building upon the associations players have made through their experiences with other games. I suggest you try applying this visual treatment in your game while everything is going fine for your player: you should observe an increase in their fight-or-flight behavior.
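As a rough idea of how such a "gray screen" could be driven, here is a minimal sketch that maps the player's health to a color saturation value. The function name, the 30% danger threshold and the linear fade are assumptions of mine for illustration; a real game would feed the result into its own post-processing effect.

```python
def screen_saturation(health, max_health, danger_threshold=0.3):
    """Return a color saturation between 0.0 (gray) and 1.0 (full color).

    Assumption: the screen keeps full color above `danger_threshold`
    (a fraction of max health) and fades linearly to gray below it.
    """
    fraction = max(0.0, min(1.0, health / max_health))
    if fraction >= danger_threshold:
        return 1.0
    return fraction / danger_threshold


for hp in (100, 30, 15, 0):
    print(hp, screen_saturation(hp, 100))
```

Because the cue only appears below the threshold, it stays reliably paired with danger, which is what lets players learn the association in the first place.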

Another example of classical conditioning in games is the secret sound effect of the Zelda franchise.

Players have learned to associate it with solving puzzles or discovering secrets throughout their experience with Zelda games. It marks an achievement, a successful step toward an objective. Add to this the fact that the sound effect itself seems quite evocative of its function, and it is no surprise that many of us only need to hear it on its own to feel good about ourselves.

It is important for game developers to take into account how people might have been conditioned in the past, and how they are conditioning them now, to keep the experience fluid.

Secondly, I would like to go deeper and say that games are, by nature, a sort of big "Skinner box". Games let you explore the worlds they portray in exchange for certain behaviors. Slowly, they teach you the very specific sequences of buttons you should press. Once you do what the game expects of you, the story can continue, your character can become stronger, and you have more abilities or button sequences to experiment with. In other words, you obtain rewards. With time, the game demands more and more complex sequences of behaviors in exchange, and you enjoy it. In addition to exploring a fiction, you feel like you are improving at something.

This only works if, like me, you like to explore worlds created by others or if you enjoy getting better at something just for the sake of it.

Here is another example of their "behaviorist nature". Often, there is a very specific way to talk to a character in a game: you have to move your avatar within a certain distance of it and press a button. Otherwise nothing happens, and both characters stand there ignoring each other. The simple fact that the game reacts, or not, depending on how you proceed is a reinforcer of this particular behavior.

That is partly because games are technically and financially very demanding to produce: a game can't possibly take into account all the ways someone can imagine starting a conversation. Systems are also designed in very specific ways to avoid other kinds of issues (e.g., if conversations started automatically each time you were close enough, you might start them involuntarily).
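The interaction rule described above is usually just a couple of conditions. Here is a minimal sketch, assuming 2D positions given as `(x, y)` tuples and an arbitrary interaction range; the function and parameter names are hypothetical.

```python
import math


def can_start_conversation(player_pos, npc_pos, button_pressed, max_distance=2.0):
    """Return True only if the player is close enough AND pressed the button."""
    distance = math.dist(player_pos, npc_pos)
    return button_pressed and distance <= max_distance


print(can_start_conversation((0, 0), (1, 1), button_pressed=True))
print(can_start_conversation((0, 0), (1, 1), button_pressed=False))
print(can_start_conversation((0, 0), (5, 0), button_pressed=True))
```

Requiring the button press, instead of triggering on distance alone, is what prevents the involuntary conversations mentioned above.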

Finally, rewards and punishments are used to influence players in their decision making. Games can give a limited amount of agency to players: sometimes they have to choose which behavior to express, and they experience different consequences as a result. These can be conceptualized as rewards and punishments that influence the player. If I do this, I can save John but I will lose the money. Should I help this guy who will give me a boat, or his opponent who will give me a sword? If I do it that way, I will obtain more money, and so on.

Rewards vary in ratio or interval, in type and in quality. For example, to encourage players to defeat enemies even when it is not mandatory, the game sometimes delivers a very strong reward afterwards, sometimes none, and sometimes a very basic one. This appeals to our natural attraction to "gambling" discussed above.
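A common way to get this kind of unpredictable reward is a weighted drop table. The sketch below is only an illustration: the table contents and weights are made up, and `roll_reward` is a hypothetical helper, not taken from any particular game.

```python
import random

# Hypothetical drop table: (reward, relative weight).
DROP_TABLE = [
    ("nothing", 50),
    ("basic item", 40),
    ("rare item", 10),
]


def roll_reward(rng=random, table=DROP_TABLE):
    """Pick one reward at random, weighted by the table."""
    rewards = [reward for reward, _ in table]
    weights = [weight for _, weight in table]
    return rng.choices(rewards, weights=weights, k=1)[0]


# Each defeated enemy triggers one roll; the player cannot know the outcome in advance.
print([roll_reward() for _ in range(5)])
```

Passing a seeded `random.Random` instance as `rng` makes the rolls reproducible, which is handy when tuning the weights.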

Dishonored 2 - Comparison


Further reading & references :
