Negative Reinforcement and the Positives it Can Have

Following last month’s article “Dissecting the Dance” on rewriting the dressage training scale, we look at other ways to use learning theory to create more equitable training environments for horses in all disciplines.

Horses, like most other animals including humans, do what works, and stop doing what doesn’t work. If you want to see a behaviour more often, make it work for your horse; if you want to see less of it, figure out how it is working and make it stop working.

Operant conditioning, a term coined in B.F. Skinner’s 1938 book The Behavior of Organisms, is about that simple. However, Skinner and the learning theorists who followed him were not thinking about horse trainers when they made learning theory so complicated as to be mostly inaccessible and irrelevant to the average horse person.

Undoubtedly, coaches are more likely to understand training principles in practical rather than scientific terms, and many have an instinctive and effective feel for training without a background in learning theory.

But equine welfare may be compromised when we replace a sound understanding of learning principles with an assumption of horses’ innate sense of cooperation and comprehension, and a belief in the reciprocal benevolence of the human/horse relationship. A sounder grasp of learning science allows us to recognize why horses respond as they do, reduces equine behavioural problems, and provides for more positive horse-human interactions. In the July 2014 issue of Horse Sport I discussed how we might build this relationship through positive reinforcement training; here, I’ll turn the focus to negative reinforcement.

LEARNING THEORY 101

Reinforcement occurs when a response is strengthened because it leads to rewarding consequences. Much of the confusion about positive and negative reinforcement arises when we think of these terms in the moral sense of good and bad. Learning theorists, however, are scientists, not moralists, and use these terms in their arithmetical sense of adding something and taking something away.

Positive Reinforcement: Something pleasant is added

Positive reinforcement occurs when a desired behaviour is rewarded with something pleasurable, making it more likely that we will see the behaviour again. If, for instance, you want your horse to urinate for the drug tester on command, set up situations where it is likely to occur (a freshly bedded stall after a ride), provide a consistent verbal cue, and reward the horse with a treat whenever you catch him in the act. Eventually the verbal command and an inviting context should produce the desired result.

Negative Reinforcement: Something unpleasant is taken away

With negative reinforcement, we achieve the desired behaviour by taking away something unpleasant, making it more likely that we will see this behaviour occur in the future. The release from pressure or discomfort, the ending of something aversive, becomes the reward. For example, we apply pressure to the horse’s sides with our legs, the horse moves forward, and we ease the pressure. We tell the horse what behaviour we want by reinforcing it with the removal of the unpleasant event.

Negative reinforcement is a useful, effective (and in some cases the only) method for training a horse under saddle. Horses’ evolved tendency to avoid pressure (both physical and psychological) by moving away makes them exceptional candidates for negative reinforcement training.

Punishment: Something unpleasant is added (positive punishment), or something pleasant is taken away (negative punishment)

Punishment is distinct from negative reinforcement. Negative reinforcement removes an unpleasant stimulus and thereby strengthens a behaviour, while punishment (both positive and negative) weakens a behaviour. Positive punishment involves following a behaviour with an unpleasant stimulus, thereby reducing the likelihood it will occur again (such as whipping a horse for a refusal). Negative punishment involves removing a pleasant or desirable stimulus to weaken a behaviour (such as walking away with the grain cart from a horse that is kicking in impatience).

CLASSICAL CONDITIONING

Although negative reinforcement is a mainstay of equine training, the optimal training goal would be to never have to bring about an unpleasant event after training is complete. This is entirely possible using another learning theory principle – classical conditioning.

Classical Conditioning: Something neutral is paired with something pleasant

In classical conditioning, something that was previously a neutral stimulus (neither pleasant nor unpleasant) is paired with something that has inherent value for survival (e.g. food, sex, affiliation, etc.). Consequently, the once-neutral stimulus takes on the value of the stimulus with which it was paired.

For example, infants do not innately see money as reinforcing, but children and adults learn through repeated paired associations that money is connected to good things (candy, rides, cool toys, fast cars, Warmbloods, and horse shows in West Palm Beach). The neutral stimulus of money has now become so strongly associated with what it can buy that money becomes rewarding in and of itself. Right now, close your eyes, and imagine a $100 bill floating from the sky and falling into your lap. Rather pleasant, isn’t it?

Once a horse understands a desired response through negative reinforcement, we can use classical conditioning to ultimately eliminate the aversive signals. This transition from negative reinforcement to classical conditioning is not new. A trained dressage horse will know that a tightening of the rider’s abdominal muscles before applying the leg is an indication to collect the stride. The horse was not born knowing this. He was initially trained through pressure of leg (go forward) and rein (slow down) to shorten and elevate his stride, and the momentary release of that pressure told him he had offered the correct response – i.e. negative reinforcement. The rider gave the abs tightening aid just prior to the release reward such that these two things became associated or conditioned. From the horse’s point of view, abs tightening means to collect and the pressure will ease. Eventually, the abs aid is all that is necessary – i.e. the aid has been classically conditioned. But why do we so seldom see this understated and harmonious picture?

We fail at the first part: Negative Reinforcement is hard

The pressure/release formula of negative reinforcement works well only with impeccable timing. Poorly executed negative reinforcement punishes the horse by not releasing pressure when he has offered the desired response. Remember that punishment makes a response less likely to recur, because it is followed by an aversive event.

Unfortunately, we well-meaning amateurs are often guilty of this inadvertent punishment because our timing, coordination, or inexperience compromises our ability to release and reward appropriately. Consider the horse who makes such a tremendous jumping effort that he unbalances his rider, who subsequently yanks the horse in the mouth. Over several repetitions the rider effectively trains the horse, through punishment, to avoid jumping altogether. However, the horse is also punished for refusing to jump, leaving him with no correct answer to the question.

Professionals are also culpable when the desire to achieve a competition goal overrides the common good sense of knowing when to stop. Note that the release of pressure is the reward alerting the horse that he has offered the correct response. When there is no release, we teach him, through punishment rather than negative reinforcement, precisely the opposite response to that intended – i.e. to not offer this behaviour in the future.

Equine scientist Andrew McLean (see “Dissecting the Dance” in the August 2015 issue) posits that the incorrect use of negative reinforcement is responsible for most training failures. European studies of over 3,000 horses sent to slaughter indicate that as many as 66% were sent there for “inappropriate behaviour.”

We forget: Negative reinforcement is not the end game

All too often we see negative reinforcement as the final destination, rather than a transition to achieving the desired movement with a neutral aid. For example, few of us would question applying bit pressure to cue the horse to slow down, collect, or stop. How much more comfortable it would be for the horse if we conditioned this response so that the aversive aid could be supplanted with a neutral one – a touch on the withers, for instance. Indeed, the sight of a bridleless horse is so singularly stunning precisely because the bit/rein/hand connection is so inextricably linked in our minds to control. We simply fail to consider that the aversive aid could be eliminated.

New advances in smart textile applications and tension sensors such as the Rein Tension Device allow riders and trainers to know precisely how much pressure riders exert on the reins and thereby aid them in reducing it. Similar measuring devices currently being incorporated into half-chaps and saddle pads may eventually be of enormous benefit to coaches and riders in applying aids that are increasingly subtle.

We send mixed messages: “Learned helplessness”

Horses are further compromised when riders give conflicting aids – two opposing signals such as driving forward with seat and legs and simultaneously applying rein pressure. Conflicting aids often result in horses seeking to avoid pain with active coping mechanisms such as bucking, bolting, or rearing.

If painful contradictory signals occur over a prolonged period, horses (like humans and other animals) may experience “learned helplessness,” a term coined by psychological researcher Martin Seligman in 1967. Seligman trained research dogs to jump over a barrier to avoid an electrically charged floor. When Seligman introduced the shock to both sides of the barrier so that no amount of jumping would avert the pain, the dogs eventually ceased trying and lay down on the charged floor. Even when Seligman reintroduced the escape route, the dogs remained passive, although physiological measures indicated high stress levels.

McLean argues that horses put in a position of continual conflicting aids, where no behaviour will terminate the pressure, will eventually fall victim to this state of learned helplessness. What trainers may mistake for submission, obedience, and successful training may well be a horse who has given up trying to escape relentless conflicting pressure.

HOW SCIENCE CAN TURN IT AROUND

Members of the International Society of Equitation Science (www.equitationscience.com), whose mandate is to educate horse enthusiasts about the application of learning theory to improve equine welfare, believe that fundamental learning principles need to replace our current thinking about horse behaviour. Notions of the benevolent horse (agreeable and courageous, working with his rider toward a common goal) or the alternatively malevolent horse (lazy, spiteful, harbouring a poor work ethic) attribute much higher cognitive strategies to horses than they probably possess, create expectations they cannot fulfill, and inevitably result in negative outcomes for horses.

Replacing sentiment with science meets formidable resistance from amateurs and professionals who argue that science is too mechanical, too rule-bound to capture the nuanced, individual differences of the human/horse interaction. I would contend, however, that by taking advantage of what science has taught us about animal learning, we create a far more equitable environment for horses. Conflict behaviours need not arise, horses’ physical and psychological well-being can be enhanced, and we create a safer environment for horses and handlers where that intense and ethereal bond that we share with our horses can flourish.