Persistence Is Not the Same as Precision
Why I avoid variable schedules in my positive-reinforcement-based training
Positive-reinforcement-based horse training is becoming more popular. I see that as a good thing. But growth brings problems. Any approach in its expansion phase will include a wide range of skill levels.
When more people adopt an approach, the tools often spread faster than the understanding. Food, markers, shaping: all of these can be bolted onto existing systems or applied without much depth of knowledge. That doesn't do horses any favours.
I see this play out in reinforcement schedules. The pressure to ‘move to variable’ or ‘thin your reinforcement’ is strong. New converts want to demonstrate progress. Confident theorists can quote the science but don’t have the practical depth to apply it well.
So people thin their reinforcement.
They move from continuous reinforcement to some form of variable reinforcement because they believe that is what progression demands. They stop reinforcing every correct response. They reinforce every few responses, or only the very best ones. The gaps widen. Not because the behaviour is genuinely fluent, but because that is what they think good training looks like.
Out comes the slot machine analogy. Variable reinforcement creates persistence. Science says so.
Yes, maybe, in a Skinner box.
Skinner’s classic pigeon project (often referred to as the 300-peck pigeon) demonstrated that behaviour continues when reinforcement is unpredictable. What it didn’t demonstrate is that behaviour improves under those conditions. It isn’t designed to. Variable schedules tend to stabilise (or reduce) what you have. They don’t refine it. And for several ethical reasons, I am not prepared to model my training on laboratory deprivation studies in boxes.
Coming off continuous reinforcement can create more effort and sometimes more intensity. Not more finesse. The horse is responding to uncertainty rather than problem solving.
Continuous reinforcement does something different. It keeps the feedback loop tight. Every correct response produces a clear outcome. That clarity allows you to sharpen details. To polish a behaviour.
With enough history of continuous reinforcement, behaviour becomes habitual. It becomes deeply embedded. At that point, you don’t need to rely on unpredictability to hold it together.
It makes better sense to use continuous reinforcement until a behaviour feels reliably habitual. Even then, continuous reinforcement still has a place for discrete behaviours that require precision: microshaping sessions that keep them alive. Because the horse expects, and will receive, reinforcement, a slight delay can produce a subtle change in effort that can be shaped.
Progression is not about removing reinforcement as quickly as possible. Create a habit. Refine that habit. Thin reinforcement using chains and sequences.
So much more than that slot machine analogy.


