🎲 Regression, Roulette, and Racehorses:

How We Trick Ourselves with Data (and Sometimes Catch It)

Jul 03, 2025

“All models are wrong, but some are useful. And some are just drunk.” — Probably not Karl Pearson

🧠 1. The Horse Race Inside the Regression

If you’ve ever run a regression, you’ve already been to a horse race—whether you knew it or not.

Each variable in your model is a horse, sprinting to explain the variance in your dependent variable. Some fall behind (low significance), some run in packs (multicollinearity), and some look fast until you test them out-of-sample.

Even when comparing whole models—say, does cost or demand better explain prices?—you’re running a model horse race. The winner? Whichever crosses the finish line with the lowest AIC or highest predictive power.

But just like real horse races, we’re often betting with biased odds, incomplete info, and far too much faith in our spreadsheets.
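For the curious, here is a minimal sketch of a model horse race scored by AIC. Everything in it is an assumption made for illustration: the data are synthetic, the names `cost`, `demand`, and `price` are invented, and the helper `ols_aic` is a hypothetical convenience function that computes the Gaussian AIC up to an additive constant.

```python
import numpy as np

def ols_aic(x, y):
    """Fit OLS with an intercept and return the Gaussian AIC (up to a constant)."""
    X = np.column_stack([np.ones(len(y)), x])      # add an intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares coefficients
    rss = float(np.sum((y - X @ beta) ** 2))       # residual sum of squares
    n, k = X.shape
    return n * np.log(rss / n) + 2 * k             # lower is better

rng = np.random.default_rng(0)
n = 200
cost = rng.normal(size=n)      # hypothetical "cost" driver
demand = rng.normal(size=n)    # hypothetical "demand" driver
# Synthetic prices: by construction, demand matters more than cost.
price = 0.3 * cost + 1.0 * demand + rng.normal(size=n)

aic_cost = ols_aic(cost, price)
aic_demand = ols_aic(demand, price)
winner = "demand" if aic_demand < aic_cost else "cost"
print(f"AIC cost={aic_cost:.1f}, AIC demand={aic_demand:.1f}, winner={winner}")
```

The demand horse wins here only because the race was fixed when the data were generated. With real data nobody tells you how the track was built, which is the whole problem.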

🎰 2. The Statistician Who Misread a Roulette Wheel

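Back in the 1890s, Karl Pearson, one of the founders of modern statistics, got hold of roulette results from Monte Carlo that a newspaper published. Ever the data hound, he ran the numbers and found patterns a fair wheel should almost never produce. By most accounts the overall split of red and black looked plausible, but the runs of each color did not.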

His conclusion? The wheel was rigged.

But the real problem was more human:

The journalist was lazy.

As the story is usually told, the wheel was fine. The data was the problem: edited, cleaned, or simply made up. Pearson was doing solid stats on garbage input.

Lesson? Even brilliant minds can be misled by flawed data. We don’t just need good methods—we need to ask who’s telling the story.
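To see how made-up data can give itself away, here is a rough sketch with two standard checks: a chi-square test of the red/black balance and a Wald-Wolfowitz runs test. This is not a reconstruction of Pearson’s actual calculation. The `reported` sequence is invented for illustration, the zero pocket is ignored, and the function names are hypothetical.

```python
import math

def chi2_balance(n_red, n_black):
    """Chi-square test (1 df) that red and black are equally likely."""
    expected = (n_red + n_black) / 2
    chi2 = ((n_red - expected) ** 2 + (n_black - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))   # exact upper tail for 1 df
    return chi2, p_value

def runs_z(seq):
    """Wald-Wolfowitz runs test: z-score of the observed number of runs."""
    n1, n2 = seq.count("R"), seq.count("B")
    n = n1 + n2
    runs = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - mean) / math.sqrt(var)

# Hypothetical "reported" spins: perfectly balanced colors,
# but they alternate far more often than chance would allow.
reported = "RBRBBRBRRB" * 20

chi2, p = chi2_balance(reported.count("R"), reported.count("B"))
z = runs_z(reported)
print(f"balance: chi2={chi2:.2f}, p={p:.2f}; runs test: z={z:.1f}")
```

The color counts pass the balance test with flying colors, while the runs test returns a z-score far beyond anything a fair wheel would plausibly produce. People who invent "random" sequences tend to switch colors too often, and that is the sort of fingerprint a test like this can pick up.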

🍺 3. The Brewer Who Invented a Revolution

A few years after Pearson’s roulette puzzle, William Sealy Gosset was knee-deep in barley at Guinness. Tasked with maintaining beer quality from small samples, he needed a way to account for uncertainty when data was limited.

His solution? Student’s t-distribution, published in 1908 under the pseudonym “Student” because Guinness wouldn’t let employees publish under their real names.

It was a quietly radical idea:

You can make solid inferences even with tiny data… if you adjust for how wrong you might be.

Modern A/B testing, medical trials, even betting models owe him a toast.
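Here is a small sketch of what “adjusting for how wrong you might be” looks like in practice: a 95% interval for a mean from five measurements, built once with Student’s t and once with the normal approximation. The `batch` values are invented for illustration, `mean_interval` is a hypothetical helper, and the t critical values are copied from standard tables rather than computed.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t by degrees of freedom,
# taken from standard tables.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}
Z_CRIT_95 = statistics.NormalDist().inv_cdf(0.975)   # about 1.96

def mean_interval(sample, crit):
    """Interval for the mean: sample mean +/- crit * standard error."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - crit * se, m + crit * se

batch = [4.9, 5.3, 5.1, 4.7, 5.5]   # hypothetical measurements from one batch

t_lo, t_hi = mean_interval(batch, T_CRIT_95[len(batch) - 1])
z_lo, z_hi = mean_interval(batch, Z_CRIT_95)
print(f"t interval:      ({t_lo:.2f}, {t_hi:.2f})")
print(f"normal interval: ({z_lo:.2f}, {z_hi:.2f})")
```

With only five observations the t interval comes out roughly 40% wider than the normal one. That extra width is the price of honesty about a standard deviation you had to estimate from almost nothing.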

🧪 4. Enter Russell and Popper: Science Gets a Soul

While statisticians were inventing tools, philosophers were fighting over how we should use them.

Bertrand Russell tried to rebuild knowledge on logical certainty—a world where clarity and deduction would protect us from error.
Karl Popper disagreed. He said the best science isn’t about confirming what we already think—it’s about trying to kill our own ideas, and seeing what survives.

If Pearson had listened to Popper, he might’ve said:

“Interesting result… but can I falsify this with better data?”
Instead, he bet on the story. And lost.

🐎 Regression as a Gambling Habit

We love our models. We tune them, praise them, run significance tests like they’re holy rituals. But just like gamblers at the track or casino, we’re often:

Reading too much into short streaks
Ignoring the role of chance
Mistaking noise for signal

And sometimes, like Pearson, we run clean regressions on dirty data, and walk away with the illusion of knowledge.
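Mistaking noise for signal is easy to demonstrate. In the sketch below, both the outcome and all 100 predictors are pure random noise, yet some predictors still clear the usual 5% significance bar. The setup is entirely synthetic, and the cutoff `R_CRIT` is the tabulated critical correlation for 50 observations, used here as an assumption instead of computing p-values.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_predictors = 50, 100

y = rng.normal(size=n_obs)                    # outcome: pure noise
X = rng.normal(size=(n_obs, n_predictors))    # predictors: also pure noise

# Pearson correlation of each predictor with y.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

# |r| > 0.279 corresponds to p < 0.05 (two-sided) when n = 50.
R_CRIT = 0.279
false_hits = int(np.sum(np.abs(r) > R_CRIT))
print(f"{false_hits} of {n_predictors} noise predictors look 'significant'")
```

On average about five of the hundred will look like winners, even though none of them can run. Test enough horses and a few will always seem fast.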

🔁 What’s the Point?

All of these stories—from beer and roulette to philosophy and horses—tell the same tale:

Science isn’t just math. It’s humility in the face of uncertainty.

Run your regression, but ask: What am I really measuring?
Bet on a model, but know the track conditions.
Learn from Gosset: Adjust for the unknown.
Remember Pearson: Even clean math can give you dirty answers.
And take a lesson from Popper: If your model can’t be wrong, it’s not really science.

🧭 Final Thought

The next time someone tells you their regression proves something—ask them which horse won, who fed the horses, and whether the race was even real.

Because sometimes, the most dangerous thing in statistics… is confidence.

Like this post? Share it with your data-loving, beer-drinking, gambler-philosopher friends. Or run a regression on who clicks. 🍺📊🐎


Dennis’s Substack
