Tag: Mental Models

How to Use Occam’s Razor Without Getting Cut

Occam’s razor is one of the most useful, yet misunderstood, models in your mental toolbox for solving problems more quickly and efficiently. Here’s how to use it.

***

Occam’s razor (also known as the “law of parsimony”) is a problem-solving principle which serves as a useful mental model. A philosophical razor is a tool used to eliminate improbable options in a given situation. Occam’s is the best-known example.

Occam’s razor can be summarized as follows:

Among competing hypotheses, the one with the fewest assumptions should be selected.

The Basics

In simpler language, Occam’s razor states that the simplest explanation is preferable to one that is more complex. Simple theories are easier to verify. Simple solutions are easier to execute.

In other words, we should avoid looking for excessively complex solutions to a problem, and focus on what works given the circumstances. Occam’s razor can be used in a wide range of situations, as a means of making rapid decisions and establishing truths without empirical evidence. It works best as a mental model for making initial conclusions before the full scope of information can be obtained.

Science and math offer interesting lessons that demonstrate the value of simplicity. For example, the principle of minimum energy supports Occam’s razor. This facet of the second law of thermodynamics states that wherever possible, the use of energy is minimized. Physicists use Occam’s razor in the knowledge that they can rely on everything to use the minimum energy necessary to function. A ball at the top of a hill will roll down in order to be at the point of minimum potential energy. The same principle is present in biology. If a person repeats the same action on a regular basis in response to the same cue and reward, it will become a habit as the corresponding neural pathway is formed. From then on, their brain will use less energy to complete the same action.

The History of Occam’s Razor

The concept of Occam’s razor is credited to William of Ockham, a 14th-century friar, philosopher, and theologian. While he did not coin the term, his characteristic way of making deductions inspired other writers to develop the heuristic. Indeed, the concept of Occam’s razor is an ancient one. Aristotle produced the oldest known statement of the concept, saying, “We may assume the superiority, other things being equal, of the demonstration which derives from fewer postulates or hypotheses.”

Robert Grosseteste expanded on Aristotle’s writing in the 1200s, declaring

That is better and more valuable which requires fewer, other circumstances being equal…. For if one thing were demonstrated from many and another thing from fewer equally known premises, clearly that is better which is from fewer because it makes us know quickly, just as a universal demonstration is better than particular because it produces knowledge from fewer premises. Similarly, in natural science, in moral science, and in metaphysics the best is that which needs no premises and the better that which needs the fewer, other circumstances being equal.

Nowadays, Occam’s razor is an established mental model which can form a useful part of a latticework of knowledge.


Examples of the Use of Occam’s Razor

The Development of Scientific Theories

Occam’s razor is frequently used by scientists, in particular for theoretical matters. The simpler a hypothesis is, the more easily it can be proven or falsified. A complex explanation for a phenomenon involves many factors which can be difficult to test or lead to issues with the repeatability of an experiment. As a consequence, the simplest solution which is consistent with the existing data is preferred. However, it is common for new data to allow hypotheses to become more complex over time. Scientists choose to opt for the simplest solution as the current data permits, while remaining open to the possibility of future research allowing for greater complexity.

The version used by scientists can best be summarized as:

When you have two competing theories that make exactly the same predictions, the simpler one is better.

The use of Occam’s razor in science is also a matter of practicality. Obtaining funding for simpler hypotheses tends to be easier, as they are often cheaper to prove.

Albert Einstein referred to Occam’s razor when developing his theory of special relativity. He formulated his own version: “It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” Or, “Everything should be made as simple as possible, but not simpler.”

The physicist Stephen Hawking advocates for Occam’s razor in A Brief History of Time:

We could still imagine that there is a set of laws that determines events completely for some supernatural being, who could observe the present state of the universe without disturbing it. However, such models of the universe are not of much interest to us mortals. It seems better to employ the principle known as Occam’s razor and cut out all the features of the theory that cannot be observed.

Isaac Newton used Occam’s razor too when developing his theories. Newton stated: “We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.” He sought to make his theories, including the three laws of motion, as simple as possible, with only the necessary minimum of underlying assumptions.

Medicine

Modern doctors use a version of Occam’s razor, stating that they should look for the fewest possible causes to explain their patient’s multiple symptoms, and give preference to the most likely causes. A doctor we know often repeats the aphorism that “common things are common.” Interns are instructed, “when you hear hoofbeats, think horses, not zebras.” For example, a person displaying influenza-like symptoms during an epidemic would be considered more likely to be suffering from influenza than an alternative, rarer disease. Making minimal diagnoses reduces the risk of over-treating a patient, causing panic, or causing dangerous interactions between different treatments. This is of particular importance within the current medical model, where patients are likely to see numerous health specialists and communication between them can be poor.

Prison Abolition and Fair Punishment

Occam’s razor has long played a role in attitudes towards the punishment of crimes. In this context, it refers to the idea that people should be given the least punishment necessary for their crimes. This is to avoid the excessive penal practices which were popular in the past. For example, a 19th-century English convict could receive five years of hard labor for stealing a piece of food.

The concept of penal parsimony was pioneered by Jeremy Bentham, the founder of utilitarianism. He held that punishments should not cause more pain than they prevent. Life imprisonment for murder could be seen as justified in that it might prevent a great deal of potential pain, should the perpetrator offend again. On the other hand, long-term imprisonment of an impoverished person for stealing food causes substantial suffering without preventing any.

Bentham’s writings on the application of Occam’s razor to punishment led to the prison abolition movement and many modern ideas related to rehabilitation.

Exceptions and Issues

It is important to note that, like any mental model, Occam’s razor is not foolproof. Use it with care, lest you cut yourself. This is especially crucial when it comes to important or risky decisions. There are exceptions to any rule, and we should never blindly follow the results of applying a mental model which logic, experience, or empirical evidence contradict. When you hear hoofbeats behind you, in most cases you should think horses, not zebras—unless you are out on the African savannah.

Furthermore, simple is as simple does. A conclusion can’t rely just on its simplicity. It must be backed by empirical evidence. And when using Occam’s razor to make deductions, we must avoid falling prey to confirmation bias. In the case of the NASA moon landing conspiracy theory, for example, some people consider it simpler for the moon landing to have been faked, others for it to have been real. Lisa Randall best expressed the issues with the narrow application of Occam’s razor in her book, Dark Matter and the Dinosaurs: The Astounding Interconnectedness of the Universe:

Another concern about Occam’s Razor is just a matter of fact. The world is more complicated than any of us would have been likely to conceive. Some particles and properties don’t seem necessary to any physical processes that matter—at least according to what we’ve deduced so far. Yet they exist. Sometimes the simplest model just isn’t the correct one.

This is why it’s important to remember that opting for a simpler explanation still requires work: it may be easier to falsify, but falsifying it still takes effort. And the simpler explanation, although it has a higher chance of being correct, is not always true.

Occam’s razor is not intended to be a substitute for critical thinking. It is merely a tool to help make that thinking more efficient. Harlan Coben has disputed many criticisms of Occam’s razor by stating that people fail to understand its exact purpose:

Most people oversimplify Occam’s razor to mean the simplest answer is usually correct. But the real meaning, what the Franciscan friar William of Ockham really wanted to emphasize, is that you shouldn’t complicate, that you shouldn’t “stack” a theory if a simpler explanation was at the ready. Pare it down. Prune the excess.

Remember, Occam’s razor is complemented by other mental models, including the fundamental attribution error, Hanlon’s razor, confirmation bias, the availability heuristic, and hindsight bias. The nature of mental models is that they tend to interlock and work best in conjunction.

Externalities: Why We Can Never Do “One Thing”

No action exists in a vacuum. Every action sends out ripples, with consequences we can and can’t see. Here are the three types of externalities; understanding them can help guide our actions so they don’t come back to bite us.

***

An externality affects someone without them agreeing to it. As with unintended consequences, externalities can be positive or negative. Understanding the types of externalities and the impact they have in our lives can help us improve our decision making, and how we interact with the world.

Externalities provide useful mental models for understanding complex systems. They show us that systems don’t exist in isolation from other systems. Because externalities affect uninvolved third parties, they are a form of market failure: an inefficient allocation of resources.

We both create and are subject to externalities. Most are very minor but compound over time. They can inflict numerous second-order effects. Someone reclines their seat on an airplane. They get the benefit of comfort. The person behind bears the cost of discomfort by having less space. One family member leaves their dirty dishes in the sink. They get the benefit of using the plate. Someone else bears the cost of washing it later. We can’t expect to interact with any system without repercussions. Over time, even minor externalities can cause significant strain in our lives and relationships.

The First Law of Ecology

To understand externalities it is first useful to consider second-order consequences. In Filters Against Folly, Garrett Hardin describes what he considers to be the First Law of Ecology: We can never do one thing. Whenever we interact with a system, we need to ask, “And then what? What will the wider repercussions of our actions be?” There is bound to be at least one externality.

Hardin gives the example of the Prohibition Amendment in the U.S. In 1920, lawmakers banned the production and sale of alcoholic beverages throughout the entire country. This was in response to an extended campaign by those who believed alcohol was evil. It wasn’t enough to restrict its consumption—it needed to go.

The addition of 61 words to the American Constitution changed the social and legal landscape for over a decade. Policymakers presumably thought they could make the change and people would stop drinking. But Prohibition led to numerous externalities. Alcohol is an important part of many people’s lives. Few were willing to suddenly give it up without a fight. The demand was more than strong enough to ensure a black-market supply re-emerged.

Wealthy people stockpiled alcohol in their homes before the ban went into effect. Thousands of speakeasies and gin joints flourished. Walgreens grew from 20 stores to 500, in large part due to its sales of ‘medicinal’ whiskey. Former alcohol producers simply sold the ingredients for people to make their own. Gangsters like Al Capone made their fortunes smuggling alcohol and murdered rivals in the process. Crime gangs undermined official institutions. Tax revenues plummeted. People lost their jobs. Prisons became overcrowded and bribery commonplace. Thousands died from crime and drinking unsafe homemade alcohol.

Policymakers did not fully ask, “And then what?” before legislating. Drinking did decrease during this time, by about half on average, but that fell far short of the total ban they had hoped for. The second-order consequences outweighed any benefits.

As economist Gregory Mankiw explains in Principles of Microeconomics,

In the presence of externalities, society’s interest in a market outcome extends beyond the well-being of buyers and sellers who participate in the market; it also includes the well-being of bystanders who are affected indirectly…. The market equilibrium is not efficient when there are externalities. That is, the equilibrium fails to maximize the total benefit to society as a whole.

Negative Externalities

Negative externalities can occur during the production or consumption of a service or good. Pollution is a useful example. If a factory pollutes nearby water supplies, it causes harm without incurring costs. The costs to society are high and are not reflected in the price of whatever the factory makes. Economists often view environmental damage as another factor in a production process. But even if pollution is taxed, the harmful effects don’t go away.

Transport and manufacturing release toxins into the environment, harming our health and altering our climate. The reality, though, is that these externalities are hard to see, and it is often difficult to trace them back to their root causes. There’s also the question of how responsible we are for the externalities we create.

Imagine you’re driving down the road. As you go by an apartment, the noise disturbs someone who didn’t agree to it. Your car emits air pollution, which affects everyone living nearby. Each of these small externalities will affect people you don’t see and who didn’t choose them. They won’t receive any compensation from you. Are you really responsible for the externalities you cause? If you’re not being outright careless or malicious, isn’t it just part of life? How much responsibility do we have as individuals, anyway?

Calling something a negative externality can be a convenient way of abdicating responsibility.

Positive Externalities

A positive externality confers an unexpected benefit on a third party. The producer doesn’t set out to provide it, nor do they receive compensation for it.

Scientific research often leads to positive externalities. Research findings can have applications beyond their initial scope. The resulting information becomes part of our collective knowledge base. However, the researcher who makes a discovery cannot receive the full benefits. Nor do they necessarily feel entitled to them.

Blaise Pascal and Pierre de Fermat developed probability theory to solve a gambling dispute. Their work went on to inform numerous disciplines (like the field of calculus) and transform our understanding of the world. Probabilities are now a core part of how we think. Pascal and Fermat created a positive externality.

Someone who comes up with an equation cannot expect compensation each time it gets used. As a result, the incentives to invest the time and effort to discover new equations are reduced. Patents and copyright laws change this for many kinds of intellectual work, allowing creators to protect and profit from their ideas for years before other people can freely use them. We all benefit, and researchers have an incentive to continue their work.

Network effects are an example of a positive externality. Silicon Valley understands this well. Each person who joins a network, like a marketplace app, increases the value to all other users. Those who own the network have an incentive to improve it to encourage new users. Everyone benefits from being able to communicate with more people. While we might not join a new network intending to improve it for other people, that is what normally happens. (On the flip side, network effects can also produce negative externalities, as too many members can decrease the value of a network.)

Positive externalities often lead to the “free rider” problem. When we enjoy something that we aren’t paying for, we tend not to value it. Not paying can remove the incentive to look after a resource and leads to a Tragedy of the Commons situation. As Aristotle put it, “For that which is common to the greatest number has the least care bestowed upon it.” A good portion of online content succumbs to the free rider problem. We enjoy it and yet we don’t pay for it. We expect it to be free and yet, if users weren’t willing to support sites like Farnam Street, they would likely fold, start publishing lower quality articles, or sell readers to advertisers who collect their data. The end result, as we see too frequently, is low-quality content funded by page-view advertising. (This is why we have a membership program. Members of our learning community create a positive externality for non-members by helping support the free content.)

Positional Externalities

Positional externalities are a form of second-order effect. They occur when one person’s choices change the standard against which everyone else’s choices are perceived or valued.

For example, consider what happens when a person decides to start staying at the office an hour late. Perhaps they want a promotion and think it will endear them to managers. Parkinson’s Law states that tasks expand to fit the time allocated to them. What this person would otherwise get done by 5pm, now takes until 6pm. Staying late becomes their norm. Their co-workers notice and start to also stay late. Before long, staying at the office until 6pm becomes the standard for everyone. Anyone who leaves at 5pm is perceived as lazy. Now that 6pm is the norm, everyone suffers. They are forced to work more without deriving any real benefits. It’s a lose-lose situation for everyone.

Someone we know once made an investment with a nearly unlimited return by gaming the system. He worked for an investment firm that valued employees according to a perception of how hard they worked and not necessarily by their results. Each Monday he brought in a series of sport coats and left them in the office. He paid the cleaning staff $20 a week to change the coat hanging on his chair and to turn on his computer. No matter what happened, it appeared he was always the first one into the office even though he often didn’t show up from a “client meeting” until 10. When it came to bonus time, he’d get an enormous return on that $20 investment.

Purchasing luxury goods can create positional externalities. Veblen goods are items we value because of their scarcity and high cost. Diamonds, Lamborghinis, tailor-made suits — owning them is a status symbol, and they lose their value if they become cheaper or if too many people have them. As Luca Lambertini puts it in The Economics of Vertically Differentiated Markets,

The utility derived from consumption is a function of the quantity purchased relative to the average of the society or the reference group to whom the consumer compares.

In other words, a shiny new car seems more valuable if all your friends are driving battered old wrecks. If they have equally (or more) fancy cars, the value of yours drops. At some point, it seems worthless and it’s time to find a new one. In this way, the purchase of a Veblen good confers a positional externality on other people who own it too.

That utility can also be a matter of comparison. A person earning $40,000 a year while their friends earn $30,000 will be happier than one earning $60,000 when their friends earn $70,000. When someone’s salary increases, it raises the bar, giving others a new point of reference.

We can confer positional externalities on ourselves by changing our attitudes. Let’s say someone enjoys wine but is not a connoisseur. A $10 bottle and a $100 bottle make them equally happy. When they decide to go on a course and learn the subtleties and technicalities of fine wines, they develop an appreciation for the $100 wine and a distaste for the $10. They may no longer be able to enjoy a cheap drink because they raised their standards.

Conclusion

Externalities are everywhere. It’s easy to ignore the impact of our decisions—to recline an airplane seat, to stay late at the office, or drop litter. Eventually though, someone always ends up paying. Like the villagers in Hardin’s Tragedy of the Commons, who end up with no grass for their animals, we run the risk of ruining a good thing if we don’t take care of it. Keeping the three types of externalities in mind is a useful way to make decisions that won’t come back to bite you. Whenever we interact with a system, we should remember to ask Hardin’s question: and then what?

Earning Your Stripes with Patrick Collison [The Knowledge Project #32]

On this episode of The Knowledge Project, Patrick Collison (@patrickc), CEO and co-founder of Stripe, shares wise insights on success, failure, management, decision making, learning, and so much more. Grab a pen…


On this episode of the Knowledge Project, I chat with Patrick Collison, co-founder and CEO of the leading online payment processing company, Stripe. If you’ve purchased anything online recently, there’s a good chance that Stripe facilitated the transaction.

What is now an organization with over a thousand employees, handling billions of dollars of online purchases every year, began as a small side experiment while Patrick and his brother John were in college.

During our conversation, Patrick shares the details of their unlikely journey and some of the hard-earned wisdom he picked up along the way. I hope you have something handy to write with because the nuggets per minute in this episode are off the charts. Patrick was so open and generous with his responses that I’m really excited for you to hear what he has to say.

Here are just a few of the things we cover:

  • The biggest (and most valuable) mistakes Patrick made in the early days of Stripe and how they helped him get better
  • The characteristics that Patrick looks for in a new hire to fit and contribute to the Stripe company culture
  • What compelled him and his brother to move forward with the early concept of Stripe, even though on paper it was doomed to fail from the start
  • The gaps Patrick saw in the market that dozens of other processing companies were missing — and how he capitalized on them
  • The lessons Patrick learned from scaling Stripe from two employees (him and his brother) to nearly 1,000 today
  • How he evaluates the upsides and potential dangers of speculative positions within the company
  • How his Irish upbringing influenced his ability to argue and disagree without taking offense (and how we can all be a little more “Irish”)
  • The power of finding the right peer group in your social and professional circles and how impactful and influential it can be in determining where you end up.
  • The 4 ways Patrick has modified his decision-making process over the last 5 years and how it’s helped him develop as a person and as a business leader (this part alone is worth the listen)
  • Patrick’s unique approach to books and how he chooses what he’s going to spend his time reading
  • …life in Silicon Valley, Baumol’s cost disease, and so, so much more.

Patrick truly is one of the warmest, most humble, and most down-to-earth people I’ve had the pleasure of speaking with, and I thoroughly enjoyed our conversation. I hope you will too!


Transcript

Normally only members of our learning community have access to transcripts; however, we pick one or two a year to make available to everyone. Here’s the complete transcript of the interview with Patrick.

If you liked this, check out other episodes of The Knowledge Project.

***

Members can discuss this podcast on the Learning Community Forum

Poker, Speeding Tickets, and Expected Value: Making Decisions in an Uncertain World

You can train your brain to think like CEOs, professional poker players, investors, and others who make tricky decisions in an uncertain world by weighing probabilities.

All decisions involve potential tradeoffs and opportunity costs. The question is, how can we make the best possible choices when the factors involved are often so complicated and confusing? How can we determine which statistics and metrics are worth paying attention to? How do we think about averages?

Expected value is one of the simplest tools you can use to think better. While not a natural way of thinking for most people, it instantly turns the world into shades of grey by forcing us to weigh probabilities and outcomes. Once we’ve mastered it, our decisions become supercharged. We know which risks to take, when to quit projects, and when to go all in.

“Take the probability of loss times the amount of possible loss from the probability of gain times the amount of possible gain. That is what we’re trying to do. It’s imperfect but that’s what it’s all about.”

— Warren Buffett

Expected value refers to the long-run average of a random variable.

If you flip a fair coin ten times, the heads-to-tails ratio will probably not be exactly equal. If you flip it one hundred times, the ratio will be closer to 50:50, though still probably not exact. But over a huge number of flips, you can expect heads to come up half the time and tails the other half. The law of large numbers dictates that the observed ratio will, in the long run, converge toward the expected value, even if the first few flips seem lopsided.

The more coin flips, the closer you get to the 50:50 ratio. If you bet a sum of money on a coin flip, the potential winnings on a fair coin have to be bigger than your potential loss to make the expected value positive.
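
To make the long-run idea concrete, here is a minimal Python sketch. The flip counts and the win/loss amounts ($2 against $1) are our own illustrative numbers, not from the text; the point is simply that the heads ratio drifts toward 50% as flips accumulate, and that a bet whose win exceeds its loss has a positive expected value.

    import random

    def heads_ratio(num_flips):
        """Flip a simulated fair coin num_flips times and return the fraction of heads."""
        heads = sum(random.random() < 0.5 for _ in range(num_flips))
        return heads / num_flips

    # The ratio drifts toward 0.5 as the number of flips grows (law of large numbers).
    for n in (10, 100, 10_000, 1_000_000):
        print(f"{n:>9} flips -> heads ratio {heads_ratio(n):.4f}")

    # A bet that wins $2 on heads and loses $1 on tails (illustrative numbers).
    win, loss, p_heads = 2.0, 1.0, 0.5
    expected_value = p_heads * win - (1 - p_heads) * loss
    print(f"Expected value per bet: ${expected_value:.2f}")  # +$0.50, positive because win > loss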

We make many expected-value calculations without even realizing it. If we decide to stay up late and have a few drinks on a Tuesday, we regard the expected value of an enjoyable evening as higher than the expected costs the following day. If we decide to always leave early for appointments, we weigh the expected value of being on time against the frequent instances when we arrive early. When we take on work, we view the expected value in terms of income and other career benefits as higher than the cost in terms of time and/or sanity.

Likewise, anyone who reads a lot knows that most books they choose will have minimal impact on them, while a few books will change their lives and be of tremendous value. Looking at the required time and money as an investment, books have a positive expected value (provided we choose them with care and make use of the lessons they teach).

These decisions might seem obvious. But the math behind them would be somewhat complicated if we tried to sit down and calculate it. Who pulls out a calculator before deciding whether to open a bottle of wine (certainly not me) or walk into a bookstore?

The factors involved are impossible to quantify in a non-subjective manner – like trying to explain how to catch a baseball. We just have a feel for them. This expected-value analysis is unconscious – something to consider if you have ever labeled yourself as “bad at math.”

Parking Tickets

Another example of expected value is parking tickets. Let’s say that a parking spot costs $5 and the fine for not paying is $10. If you can expect to be caught only one-third of the time, why pay for parking? The expected cost of skipping payment is about $3.33 per visit, less than the $5 fee, so the fine fails as a deterrent: you can park without paying three times and expect only $10 in fines, instead of paying $15 for three parking spots. But if the fine is $100, paying becomes the better choice whenever the chance of being caught is higher than one in twenty. This is why fines tend to seem excessive: they have to cover the people who are never caught while still giving everyone an incentive to pay.
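
As a rough sketch of that arithmetic in Python, using the dollar amounts and probabilities from the example above (the helper function is ours, purely for illustration):

    def expected_cost_of_skipping(fine, p_caught):
        """Expected cost per visit of parking without paying."""
        return fine * p_caught

    parking_fee = 5.00

    # $10 fine, caught one-third of the time: expected cost ~$3.33 < $5, so the fine fails as a deterrent.
    print(expected_cost_of_skipping(10.00, 1 / 3))

    # $100 fine: skipping only beats paying when the chance of being caught is below fee/fine = 1 in 20.
    print(parking_fee / 100.00)  # 0.05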

Consider speeding tickets. Here, the expected value can be more abstract, encompassing different factors. If speeding on the way to work saves 15 minutes, then a monthly $100 fine might seem worthwhile to some people. For most of us, though, a weekly fine would mean that speeding has a negative expected value. Add in other disincentives (such as the loss of your driver’s license), and speeding is not worth it. So the calculation is not just financial; it takes into account other tradeoffs as well.

The same goes for free samples and trial periods on subscription services. Many companies (such as Graze, Blue Apron, and Amazon Prime) offer generous free trials. How can they afford to do this? Again, it comes down to expected value. The companies know how much the free trials cost them. They also know the probability of someone paying afterward and the lifetime value of a customer. Basic math reveals why free trials are profitable. Say that a free trial costs the company $10 per person, and one in ten people then sign up for the paid service, going on to generate $150 in profits. The expected value is positive. If only one in twenty people sign up, the company needs to find a cheaper free trial or scrap it.
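
The same arithmetic can be sketched in a few lines, using the figures from the example (the function and the break-even step are our own illustration):

    def trial_expected_value(trial_cost, conversion_rate, lifetime_profit):
        """Expected profit per free trial handed out."""
        return conversion_rate * lifetime_profit - trial_cost

    # Numbers from the example: $10 trial cost, $150 lifetime profit per converted customer.
    print(trial_expected_value(10.0, 1 / 10, 150.0))  # +5.0 -> the trial pays for itself
    print(trial_expected_value(10.0, 1 / 20, 150.0))  # -2.5 -> make the trial cheaper or scrap it

    # Break-even conversion rate: trial cost divided by lifetime profit.
    print(10.0 / 150.0)  # ~0.067, i.e. roughly 1 customer in 15 must convert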

Similarly, expected value applies to services that offer a free “lite” version (such as Buffer and Spotify). Doing so costs them a small amount or even nothing. Yet it increases the chance of someone’s deciding to pay for the premium version. For the expected value to be positive, the combined cost of the people who never upgrade needs to be lower than the profit from the people who do pay.

Lottery tickets prove useless when viewed through the lens of expected value. If a ticket costs $1 and there is a possibility of winning $500,000, it might seem as if the expected value of the ticket is positive. But it is almost always negative. If one million people purchase a ticket, the expected return on each $1 ticket is only $0.50; the difference is the lottery’s profit. Only on sporadic occasions, such as when a rollover jackpot grows large enough, does the expected value turn positive, and even then the probability of winning remains minuscule.

Failing to think in terms of expected value is a common source of flawed reasoning. Getting a grasp of it can help us to overcome many limitations and cognitive biases.

“Constantly thinking in expected value terms requires discipline and is somewhat unnatural. But the leading thinkers and practitioners from somewhat varied fields have converged on the same formula: focus not on the frequency of correctness, but on the magnitude of correctness.”

— Michael Mauboussin

Expected Value and Poker

Let’s look at poker. How do professional poker players manage to win large sums of money and hold impressive track records? Well, we can be certain that the answer isn’t all luck, although there is some of that involved.

Professional players rely on mathematical mental models that create order among random variables. Although these models are basic, it takes extensive experience to create the fingerspitzengefühl (“fingertips feeling,” or instinct) necessary to use them.

A player needs to make correct calculations every minute of a game with an automaton-like mindset. Emotions and distractions can corrupt the accuracy of raw math.

In a game of poker, the expected value is the average return on each dollar invested in the pot. Each time a player bets or calls, they are weighing the probability of winning against the amount they must invest. If a player is risking $100 with a 1 in 5 probability of success, the pot must contain at least $500 for the call to be worthwhile: the expected winnings must at least equal the amount the player stands to lose. If the pot contains only $300 at that same probability, the expected value is negative. The idea is that even if this tactic is unsuccessful at times, in the long run, the player will profit.
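
Here is a simplified sketch of that pot-odds arithmetic. It treats “the pot” as the amount the player stands to win and ignores real-table complications such as implied odds and future betting rounds, which is an assumption on our part:

    def call_expected_value(pot, call_amount, p_win):
        """Expected value of a call: win the pot with probability p_win, lose the call otherwise."""
        return p_win * pot - (1 - p_win) * call_amount

    # Risking $100 with a 1-in-5 chance of winning:
    print(call_expected_value(500, 100, 0.2))  # +20.0 -> profitable in the long run
    print(call_expected_value(300, 100, 0.2))  # -20.0 -> negative expected value, fold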

Expected-value analysis gives players a clear idea of probabilistic payoffs. Successful poker players can win millions one week, then make nothing or lose money the next, depending on the probability of winning. Even the best possible hands can lose due to simple probability. With each move, players also need to use Bayesian updating to adapt their calculations, because sticking with a prior figure could prove disastrous. Casinos make their fortunes from people who bet on situations with a negative expected value.

Expected Value and the Ludic Fallacy

In The Black Swan, Nassim Taleb explains the difference between everyday randomness and randomness in the context of a game or casino. Taleb coined the term “ludic fallacy” to refer to “the misuse of games to model real-life situations.” (Or, as the website logicallyfallacious.com puts it: the assumption that flawless statistical models apply to situations where they don’t actually apply.)

In Taleb’s words, gambling is “sterilized and domesticated uncertainty. In the casino, you know the rules, you can calculate the odds… ‘The casino is the only human venture I know where the probabilities are known, Gaussian (i.e., bell-curve), and almost computable.’ You cannot expect the casino to pay out a million times your bet, or to change the rules abruptly during the game….”

Games like poker have a defined, calculable expected value. That’s because we know the outcomes, the cards, and the math. Most decisions are more complicated. If you decide to bet $100 that it will rain tomorrow, the expected value of the wager is incalculable. The factors involved are too numerous and complex to compute. Relevant factors do exist; you are more likely to win the bet if you live in England than if you live in the Sahara, for example. But that doesn’t rule out Black Swan events, nor does it give you the neat probabilities which exist in games. In short, there is a key distinction between Knightian risks, which are computable because we have enough information to calculate the odds, and Knightian uncertainty, which is non-computable because we don’t have enough information to calculate odds accurately. (This distinction between risk and uncertainty is based on the writings of economist Frank Knight.) Poker falls into the former category. Real life is in the latter. If we take the concept literally and only plan for the expected, we will run into some serious problems.

As Taleb writes in Fooled By Randomness:

Probability is not a mere computation of odds on the dice or more complicated variants; it is the acceptance of the lack of certainty in our knowledge and the development of methods for dealing with our ignorance. Outside of textbooks and casinos, probability almost never presents itself as a mathematical problem or a brain teaser. Mother nature does not tell you how many holes there are on the roulette table, nor does she deliver problems in a textbook way (in the real world one has to guess the problem more than the solution).

The Monte Carlo Fallacy

Even in the domesticated environment of a casino, probabilistic thinking can go awry if the principle of expected value is forgotten. This famously occurred at the Monte Carlo Casino in 1913. A group of gamblers lost millions when the roulette wheel landed on black 26 times in a row. That particular sequence is no more or less likely than any of the other 67,108,863 possible permutations, but the people present kept thinking, “It has to be red next time.” They saw the likelihood of the wheel landing on red as higher each time it landed on black. In hindsight, what sense does that make? A roulette wheel does not remember the color it landed on last time. The likelihood of either outcome is essentially the same with each spin (just under 50%, thanks to the green zero), regardless of the previous iteration. So the potential winnings on each spin need to be more than twice the bet a player makes, or the expected value is negative.
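
A quick simulation makes the independence point concrete. This sketch uses a simplified 50/50 wheel (ignoring the green zero, as the passage does) and looks at spins that follow a run of three blacks rather than twenty-six, purely so the streak occurs often enough to measure; the probability of red afterward is still about one half.

    import random

    def spin():
        """One spin of a simplified wheel: red or black with equal probability."""
        return "black" if random.random() < 0.5 else "red"

    streak = 0            # consecutive blacks seen so far
    reds_after_streak = 0
    samples = 0
    for _ in range(1_000_000):
        result = spin()
        if streak >= 3:   # the previous three spins were all black
            samples += 1
            reds_after_streak += (result == "red")
        streak = streak + 1 if result == "black" else 0

    print(reds_after_streak / samples)  # ~0.5, the streak carries no memory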

“A lot of people start out with a 400-horsepower motor but only get 100 horsepower of output. It’s way better to have a 200-horsepower motor and get it all into output.”

— Warren Buffett

Given all the casinos and roulette tables in the world, the Monte Carlo incident had to happen at some point. Perhaps some day a roulette wheel will land on red 26 times in a row and the incident will repeat. The gamblers involved did not consider the negative expected value of each bet they made. We know this mistake as the Monte Carlo fallacy (or the “gambler’s fallacy” or “the fallacy of the maturity of chances”) – the assumption that prior independent outcomes influence future outcomes that are actually also independent. In other words, people assume that “a random process becomes less random and more predictable as it is repeated”1.

It’s a common error. People who play the lottery for years without success think that their chance of winning rises with each ticket, but the expected value is unchanged between iterations. Amos Tversky and Daniel Kahneman consider this kind of thinking a component of the representativeness heuristic, stating that the more we believe we control random events, the more likely we are to succumb to the Monte Carlo fallacy.

Magnitude over Frequency

Steven Crist, in his book Bet with the Best, offers an example of how an expected-value mindset can be applied. Consider a hypothetical race with four horses. If you’re trying to maximize return on investment, you might want to avoid the horse with a high likelihood of winning. Crist writes,

The point of this exercise is to illustrate that even a horse with a very high likelihood of winning can be either a very good or a very bad bet, and that the difference between the two is determined by only one thing: the odds.2

Everything comes down to payoffs. A horse with a 50% chance of winning might be a good bet, but it depends on the payoff. The same holds for a 100-to-1 longshot. It’s not the frequency of winning but the magnitude of the win that matters.
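
To sketch Crist’s point with made-up numbers (the odds and probabilities below are ours, not from his book), the same 50% horse is a good bet at one price and a bad one at another, while a longshot can carry a positive expected value:

    def bet_expected_value(p_win, odds, stake=1.0):
        """Expected profit on a win bet at fractional odds (odds=1.0 means even money)."""
        return p_win * odds * stake - (1 - p_win) * stake

    # A 50% favorite: good at 6-to-5, bad at 4-to-5 (illustrative prices).
    print(bet_expected_value(0.50, 1.2))  # +0.10 per $1 staked
    print(bet_expected_value(0.50, 0.8))  # -0.10 per $1 staked

    # A 100-to-1 longshot with a true 2% chance of winning is actually the better bet.
    print(bet_expected_value(0.02, 100))  # +1.02 per $1 staked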

Error Rates, Averages, and Variability

When Bill Gates walks into a room with 20 people, the average wealth per person in the room quickly goes beyond a billion dollars. It doesn’t matter if the 20 people are wealthy or not; Gates’s wealth is off the charts and distorts the results.

An old joke tells of the man who drowns in a river which is, on average, three feet deep. If you’re deciding to cross a river and can’t swim, the range of depths matters a heck of a lot more than the average depth.
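
A tiny illustration with hypothetical depth readings shows why:

    from statistics import mean

    # Hypothetical depth readings (in feet) along the crossing point.
    depths = [1, 1, 2, 2, 9]

    print(mean(depths))  # 3.0 -> "three feet deep, on average"
    print(max(depths))   # 9   -> the number that matters if you can't swim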

The Use of Expected Value: How to Make Decisions in an Uncertain World

Thinking in terms of expected value requires discipline and practice. And yet, the top performers in almost any field think in terms of probabilities. While this isn’t natural for most of us, once you implement the discipline of the process, you’ll see the quality of your thinking and decisions improve.

In poker, players can predict the likelihood of a particular outcome. In the vast majority of cases, we cannot predict the future with anything approaching accuracy. So what use is expected value outside gambling? It turns out, quite a lot. Recognizing how expected value works puts any of us at an advantage. We can mentally leap through various scenarios and understand how they affect outcomes.

Expected value takes into account wild deviations. Averages are useful, but they have limits, as the man who tried to cross the river discovered. When making predictions about the future, we need to consider the range of outcomes. The greater the possible variance from the average, the more our decisions should account for a wider range of outcomes.

There’s a saying in the design world: when you design for the average, you design for no one. Large deviations can mean more risk, which is not always a bad thing. So expected-value calculations take into account the deviations. If we can make decisions with a positive expected value and the lowest possible risk, we are open to large benefits.

Investors use expected value to make decisions. Choices with a positive expected value and minimal risk of losing money are wise. Even if some losses occur, the net gain should be positive over time. In investing, unlike in poker, the potential losses and gains cannot be calculated in exact terms. Expected-value analysis reveals opportunities that people who just use probabilistic thinking often miss. A trade with a low probability of success can still carry a high expected value. That’s why it is crucial to have a large number of robust mental models. As useful as probabilistic thinking can be, it has far more utility when combined with expected value.

Understanding expected value is also an effective way to overcome the sunk costs fallacy. Many of our decisions are based on non-recoverable past investments of time, money, or resources. These investments are irrelevant; we can’t recover them, so we shouldn’t factor them into new decisions. Sunk costs push us toward situations with a negative expected value. For example, consider a company that has invested considerable time and money in the development of a new product. As the launch date nears, they receive irrefutable evidence that the product will be a failure. Perhaps research shows that customers are uninterested, or a competitor launches a similar, better product. The sunk costs fallacy would lead them to release their product anyway. Even if they take a loss. Even if it damages their reputation. After all, why waste the money they spent developing the product? Here’s why: Because the product has a negative expected value, which will only worsen their losses. An escalation of commitment will only increase sunk costs.

When we try to justify a prior expense, calculating the expected value can prevent us from worsening the situation. The sunk costs fallacy robs us of our most precious resource: time. Each day we are faced with the choice between continuing and quitting numerous endeavors. Expected-value analysis reveals where we should continue, and where we should cut our losses and move on to a better use of time and resources. It’s an efficient way to work smarter, and not engage in unnecessary projects.

Thinking in terms of expected value will make you feel awkward when you first try it. That’s the hardest thing about it; you need to practice it a while before it becomes second nature. Once you get the hang of it, you’ll see that it’s valuable in almost every decision. That’s why the most rational people in the world constantly think about expected value. They’ve uncovered the key insight that the magnitude of correctness matters more than its frequency. And yet, human nature is such that we’re happier when we’re frequently right.

Footnotes
  • 1

    From https://rationalwiki.org/wiki/Gambler’s_fallacy, accessed on 11 January 2018.

  • 2

    Steven Crist, “Crist on Value,” in Andrew Beyer et al., Bet with the Best: All New Strategies From America’s Leading Handicappers (New York: Daily Racing Form Press, 2001), 63-64.

Understanding the Limitations of Maps

Maps are flawed but useful. For instance, we can leverage the experiences of others to help us navigate through territories that are, to us, new and unknown. We just have to understand and respect the inherent limitations of maps whose territories may have changed. We have to put some work into really seeing what the maps can show us. Here are three things you need to think about when using a map: The perspective, the author, and the territory.

The Perspective

Maps are an abstraction, which means information is lost in order to save space. So perhaps the most important thing we can do before reading a map is to stop and consider what choices have been made in the representation before us.

First, there are some limitations based on the medium used, like paper or digital, and the scale of the territory you are trying to represent. Take the solar system. Our maps of the solar system typically fit on one page. This makes them useful for understanding the order of the planets from the sun but does not even come close to conveying the size of the territory of space.

Bill Bryson explains in A Short History of Nearly Everything, “such are the distances, in fact, that it isn’t possible, in any practical terms, to draw the solar system to scale. … On a diagram of the solar system to scale, with the Earth reduced to about the diameter of a pea, Jupiter would be over a thousand feet away, and Pluto would be a mile and a half distant (and about the size of a bacterium, so you wouldn’t be able to see it anyway).”

Maps are, furthermore, a single visual perspective, chosen because the mapmaker believes it the best one for what they are trying to communicate. This perspective is both literal (what the mapmaker actually sees) and figurative (the biases that guide the choices they make).

It’s easy to understand how unique my perspective is. Someone standing three feet away from me is going to have a different perspective than I do. I’ve been totally amazed by the view out of my neighbour’s window.

Jerry Brotton, in his book A History of the World in Twelve Maps, reveals that “the problem of defining where the viewer stands in relation to a map of the world is one geographers have struggled with for centuries.” Right from the beginning, your starting point becomes your frame of reference, the centre of understanding that everything else links back to.

In an example that should be a classic (but isn’t, because of a legacy of visual representation that has yet to change), most of us seriously underestimate the size of Africa. Why? Because, as Tim Marshall explains in his book Prisoners of Geography, most of us use the standard Mercator world map, and “this, as do other maps, depicts a sphere on a flat surface and thus distorts shapes.” A world map always has to be distorted, and the distortion bends toward the view its maker is trying to present. This is how a northern-hemisphere-centric vision of the world got burned into our brains.

On the Mercator map, Africa looks roughly the size of Greenland; in fact, it is about 14 times larger. Don’t use the standard Mercator map to plan your hiking trip!

Knowing a map’s limitations in perspective points you to where you need to bring context. Consider this passage from Marshall’s book: “Africa’s coastline? Great beaches – really, really lovely beaches – but terrible natural harbors. Amazing rivers, but most of them are worthless for actually transporting anything, given that every few miles you go over a waterfall.”

A lot of maps wouldn’t show you this – the lines that are rivers are all drawn the same. So you’d look at the success the Europeans had with the Danube or the Rhine and think, why didn’t Africans think to use their rivers in the same way? And then maybe you decide to invest in an African mineral company, bringing to the table the brilliant idea of getting your products to market via river. And then they take you to the waterfalls.

The Author/Cartographer

Consider who draws the maps. A map of the modern-day Middle East will probably tell you more about the British and French than about any inhabitants of the region. In 1916 a British diplomat named Sykes and a French diplomat named Picot drew a line dividing the territory between their countries based on their own interests in the region, not on the cultures of the people living there or the physical formations that give it form.

Marshall explains, “The region’s very name is based on a European view of the world, and it is a European view of the region that shaped it. The Europeans used ink to draw lines on maps: they were lines that did not exist in reality and created some of the most artificial borders the world has seen. An attempt is now being made to redraw them in blood.”

The map creator is going to bring not only their understanding but also their biases and agenda. Even if your goal is to create the most accurate, unbiased map ever, that intent frames the decisions you make about what to represent and what to leave out. Even our relatively new digital maps make such choices at the outset: Google, for instance, blurs the faces of people captured in its Street View imagery to protect privacy.

Brotton argues that “a map always manages the reality it tries to show.” And as we have seen before, because there really isn’t one objective reality, maps need to be understood as portraying personal or cultural realities.

“No world map is, or can be, a definitive, transparent depiction of its subject that offers a disembodied eye onto the world.” All maps reflect our understanding of the territory at that moment in time. We change, and maps change with us.

The Territory

This leads to another pitfall. Get the right map. Or better yet, get multiple maps of the same territory. Different explorations require different maps. Don’t get comfortable with one and assume that’s going to explain everything you need. Change the angle.

Derek Hayes, in his Historical Atlas of Toronto, has put together a fascinating pictorial representation of the history of Toronto in maps. Sewer maps, transit maps, maps from before there were any roads, and planning maps for the future. Maps of buildings that were, and maps of buildings that are only dreams. Putting all these together starts to flesh out the context, allowing for an appreciation of a complex city versus a dot on a piece of paper. Maps may never be able to describe the whole territory, but the more you can combine them, the fewer blind spots you will have.

If you compared a map of American naval bases in 1947 with one from 1937, you would notice a huge discrepancy: the number increased significantly. Armed only with this map, you might conclude that in addition to fighting in WWII, the Americans invested a lot of resources in base building during the 1940s. But if you could get your hands on a map of British naval bases from 1937, you would conclude something entirely different.

As Marshall explains, “In the autumn of 1940, the British desperately needed more warships. The Americans had fifty to spare and so, with what was called the Destroyers for Bases Agreement, the British swapped their ability to be a global power for help in remaining in the war. Almost every British naval base in the Western Hemisphere was handed over.”

The message here is not to give up on maps. They can be wonderful and provide many useful insights. It is rather to understand their limitations. Each map carries the perspective of its creator and is limited by the medium it’s presented in. The more maps you have of a territory, the better you will understand the complexities of the terrain, and the better the decisions you can make as you navigate through it.


Scientific Concepts We All Ought To Know

John Brockman’s online scientific roundtable Edge.org does something fantastic every year: It asks all of its contributors (hundreds of them) to answer one meaningful question. Questions like What Have You Changed Your Mind About? and What is Your Dangerous Idea?

This year’s was particularly awesome for our purposes: What Scientific Term or Concept Ought To Be More Known?

The answers give us a window into over 200 brilliant minds, with the simple filtering mechanism that there’s something they know that we should probably know, too. We wanted to highlight a few of our favorites for you.

***

From Steven Pinker, a very interesting thought on The Second Law of Thermodynamics (Entropy). This reminded me of the central thesis of The Origin of Wealth by Eric Beinhocker, which we’ll cover in more depth in the future; we’ve referenced his work in the past.


The Second Law of Thermodynamics states that in an isolated system (one that is not taking in energy), entropy never decreases. (The First Law is that energy is conserved; the Third, that a temperature of absolute zero is unreachable.) Closed systems inexorably become less structured, less organized, less able to accomplish interesting and useful outcomes, until they slide into an equilibrium of gray, tepid, homogeneous monotony and stay there.

In its original formulation the Second Law referred to the process in which usable energy in the form of a difference in temperature between two bodies is dissipated as heat flows from the warmer to the cooler body. Once it was appreciated that heat is not an invisible fluid but the motion of molecules, a more general, statistical version of the Second Law took shape. Now order could be characterized in terms of the set of all microscopically distinct states of a system: Of all these states, the ones that we find useful make up a tiny sliver of the possibilities, while the disorderly or useless states make up the vast majority. It follows that any perturbation of the system, whether it is a random jiggling of its parts or a whack from the outside, will, by the laws of probability, nudge the system toward disorder or uselessness. If you walk away from a sand castle, it won’t be there tomorrow, because as the wind, waves, seagulls, and small children push the grains of sand around, they’re more likely to arrange them into one of the vast number of configurations that don’t look like a castle than into the tiny few that do.

The Second Law of Thermodynamics is acknowledged in everyday life, in sayings such as “Ashes to ashes,” “Things fall apart,” “Rust never sleeps,” “Shit happens,” “You can’t unscramble an egg,” “What can go wrong will go wrong,” and (from the Texas lawmaker Sam Rayburn), “Any jackass can kick down a barn, but it takes a carpenter to build one.”

Scientists appreciate that the Second Law is far more than an explanation for everyday nuisances; it is a foundation of our understanding of the universe and our place in it. In 1915 the physicist Arthur Eddington wrote:

[…]

Why the awe for the Second Law? The Second Law defines the ultimate purpose of life, mind, and human striving: to deploy energy and information to fight back the tide of entropy and carve out refuges of beneficial order. An underappreciation of the inherent tendency toward disorder, and a failure to appreciate the precious niches of order we carve out, are a major source of human folly.

To start with, the Second Law implies that misfortune may be no one’s fault. The biggest breakthrough of the scientific revolution was to nullify the intuition that the universe is saturated with purpose: that everything happens for a reason. In this primitive understanding, when bad things happen—accidents, disease, famine—someone or something must have wanted them to happen. This in turn impels people to find a defendant, demon, scapegoat, or witch to punish. Galileo and Newton replaced this cosmic morality play with a clockwork universe in which events are caused by conditions in the present, not goals for the future. The Second Law deepens that discovery: Not only does the universe not care about our desires, but in the natural course of events it will appear to thwart them, because there are so many more ways for things to go wrong than to go right. Houses burn down, ships sink, battles are lost for the want of a horseshoe nail.

Poverty, too, needs no explanation. In a world governed by entropy and evolution, it is the default state of humankind. Matter does not just arrange itself into shelter or clothing, and living things do everything they can not to become our food. What needs to be explained is wealth. Yet most discussions of poverty consist of arguments about whom to blame for it.

More generally, an underappreciation of the Second Law lures people into seeing every unsolved social problem as a sign that their country is being driven off a cliff. It’s in the very nature of the universe that life has problems. But it’s better to figure out how to solve them—to apply information and energy to expand our refuge of beneficial order—than to start a conflagration and hope for the best.

Richard Nisbett, a social psychologist, has a great one: the Fundamental Attribution Error, a concept we’ve hit on before but one that most people still underappreciate.

Modern scientific psychology insists that explanation of the behavior of humans always requires reference to the situation the person is in. The failure to do so sufficiently is known as the Fundamental Attribution Error. In Milgram’s famous obedience experiment, two-thirds of his subjects proved willing to deliver a great deal of electric shock to a pleasant-faced middle-aged man, well beyond the point where he became silent after begging them to stop on account of his heart condition. When I teach about this experiment to undergraduates, I’m quite sure I’ve never convinced a single one that their best friend might have delivered that amount of shock to the kindly gentleman, let alone that they themselves might have done so. They are protected by their armor of virtue from such wicked behavior. No amount of explanation about the power of the unique situation into which Milgram’s subject was placed is sufficient to convince them that their armor could have been breached.

My students, and everyone else in Western society, are confident that people behave honestly because they have the virtue of honesty, conscientiously because they have the virtue of conscientiousness. (In general, non-Westerners are less susceptible to the fundamental attribution error, lacking as they do sufficient knowledge of Aristotle!) People are believed to behave in an open and friendly way because they have the trait of extroversion, in an aggressive way because they have the trait of hostility. When they observe a single instance of honest or extroverted behavior they are confident that, in a different situation, the person would behave in a similarly honest or extroverted way.

In actual fact, when large numbers of people are observed in a wide range of situations, the correlation for trait-related behavior runs about .20 or less. People think the correlation is around .80. In reality, seeing Carlos behave more honestly than Bill in a given situation increases the likelihood that he will behave more honestly in another situation from the chance level of 50 percent to the vicinity of 55 to 57 percent. People think that if Carlos behaves more honestly than Bill in one situation the likelihood that he will behave more honestly than Bill in another situation is 80 percent!
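A quick way to see how these figures hang together is a Monte Carlo sketch. Assuming (my assumption, not Nisbett’s) that each person’s trait-related scores across two situations are normally distributed with correlation r, a cross-situation correlation of .20 yields roughly the 55 to 57 percent consistency he cites, while the 80 percent figure people intuit would require a correlation of about .80:

```python
import numpy as np

rng = np.random.default_rng(0)

def consistency(r, n=500_000):
    """P(Carlos scores higher than Bill in situation 2, given he did in
    situation 1), when each person's scores across situations correlate at r."""
    cov = [[1.0, r], [r, 1.0]]
    carlos = rng.multivariate_normal([0, 0], cov, n)
    bill = rng.multivariate_normal([0, 0], cov, n)
    beat_1 = carlos[:, 0] > bill[:, 0]
    beat_2 = carlos[:, 1] > bill[:, 1]
    return (beat_1 & beat_2).sum() / beat_1.sum()

print(f"r = 0.20 -> {consistency(0.20):.2f}")  # ~0.56
print(f"r = 0.80 -> {consistency(0.80):.2f}")  # ~0.80
```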

How could we be so hopelessly miscalibrated? There are many reasons, but one of the most important is that we don’t normally get trait-related information in a form that facilitates comparison and calculation. I observe Carlos in one situation when he might display honesty or the lack of it, and then not in another for perhaps a few weeks or months. I observe Bill in a different situation tapping honesty, and then not in another for many months.

This implies that if people received behavioral data in such a form that many people are observed over the same time course in a given fixed situation, our calibration might be better. And indeed it is. People are quite well calibrated for abilities of various kinds, especially sports. The likelihood that Bill will score more points than Carlos in one basketball game given that he did in another is about 67 percent—and people think it’s about 67 percent.

Our susceptibility to the fundamental attribution error—overestimating the role of traits and underestimating the importance of situations—has implications for everything from how to select employees to how to teach moral behavior.

Cesar Hidalgo, author of what looks like an awesome book, Why Information Grows, wrote about Criticality, a concept central to understanding complex systems:

In physics we say a system is in a critical state when it is ripe for a phase transition. Consider water turning into ice, or a cloud that is pregnant with rain. Both of these are examples of physical systems in a critical state.

The dynamics of criticality, however, are not very intuitive. Consider the abruptness of freezing water. For an outside observer, there is no difference between cold water and water that is just about to freeze. This is because water that is just about to freeze is still liquid. Yet, microscopically, cold water and water that is about to freeze are not the same.

When close to freezing, water is populated by gazillions of tiny ice crystals, crystals that are so small that water remains liquid. But this is water in a critical state, a state in which any additional freezing will result in these crystals touching each other, generating the solid mesh we know as ice. Yet, the ice crystals that formed during the transition are infinitesimal. They are just the last straw. So, freezing cannot be considered the result of these last crystals. They only represent the instability needed to trigger the transition; the real cause of the transition is the criticality of the state.
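A toy simulation can show how abrupt this kind of transition is. The sketch below is my own illustration, not Hidalgo’s: it uses two-dimensional site percolation, occupying each cell of a grid with probability p and tracking the largest connected cluster, which jumps from a small fragment to spanning most of the system as p crosses the critical density of roughly 0.593:

```python
import numpy as np
from scipy.ndimage import label

rng = np.random.default_rng(42)
N = 200  # grid size

for p in (0.50, 0.57, 0.59, 0.61, 0.65):
    grid = rng.random((N, N)) < p            # occupy each cell with probability p
    labels, _ = label(grid)                  # 4-connected clusters of occupied cells
    sizes = np.bincount(labels.ravel())[1:]  # cluster sizes (drop empty background)
    share = sizes.max() / grid.sum()
    print(f"p = {p:.2f}: largest cluster holds {share:.0%} of occupied cells")
```

Below the threshold the largest cluster is a sliver; a few points above it, one cluster swallows most of the grid. The last few cells to be filled are, like the last ice crystals, only the trigger; the transition was prepared by the critical state itself.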

But why should anyone outside statistical physics care about criticality?

The reason is that history is full of individual narratives that maybe should be interpreted in terms of critical phenomena.

Did Rosa Parks start the civil rights movement? Or was the movement already running in the minds of those who had been promised equality and were instead handed discrimination? Was the collapse of Lehman Brothers an essential trigger for the Great Recession? Or was the financial system so critical that any disturbance could have done the trick?

As humans, we love individual narratives. We evolved to learn from stories and communicate almost exclusively in terms of them. But as Richard Feynman said repeatedly: The imagination of nature is often larger than that of man. So, maybe our obsession with individual narratives is nothing but a reflection of our limited imagination. Going forward we need to remember that systems often make individuals irrelevant. Just like none of your cells can claim to control your body, society also works in systemic ways.

So, the next time the house of cards collapses, remember to focus on why we were building a house of cards in the first place, instead of focusing on whether the last card was the queen of diamonds or a two of clubs.

The psychologist Adam Alter has another good one, on a concept we all naturally miss from time to time due to the structure of our minds: the Law of Small Numbers.

In 1832, a Prussian military analyst named Carl von Clausewitz explained that “three quarters of the factors on which action in war is based are wrapped in a fog of . . . uncertainty.” The best military commanders seemed to see through this “fog of war,” predicting how their opponents would behave on the basis of limited information. Sometimes, though, even the wisest generals made mistakes, divining a signal through the fog when no such signal existed. Often, their mistake was endorsing the law of small numbers—too readily concluding that the patterns they saw in a small sample of information would also hold for a much larger sample.

Both the Allies and Axis powers fell prey to the law of small numbers during World War II. In June 1944, Germany flew several raids on London. War experts plotted the position of each bomb as it fell, and noticed one cluster near Regent’s Park, and another along the banks of the Thames. This clustering concerned them, because it implied that the German military had designed a new bomb that was more accurate than any existing bomb. In fact, the Luftwaffe was dropping bombs randomly, aiming generally at the heart of London but not at any particular location over others. What the experts had seen were clusters that occur naturally through random processes—misleading noise masquerading as a useful signal.
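It is easy to reproduce this illusion. The sketch below (my own, with arbitrary numbers, not data from the actual raids) scatters hits uniformly at random over a map divided into equal squares; some squares still end up with several hits while many get none, which is exactly the kind of clustering the analysts read as precision targeting:

```python
import numpy as np

rng = np.random.default_rng(7)
n_hits, grid = 500, 24            # arbitrary illustrative numbers

x = rng.integers(0, grid, n_hits)
y = rng.integers(0, grid, n_hits)
hits = np.zeros((grid, grid), dtype=int)
np.add.at(hits, (x, y), 1)        # tally hits per square

for k, n_squares in enumerate(np.bincount(hits.ravel())):
    print(f"squares with {k} hits: {n_squares}")
```

Even with fewer than one hit per square on average, a handful of squares typically collect four or more hits purely by chance.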

That same month, German commanders made a similar mistake. Anticipating the raid later known as D-Day, they assumed the Allies would attack—but they weren’t sure precisely when. Combing old military records, a weather expert named Karl Sonntag noticed that the Allies had never launched a major attack when there was even a small chance of bad weather. Late May and much of June were forecast to be cloudy and rainy, which “acted like a tranquilizer all along the chain of German command,” according to Irish journalist Cornelius Ryan. “The various headquarters were quite confident that there would be no attack in the immediate future. . . . In each case conditions had varied, but meteorologists had noted that the Allies had never attempted a landing unless the prospects of favorable weather were almost certain.” The German command was mistaken, and on Tuesday, June 6, the Allied forces launched a devastating attack amidst strong winds and rain.

The British and German forces erred because they had taken a small sample of data too seriously: The British forces had mistaken the natural clustering that comes from relatively small samples of random data for a useful signal, while the German forces had mistaken an illusory pattern from a limited set of data for evidence of an ongoing, stable military policy. To illustrate their error, imagine a fair coin tossed three times. You’ll have a one-in-four chance of turning up a string of three heads or tails, which, if you make too much of that small sample, might lead you to conclude that the coin is biased to reveal one particular outcome all or almost all of the time. If you continue to toss the fair coin, say, a thousand times, you’re far more likely to turn up a distribution that approaches five hundred heads and five hundred tails. As the sample grows, your chance of turning up an unbroken string shrinks rapidly (to roughly one-in-sixteen after five tosses; one-in-five-hundred after ten tosses; and one-in-five-hundred-thousand after twenty tosses). A string is far better evidence of bias after twenty tosses than it is after three tosses—but if you succumb to the law of small numbers, you might draw sweeping conclusions from even tiny samples of data, just as the British and Germans did about their opponents’ tactics in World War II.
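The arithmetic behind those odds is simply 2 / 2^n, the chance that n tosses of a fair coin all show the same face. A few lines reproduce the figures in the passage:

```python
# Probability that n fair-coin tosses form one unbroken string
# (all heads or all tails).
for n in (3, 5, 10, 20):
    p = 2 / 2**n
    print(f"{n:>2} tosses: {p:.7f}  (about 1 in {round(1 / p):,})")
```

The output runs from 1 in 4 at three tosses to about 1 in 524,288 at twenty, which is why a long unbroken string is strong evidence of bias while a short one is nearly worthless.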

Of course, the law of small numbers applies to more than military tactics. It explains the rise of stereotypes (concluding that all people with a particular trait behave the same way); the dangers of relying on a single interview when deciding among job or college applicants (concluding that interview performance is a reliable guide to job or college performance at large); and the tendency to see short-term patterns in financial stock charts when in fact short-term stock movements almost never follow predictable patterns. The solution is to pay attention not just to the pattern of data, but also to how much data you have. Small samples aren’t just limited in value; they can be counterproductive because the stories they tell are often misleading.

There are many, many more worth reading. Here’s a great chance to build your multidisciplinary skill-set.