Tag: Probability

How the Many Sides to Every Story Shape our Reality

“We can select truths that engage people and inspire action, or we can deploy truths that deliberately mislead. Truth comes in many forms, and experienced communicators can exploit its variability to shape our impression of reality.”

***

The truth is not as straightforward as it seems. There are many truths, some of them more honest than others. “On most issues,” writes Hector Macdonald in his book Truth: How the Many Sides to Every Story Shape Our Reality, “there are multiple truths we can choose to communicate. Our choice of truth will influence how those around us perceive an issue and react to it.”

We are often left with several truths, some more flattering to us than others. What we choose to see, and what we share with others, says a lot about who we are.

“There is no worse lie than a truth misunderstood by those who hear it.”

— William James

Competing Truths

According to Macdonald, there are often many legitimate ways of describing a situation. Of course, it’s possible for anyone to cherry-pick the facts or truths they prefer, shaping the story to meet their needs. Macdonald offers an apt demonstration.

A few years ago, I was asked to support a transformation programme at a global corporation that was going through a particularly tough patch. … I interviewed the corporation’s top executives to gather their views on the state of their industry and their organization. After consolidating all the facts they’d given me, I sat down with the CEO in a plush Manhattan executive suite and asked him whether he wanted me to write the ‘Golden Opportunity’ story or the ‘Burning Platform’ story of his business.

These two phrases, “Golden Opportunity” and “Burning Platform,” describe two different approaches to telling the same story, or in this case promoting the same plan. The first describes the incredible potential the client company can realize by transforming itself to meet growing demand. The profit is out there for them if they work together to make the necessary changes! The second phrase refers to internal struggles at the company and a potential downward spiral that can only be arrested if the company transforms itself to correct the problems. Both stories are true and both are intended to create the same outcome: supporting a painful and difficult transformation. Yet they can create very different impressions in the minds of employees.

Macdonald illustrates how, when we interact with someone, especially someone who knows more than we do, they have an opportunity to shape our reality. That is, they can shape how we think, our ideas and opinions about a subject. Our perception of reality changes, and “because we act on the basis of our perceptions,” they change not only our thinking but also our actions.

Spin Masters

I remember watching ads on TV when I was a kid claiming that 80 percent of dentists recommended Colgate. I wondered if my mom was trying to kill me by giving me Crest. I wasn’t the best in math, but I reasoned that if 80 percent of dentists were recommending Colgate, at most 20 percent were recommending Crest.

Of course, that’s exactly what Colgate wanted people to think—that dentists were recommending Colgate over other brands. But that wasn’t the whole story. The survey actually asked dentists which brands they would recommend, and almost all of them listed several. Colgate wasn’t lying—but they were using a very distorted version of the truth, designed to mislead. The Advertising Standards Authority eventually banned the ad.

People use this sort of spin all the time. Everyone has an agenda. You can deceive without ever lying. Politicians get elected on how effective they are at “spinning truths in a way that create a false impression.” It’s only too easy for political agendas to trump impartial truth.

The Three Types of Communicators

“It’s not simply that we’re being lied to; the more insidious problem is that we are routinely misled by the truth.”

In Truth, Macdonald explores the effects of three types of communicators: advocates, misinformers, and misleaders.

Advocates select competing truths that create a reasonably accurate impression of reality in order to achieve a constructive goal.

Misinformers innocently propagate competing truths that unintentionally distort reality.

Misleaders deliberately deploy competing truths to create an impression of reality that they know is not true.

We may feel better believing there is one single truth, and thinking everyone who doesn’t see things the way we do simply doesn’t have the truth. That’s not…true. Everyone, including you and me, has a lens on the situation that’s distorted by what they want, how they see the world, and their biases. The most dangerous truths are the credible ones that we convince ourselves are correct.

One idea I find helpful when faced with a situation is perspective-taking. I construct a mental room that I fill with all the participants and stakeholders around a table. I then put myself into their seats and try to see the room through their eyes. Not only does this help me better understand reality by showing me my blind spots, but it shows me what other people care about and how I can create win-wins.

Truth: How the Many Sides to Every Story Shape Our Reality goes on to explore partial truths, subjective truths, artificial truths, and unknown truths. It’s a terrific read for checking your own perspective on truth and understanding how truth can be used to both inform and mislead you.

The Value of Probabilistic Thinking: Spies, Crime, and Lightning Strikes


Probabilistic thinking is essentially trying to estimate, using some tools of math and logic, the likelihood of any specific outcome coming to pass. It is one of the best tools we have to improve the accuracy of our decisions. In a world where each moment is determined by an infinitely complex set of factors, probabilistic thinking helps us identify the most likely outcomes. When we know these, our decisions can be more precise and effective.

Are you going to get hit by lightning or not?

Why we need the concept of probabilities at all is worth thinking about. Things either are or are not, right? We either will get hit by lightning today or we won’t. The problem is, we just don’t know until we live out the day, which doesn’t help us at all when we make our decisions in the morning. The future is far from determined and we can better navigate it by understanding the likelihood of events that could impact us.

Our lack of perfect information about the world gives rise to all of probability theory, and its usefulness. We know now that the future is inherently unpredictable because not all variables can be known and even the smallest error imaginable in our data very quickly throws off our predictions. The best we can do is estimate the future by generating realistic, useful probabilities. So how do we do that?

Probability is everywhere, down to the very bones of the world. The probabilistic machinery in our minds—the cut-to-the-quick heuristics made so famous by the psychologists Daniel Kahneman and Amos Tversky—was evolved by the human species in a time before computers, factories, traffic, middle managers, and the stock market. It served us in a time when human life was about survival, and still serves us well in that capacity.

But what about today—a time when, for most of us, survival is not so much the issue? We want to thrive. We want to compete, and win. Mostly, we want to make good decisions in complex social systems that were not part of the world in which our brains evolved their (quite rational) heuristics.

For this, we need to consciously add a layer of probability awareness. What is it, and how can we use it to our advantage?

There are three important aspects of probability that we need to explain so you can integrate them into your thinking to get into the ballpark and improve your chances of catching the ball:

  1. Bayesian thinking
  2. Fat-tailed curves
  3. Asymmetries

Thomas Bayes and Bayesian thinking: Bayes was an English minister in the first half of the 18th century, whose most famous work, “An Essay Towards Solving a Problem in the Doctrine of Chances,” was brought to the attention of the Royal Society by his friend Richard Price in 1763—two years after his death. The essay, the key to what we now know as Bayes’s Theorem, concerned how we should adjust probabilities when we encounter new data.

The core of Bayesian thinking (or Bayesian updating, as it can be called) is this: given that we have limited but useful information about the world, and are constantly encountering new information, we should probably take into account what we already know when we learn something new. As much of it as possible. Bayesian thinking allows us to use all relevant prior information in making decisions. Statisticians might call it a base rate, taking in outside information about past situations like the one you’re in.

Consider the headline “Violent Stabbings on the Rise.” Without Bayesian thinking, you might become genuinely afraid because your chances of being a victim of assault or murder are higher than they were a few months ago. But a Bayesian approach will have you putting this information into the context of what you already know about violent crime.

You know that violent crime has been declining to its lowest rates in decades. Your city is safer now than it has been since this measurement started. Let’s say your chance of being a victim of a stabbing last year was one in 10,000, or 0.01%. The article states, with accuracy, that violent crime has doubled. It is now two in 10,000, or 0.02%. Is that worth being terribly worried about? The prior information here is key. When we factor it in, we realize that our safety has not really been compromised.

Conversely, if we look at the diabetes statistics in the United States, our application of prior knowledge would lead us to a different conclusion. Here, a Bayesian analysis indicates you should be concerned. In 1958, 0.93% of the population was diagnosed with diabetes. In 2015 it was 7.4%. When you look at the intervening years, the climb in diabetes diagnosis is steady, not a spike. So the prior relevant data, or priors, indicate a trend that is worrisome.

It is important to remember that priors themselves are probability estimates. For each bit of prior knowledge, you are not putting it in a binary structure, saying it is true or not. You’re assigning it a probability of being true. Therefore, you can’t let your priors get in the way of processing new knowledge. In Bayesian terms, this is called the likelihood ratio or the Bayes factor. Any new information you encounter that challenges a prior simply means that the probability of that prior being true may be reduced. Eventually, some priors are replaced completely. This is an ongoing cycle of challenging and validating what you believe you know. When making uncertain decisions, it’s nearly always a mistake not to ask: What are the relevant priors? What might I already know that I can use to better understand the reality of the situation?
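To make the updating step concrete, here is a minimal sketch in Python (my own illustration, with made-up numbers, not taken from the text above) of how a prior probability and a Bayes factor combine into a posterior. The point is only that new evidence shifts a prior rather than replacing it.

```python
def update(prior, bayes_factor):
    """Combine a prior probability with a Bayes factor (likelihood ratio).

    bayes_factor = P(evidence | belief true) / P(evidence | belief false).
    Values above 1 strengthen the belief; values below 1 weaken it.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# Hypothetical numbers: we start out 70% sure of a prior belief, then meet
# evidence that is twice as likely if the belief is false (Bayes factor 0.5).
print(update(0.70, 0.5))  # ~0.54: the prior is weakened, not discarded
```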

Now we need to look at fat-tailed curves: Many of us are familiar with the bell curve, that nice, symmetrical wave that captures the relative frequency of so many things from height to exam scores. The bell curve is great because it’s easy to understand and easy to use. Its technical name is “normal distribution.” If we know we are in a bell curve situation, we can quickly identify our parameters and plan for the most likely outcomes.

Fat-tailed curves are different. Take a look.

[Figure: a normal (bell curve) distribution compared with a fat-tailed distribution]

At first glance they seem similar enough. Common outcomes cluster together, creating a wave. The difference is in the tails. In a bell curve the extremes are predictable. There can only be so much deviation from the mean. In a fat-tailed curve there is no real cap on extreme events.

The more extreme events that are possible, the longer the tails of the curve get. Any one extreme event is still unlikely, but the sheer number of options means that we can’t rely on the most common outcomes as representing the average. The more extreme events that are possible, the higher the probability that one of them will occur. Crazy things are definitely going to happen, and we have no way of identifying when.

Think of it this way. In a bell curve type of situation, like displaying the distribution of height or weight in a human population, there are outliers on the spectrum of possibility, but the outliers have a fairly well defined scope. You’ll never meet a man who is ten times the size of an average man. But in a curve with fat tails, like wealth, the central tendency does not work the same way. You may regularly meet people who are ten, 100, or 10,000 times wealthier than the average person. That is a very different type of world.
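A rough simulation makes the contrast vivid. The parameters below are assumptions chosen for illustration (heights drawn from a normal distribution, wealth from a Pareto distribution, a common stylized model of fat tails); exact numbers vary by run, but the shape of the result does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Thin-tailed: heights in cm, roughly normally distributed.
heights = rng.normal(loc=175, scale=7, size=1_000_000)

# Fat-tailed: a Pareto distribution, often used as a stylized model of wealth.
wealth = (rng.pareto(a=1.16, size=1_000_000) + 1) * 50_000

print("tallest / average height:", heights.max() / heights.mean())  # about 1.2
print("richest / average wealth:", wealth.max() / wealth.mean())    # typically in the thousands
```

The tallest person in a million draws is only about 20 percent taller than average; the richest is typically thousands of times wealthier than average.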

Let’s re-approach the example of the risks of violence we discussed in relation to Bayesian thinking. Suppose you hear that you have a greater risk of slipping on the stairs and cracking your head open than of being killed by a terrorist. The statistics, the priors, seem to back it up: 1,000 people slipped on the stairs and died last year in your country and only 500 died of terrorism. Should you be more worried about stairs or terror events?

Some use examples like these to prove that terror risk is low—since the recent past shows very few deaths, why worry?[1] The problem is in the fat tails: The risk of terror violence is more like wealth, while stair-slipping deaths are more like height and weight. In the next ten years, how many events are possible? How fat is the tail?

The important thing is not to sit down and imagine every possible scenario in the tail (by definition, it is impossible) but to deal with fat-tailed domains in the correct way: by positioning ourselves to survive or even benefit from the wildly unpredictable future, by being the only ones thinking correctly and planning for a world we don’t fully understand.

Asymmetries: Finally, you need to think about something we might call “metaprobability”—the probability that your probability estimates themselves are any good.

This massively misunderstood concept has to do with asymmetries. If you look at nicely polished stock pitches made by professional investors, nearly every time an idea is presented, the investor looks their audience in the eye and states they think they’re going to achieve a rate of return of 20% to 40% per annum, if not higher. Yet exceedingly few of them ever attain that mark, and it’s not because they don’t have any winners. It’s because they get so many so wrong. They are consistently overconfident in their probabilistic estimates. (For reference, the general stock market has returned no more than 7% to 8% per annum in the United States over a long period, before fees.)

Another common asymmetry is people’s ability to estimate the effect of traffic on travel time. How often do you leave “on time” and arrive 20% early? Almost never? How often do you leave “on time” and arrive 20% late? All the time? Exactly. Your estimation errors are asymmetric, skewing in a single direction. This is often the case with probabilistic decision-making.[2]

Far more probability estimates are wrong on the “over-optimistic” side than the “under-optimistic” side. You’ll rarely read about an investor who aimed for 25% annual return rates who subsequently earned 40% over a long period of time. You can throw a dart at the Wall Street Journal and hit the names of lots of investors who aim for 25% per annum with each investment and end up closer to 10%.

The spy world

Successful spies are very good at probabilistic thinking. High-stakes survival situations tend to make us evaluate our environment with as little bias as possible.

When Vera Atkins was second in command of the French unit of the Special Operations Executive (SOE), a British intelligence organization reporting directly to Winston Churchill during World War II[3], she had to make hundreds of decisions by figuring out the probable accuracy of inherently unreliable information.

Atkins was responsible for the recruitment and deployment of British agents into occupied France. She had to decide who could do the job, and where the best sources of intelligence were. These were literal life-and-death decisions, and all were based in probabilistic thinking.

First, how do you choose a spy? Not everyone can go undercover in high-stress situations and make the contacts necessary to gather intelligence. The result of failure in France in WWII was not getting fired; it was death. What factors of personality and experience show that a person is right for the job? Even today, with advancements in psychology, interrogation, and polygraphs, it’s still a judgment call.

For Vera Atkins in the 1940s, it was very much a process of assigning weight to the various factors and coming up with a probabilistic assessment of who had a decent chance of success. Who spoke French? Who had the confidence? Who was too tied to family? Who had the problem-solving capabilities? From recruitment to deployment, her development of each spy was a series of continually updated, educated estimates.

Getting an intelligence officer ready to go is only half the battle. Where do you send them? If your information was so great that you knew exactly where to go, you probably wouldn’t need an intelligence mission. Choosing a target is another exercise in probabilistic thinking. You need to evaluate the reliability of the information you have and the networks you have set up. Intelligence is not evidence. There is no chain of command or guarantee of authenticity.

The stuff coming out of German-occupied France was at the level of grainy photographs, handwritten notes that passed through many hands on the way back to HQ, and unverifiable wireless messages sent quickly, sometimes sporadically, and with the operator under incredible stress. When deciding what to use, Atkins had to consider the relevancy, quality, and timeliness of the information she had.

She also had to make decisions based not only on what had happened, but what possibly could. Trying to prepare for every eventuality means that spies would never leave home, but they must somehow prepare for a good deal of the unexpected. After all, their jobs are often executed in highly volatile, dynamic environments. The women and men Atkins sent over to France worked in three primary occupations: organizers were responsible for recruiting locals, developing the network, and identifying sabotage targets; couriers moved information all around the country, connecting people and networks to coordinate activities; and wireless operators had to set up heavy communications equipment, disguise it, get information out of the country, and be ready to move at a moment’s notice. All of these jobs were dangerous. The full scope of the threats was never completely identifiable. There were so many things that could go wrong, so many possibilities for discovery or betrayal, that it was impossible to plan for them all. The average life expectancy in France for one of Atkins’ wireless operators was six weeks.

Finally, the numbers suggest an asymmetry in the estimation of the probability of success of each individual agent. Of the 400 agents that Atkins sent over to France, 100 were captured and killed. This is not meant to pass judgment on her skills or smarts. Probabilistic thinking can only get you in the ballpark. It doesn’t guarantee 100% success.

There is no doubt that Atkins relied heavily on probabilistic thinking to guide her decisions in the challenging quest to disrupt German operations in France during World War II. It is hard to evaluate the success of an espionage career, because it is a job that comes with a lot of loss. Atkins was extremely successful in that her network conducted valuable sabotage to support the allied cause during the war, but the loss of life was significant.

Conclusion

Successfully thinking in shades of probability means roughly identifying what matters, coming up with a sense of the odds, doing a check on our assumptions, and then making a decision. We can act with a higher level of certainty in complex, unpredictable situations. We can never know the future with exact precision. Probabilistic thinking is an extremely useful tool to evaluate how the world will most likely look so that we can effectively strategize.


References:

[1] Taleb, Nassim Nicholas. Antifragile. New York: Random House, 2012.

[2] Bernstein, Peter L. Against the Gods: The Remarkable Story of Risk. New York: John Wiley and Sons, 1996. (This book includes an excellent discussion in Chapter 13 on the idea of the scope of events in the past as relevant to figuring out the probability of events in the future, drawing on the work of Frank Knight and John Maynard Keynes.)

[3] Helm, Sarah. A Life in Secrets: The Story of Vera Atkins and the Lost Agents of SOE. London: Abacus, 2005.

Bayes and Deadweight: Using Statistics to Eject the Deadweight From Your Life

“[K]nowledge is indeed highly subjective, but we can quantify it with a bet. The amount we wager shows how much we believe in something.”

— Sharon Bertsch McGrayne

The quality of your life will, to a large extent, be decided by whom you elect to spend your time with. Supportive, caring, and funny are great attributes in friends and lovers. Unceasingly negative cynics who chip away at your self-esteem? We need to jettison those people as far and fast as we can.

The problem is, how do we identify these people who add nothing positive — or not enough positive — to our lives?

Few of us keep relationships with obvious assholes. There are always a few painfully terrible family members we have to put up with at weddings and funerals, but normally we choose whom we spend time with. And we’ve chosen these people because, at some point, our interactions with them felt good.

How, then, do we identify the deadweight? The people who are really dragging us down and who have a high probability of continuing to do so in the future? We can apply the general thinking tool called Bayesian Updating.

Bayes’s theorem can involve some complicated mathematics, but at its core lies a very simple premise. Probability estimates should start with what we already know about the world and then be incrementally updated as new information becomes available. Bayes can even help us when that information is relevant but subjective.

How? As McGrayne explains in the quote above, from The Theory That Would Not Die, you simply ask yourself to wager on the outcome.

Let’s take an easy example.

You are going on a blind date. You’ve been told all sorts of good things in advance — the person is attractive and funny and has a good job — so of course, you are excited. The date starts off great, living up to expectations. Halfway through you find out they have a cat. You hate cats. Given how well everything else is going, how much should this information affect your decision to keep dating?

Quantify your belief in the most probable outcome with a bet. How much would you wager that harmony on the pet issue is an accurate predictor of relationship success? Ten cents? Ten thousand dollars? Do the thought experiment. Imagine walking into a casino and placing a bet on the likelihood that this person’s having a cat will ultimately destroy the relationship. How much money would you take out of your savings and lay on the table? Your answer will give you an idea of how much to factor the cat into your decision-making process. If you wouldn’t part with a dime, then I wouldn’t worry about it.
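One hypothetical way to turn the size of that imaginary wager into an explicit probability is to invert the Kelly criterion for an even-money bet. The Kelly criterion is not mentioned in the text above, so treat this as my own illustration, and it assumes a double-or-nothing payout.

```python
def implied_belief(stake_fraction: float) -> float:
    """Belief implied by the fraction of your bankroll you would stake on an
    even-money (double-or-nothing) bet, read off the Kelly criterion:
    optimal stake f = 2p - 1, so p = (f + 1) / 2."""
    return (stake_fraction + 1) / 2

print(implied_belief(0.00))  # 0.50: wouldn't part with a dime, no real conviction
print(implied_belief(0.10))  # 0.55: a token bet implies only mild belief
print(implied_belief(0.80))  # 0.90: staking most of your savings implies near-certainty
```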

This kind of approach can help us when it comes to evaluating our interpersonal relationships. Deciding if someone is a good friend, partner, or co-worker is full of subjective judgments. There is usually some contradictory information, and ultimately no one is perfect. So how do you decide who is worth keeping around?

Let’s start with friends. The longer a friendship lasts, the more likely it is to have ups and downs. The trick is to start quantifying these. A hit from a change in geographical proximity is radically different from a hit from betrayal — we need to factor these differences into our friendship formula.

This may seem obvious, but the truth is that we often give the same weight to a wide variety of behaviors. We’ll say things like “yeah, she talked about my health problems when I asked her not to, but she always remembers my birthday.” By treating all aspects of the friendship equally, we have a hard time making reasonable estimates about the future value of that friendship. And that’s how we end up with deadweight.

For the friend who has betrayed your confidence, what you really want to know is the likelihood that she’s going to do it again. Instead of trying to remember and analyze every interaction you’ve ever had, just imagine yourself betting on it. Go back to that casino and head to the friendship roulette wheel. Where would you put your money? All in on “She can’t keep her mouth shut” or a few chips on “Not likely to happen again”?

Using a rough Bayesian model in our heads, we’re forcing ourselves to quantify what “good” is and what “bad” is. How good? How bad? How likely? How unlikely? Until we do some (rough) guessing at these things, we’re making decisions much more poorly than we need to be.

The great thing about using Bayes’s theorem is that it encourages constant updating. It also encourages an open mind by giving us the chance to look at a situation from multiple angles. Maybe she really is sorry about the betrayal. Maybe she thought she was acting in your best interests. There are many possible explanations for her behavior and you can use Bayes’s theorem to integrate all of her later actions into your bet. If you find yourself reducing the amount of money you’d bet on further betrayal, you can accurately assume that the probability she will betray your trust again has gone down.

Using this strategy can also stop the endless rounds of asking why. Why did that co-worker steal my idea? Who else do I have to watch out for? This what-if thinking is paralyzing. You end up self-justifying your behavior by anticipating the worst possible scenarios you can imagine. Thus, you don’t change anything, and you step further away from a solution.

In reality, who cares? The why isn’t important; the most relevant task for you is to figure out the probability that your coworker will do it again. Don’t spend hours analyzing what to do, get upset over the doomsday scenarios you have come up with, or let a few glasses of wine soften the experience.

Head to your mental casino and place the bet, quantifying all the subjective information in your head that is messy and hard to articulate. You will cut through the endless “but maybes” and have a clear path forward that addresses the probable future. It may make sense to give him the benefit of the doubt. It may also be reasonable to avoid him as much as possible. When you figure out how much you would wager on the potential outcomes, you’ll know what to do.

Sometimes we can’t just get rid of people who aren’t good for us — family being the prime example. But you can also use Bayes to test how your actions will change the probability of outcomes to find ways of keeping the negativity minimal. Let’s say you have a cousin who always plans to visit but then cancels. You can’t stop being his cousin and saying “you aren’t welcome at my house” will cause a big family drama. So what else can you do?

Your initial equation — your probability estimate — indicates that the behavior is likely to continue. In your casino, you would comfortably bet your life savings that it will happen again. Now imagine ways in which you could change your behavior. Which of these would reduce your bet? You could have an honest conversation with him, telling him how his actions make you feel. To know if he’s able to openly receive this, consider whether your bet would change. Or would you wager significantly less after employing the strategy of always being busy when he calls to set up future visits?

And you can dig even deeper. Which of your behaviors would increase the probability that he actually comes? Which behaviors would increase the probability that he doesn’t bother making plans in the first place? Depending on how much you like him, you can steer your changes to the outcome you’d prefer.

Quantifying the subjective and using Bayes’s theorem can help us clear out some of the relationship negativity in our lives.

Mental Model: Misconceptions of Chance


We expect the immediate outcome of events to represent the broader outcomes expected from a large number of trials. We believe that chance events will immediately self-correct and that small sample sizes are representative of the populations from which they are drawn. All of these beliefs lead us astray.

***

 

Our understanding of the world around us is imperfect, and when dealing with chance, our brains tend to come up with ways to cope with the unpredictable nature of our world.

“We tend,” writes Peter Bevelin in Seeking Wisdom, “to believe that the probability of an independent event is lowered when it has happened recently or that the probability is increased when it hasn’t happened recently.”

In short, we believe an outcome is due and that chance will self-correct.

The problem with this view is that nature doesn’t have a sense of fairness or memory. We only fool ourselves when we mistakenly believe that past independent events influence, or offer meaningful predictive power over, future events.

Furthermore, we also mistakenly believe that we can control chance events. This applies to risky or uncertain events.

Chance events coupled with positive reinforcement or negative reinforcement can be a dangerous thing. Sometimes we become optimistic and think our luck will change and sometimes we become overly pessimistic or risk-averse.

How do you know if you’re dealing with chance? A good heuristic is to ask yourself if you can lose on purpose. If you can’t, you’re likely far toward the chance side of the skill vs. luck continuum. No matter how hard you practice, the probability of chance events won’t change.

“We tend,” writes Nassim Taleb in The Black Swan, “to underestimate the role of luck in life in general (and) overestimate it in games of chance.”

We are only discussing independent events. If events are dependent, where the outcome depends on the outcome of some other event, all bets are off.

 

***

Misconceptions of Chance

Daniel Kahneman and Amos Tversky coined the term misconceptions of chance to describe the phenomenon of people extrapolating large-scale patterns to samples of a much smaller size. Our trouble navigating the sometimes counterintuitive laws of probability, randomness, and statistics leads to misconceptions of chance.

Kahneman found that “people expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short.”

In the paper Belief in the Law of Small Numbers, Kahneman and Tversky reflect on the results of an experiment, where subjects were instructed to generate a random sequence of hypothetical tosses of a fair coin.

They [the subjects] produce sequences where the proportion of heads in any short segment stays far closer to .50 than the laws of chance would predict. Thus, each segment of the response sequence is highly representative of the “fairness” of the coin.

Unsurprisingly, the same kinds of errors occurred when the subjects, instead of being asked to generate sequences themselves, were simply asked to distinguish between random and human-generated sequences. It turns out that when considering tosses of a coin for heads or tails, people regard the sequence H-T-H-T-T-H to be more likely than the sequence H-H-H-T-T-T, which does not appear random, and also more likely than the sequence H-H-H-H-T-H. In reality, each one of those sequences has the exact same probability of occurring. This is a misconception of chance.

The aspect that most of us find so hard to grasp about this case is that any pattern of the same length is just as likely to occur in a random sequence. For example, the probability of getting 5 tails in a row is 0.03125, or, simply stated, 0.5 (the probability of a specific outcome at each trial) raised to the power of 5 (the number of trials).

The same rule applies to the specific sequences HHTHT or THTHT – the probability of each is obtained by once again taking 0.5 (the probability of a specific outcome at each trial) to the power of 5 (the number of trials), which equals 0.03125.
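A few lines of Python confirm the arithmetic: every specific five-toss sequence, whether it looks random or not, has the same probability of 1/32.

```python
from itertools import product

print(0.5 ** 5)  # 0.03125: the probability of any one specific five-toss sequence

# All 32 possible five-toss sequences are equally likely, "random-looking" or not.
sequences = ["".join(s) for s in product("HT", repeat=5)]
print(len(sequences), "sequences, each with probability", 1 / len(sequences))
```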

This holds for specific sequences – but it does not mean that the true 50/50 proportion will show up within any given short sequence.

Yet it’s still surprising. This is because people expect that the single event odds will be reflected not only in the proportion of events as a whole but also in the specific short sequences we encounter. But this is not the case. A perfectly alternating sequence is just as extraordinary as a sequence with all tails or all heads.

In comparison, “a locally representative sequence,” Kahneman writes, in Thinking, Fast and Slow, “deviates systematically from chance expectation: it contains too many alternations and too few runs. Another consequence of the belief in local representativeness is the well-known gambler’s fallacy.”

***

Gambler’s Fallacy

There is a specific variation of the misconceptions of chance that Kahneman calls the Gambler’s fallacy (elsewhere also called the Monte Carlo fallacy).

The gambler’s fallacy implies that when we come across a local imbalance, we expect future events to smooth it out. We act as if every segment of the random sequence must reflect the true proportion and, if the sequence has deviated from the population proportion, we expect the imbalance to soon be corrected.

Kahneman explains that this is unreasonable – coins, unlike people, have no sense of equality and proportion:

The heart of the gambler’s fallacy is a misconception of the fairness of the laws of chance. The gambler feels that the fairness of the coin entitles him to expect that any deviation in one direction will soon be cancelled by a corresponding deviation in the other. Even the fairest of coins, however, given the limitations of its memory and moral sense, cannot be as fair as the gambler expects it to be.

He illustrates this with an example of the roulette wheel and our expectations when a reasonably long sequence of repetition occurs.

After observing a long run of red on the roulette wheel, most people erroneously believe that black is now due, presumably because the occurrence of black will result in a more representative sequence than the occurrence of an additional red.

In reality, of course, roulette is a random, non-evolving process, in which the chance of getting a red or a black never depends on the past sequence. The probabilities reset with each spin, yet we still seem to take the past moves into account.

Contrary to our expectations, the universe does not keep an account of a random process, so streaks are not pulled back towards the true proportion. Your chance of getting a red after a series of blacks will always be equal to your chance of getting another black, as long as the wheel is fair.
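A short simulation illustrates the point. The sketch below assumes a simplified wheel with no green zero, so red and black are equally likely; even immediately after five blacks in a row, red comes up about half the time.

```python
import numpy as np

rng = np.random.default_rng(42)
spins = rng.integers(0, 2, size=200_000)  # 0 = black, 1 = red, equally likely

run_length = 5
next_after_black_run = [
    spins[i]
    for i in range(run_length, len(spins))
    if spins[i - run_length:i].sum() == 0  # the previous five spins were all black
]

print(len(next_after_black_run), "runs of five blacks observed")
print("P(red | five blacks in a row) =", round(float(np.mean(next_after_black_run)), 3))  # about 0.5
```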

The gambler’s fallacy need not be committed only inside the casino. Many of us commit it frequently by thinking that a small, random sample will tend to correct itself.

For example, assume that the average IQ in a specific country is known to be 100, and that for the purposes of assessing intelligence in a specific district, we draw a random sample of 50 people. The first person in our sample happens to have an IQ of 150. What would you expect the mean IQ to be for the whole sample?

The correct answer is (100*49 + 150*1)/50 = 101. Yet without knowing the correct answer, it is tempting to say it is still 100 – the same as in the country as a whole.
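The arithmetic behind that answer is simply a weighted average: the one known score plus the expected average of the 49 remaining people.

```python
country_mean_iq = 100
sample_size = 50
first_person_iq = 150

# The remaining 49 people are still expected to average 100; the high score is
# diluted across the sample, not cancelled out by unusually low scores later.
expected_sample_mean = (country_mean_iq * (sample_size - 1) + first_person_iq) / sample_size
print(expected_sample_mean)  # 101.0
```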

According to Kahneman and Tversky, such an expectation could only be justified by the belief that a random process is self-correcting and that sample variation is always proportional. They explain:

Idioms such as “errors cancel each other out” reflect the image of an active self-correcting process. Some familiar processes in nature obey such laws: a deviation from a stable equilibrium produces a force that restores the equilibrium.

Indeed, this may be true in thermodynamics, chemistry, and arguably also economics. These, however, are false analogies. It is important to realize that processes governed by chance are not guided by principles of equilibrium, and the random outcomes in a sequence do not balance each other out.

“Chance,” Kahneman writes in Thinking, Fast and Slow, “is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium. In fact, deviations are not “corrected” as a chance process unfolds, they are merely diluted.”

 

***

The Law of Small Numbers

Misconceptions of chance are not limited to gambling. In fact, most of us fall for them all the time because we intuitively believe (and there is a whole best-seller section at the bookstore to prove it) that inferences drawn from small sample sizes are highly representative of the populations from which they are drawn.

By illustrating people’s expectations of random heads and tails sequences, we already established that we have preconceived notions of what randomness looks like. This, coupled with the unfortunate tendency to believe in a self-correcting process in random samples, generates expectations about sample characteristics and representativeness which are not necessarily true. The expectation that the patterns and characteristics within a small sample will be representative of the population as a whole is called the law of small numbers.

Consider the sequence:

1, 2, 3, _, _, _

What do you think are the next three digits?

The task almost seems laughable, because the pattern is so familiar and obvious – 4,5,6. However, there is an endless variation of different algorithms that would still fit the first three numbers, such as the Fibonacci sequence (5, 8, 13), a repeated sequence (1,2,3), a random sequence (5,8,2) and many others. Truth is, in this case there simply is not enough information to say what the rules governing this specific sequence are with any reliability.

The same rule applies to sampling problems – sometimes we feel we have gathered enough data to tell a real pattern from an illusion. Let me illustrate this fallacy with yet another example.

Imagine that you face a tough decision between investing in the development of two different product opportunities. Let’s call them Product A or Product B. You are interested in which product would appeal to the majority of the market, so you decide to conduct customer interviews. Out of the first five pilot interviews, four customers show a preference for Product A. While the sample size is quite small, given the time pressure involved, many of us would already have some confidence in concluding that the majority of customers would prefer Product A.

However, a quick statistical test will tell you that the probability of a split at least this lopsided, in either direction, is in fact 3/8, assuming that there is no preference among customers at all. In simple terms, even if customers had no preference between Products A and B, you would still expect roughly 3 samples out of 8 to have at least four of the five customers favoring the same product.
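Here is the calculation behind that 3/8, reading “a result just as extreme” as a 4-to-1 (or 5-to-0) split in either direction under the assumption of no real preference.

```python
from math import comb

n, p = 5, 0.5  # five interviews, 50/50 chance per customer if there is no preference

# Probability of at least four of the five customers favoring the same product:
# four or five for A, or four or five for B.
prob = 2 * (comb(n, 4) + comb(n, 5)) * p ** n
print(prob)  # 0.375, i.e. 3/8
```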

Basically, a study of such size has little to no predictive validity – these results could easily be obtained from a population with no preference for one or the other product. This, of course, does not mean that talking to customers is of no value. Quite the contrary – the more random cases we examine, the more reliable and accurate the results of the true proportion will be. If we want absolute certainty we must be prepared for a lot of work.

There will always be cases where a guesstimate based on a small sample will be enough because we have other critical information guiding the decision-making process or we simply do not need a high degree of confidence. Yet rather than assuming that the samples we come across are always perfectly representative, we must treat random selection with the suspicion it deserves. Accepting the role imperfect information and randomness play in our lives and being actively aware of what we don’t know already makes us better decision makers.

13 Practical Ideas That Have Helped Me Make Better Decisions

This article is a collaboration between Mark Steed and myself. He did most of the work. Mark was a participant at the last Re:Think Decision Making event as well as a member of the Good Judgment Project. I asked him to put together something on making better predictions. This is the result.

We all face decisions. Sometimes we think hard about a specific decision, other times, we make decisions without thinking. If you’ve studied the genre you’ve probably read Taleb, Tversky, Kahneman, Gladwell, Ariely, Munger, Tetlock, Mauboussin and/or Thaler. These pioneers write a lot about “rationality” and “biases”.

Rationality dictates the selection of the best choice among however many options. Biases of a cognitive or emotional nature creep in and are capable of preventing the identification of the “rational” choice. These biases can exist in our DNA or can be formed through life experiences. The mentioned authors consider biases extensively, and, lucky for us, their writings are eye-opening and entertaining.

Rather than rehash what brighter minds have discussed, I’ll focus on practical ideas that have helped me make better decisions. I think of this as a list of “lessons learned (so far)” from my work in asset management and as a forecaster for the Good Judgment Project. I’ve held back on submitting this given the breadth and depth of the FS readers, but, rather than expect perfection, I wanted to put something on the table because I suspect many of you have useful ideas that will help move the conversation forward.

1. This is a messy business. Studying decision science can easily motivate self-loathing. There are over one hundred cognitive biases that might prevent us from making calculated and “rational” decisions. What, you can’t create a decision tree with 124 decision nodes, complete with assorted probabilities, in a split second? I asked around, and it turns out, not many people can. Since there is no way to eliminate all the potential cognitive biases and I don’t possess the mental faculties of Mr. Spock or C-3PO, I might as well live with the fact that some decisions will be more elegant than others.

2. We live and work in dynamic environments. Dynamic environments adapt. The opposite of a dynamic environment is a static environment. Financial markets, geopolitical events, team sports, etc. are examples of dynamic “environments” because relationships between agents evolve and problems are often unpredictable. Changes in one period are conditional on what happened the previous period. Casinos are more representative of static environments. Not casinos necessarily, but the games inside. If you play roulette, your odds of winning are always the same, and it doesn’t matter what happened the previous turn.

3. Good explanatory models are not necessarily good predictive models. Dynamic environments have a habit of desecrating rigid models. While blindly following an elegant model may be ill-advised, strong explanatory models are excellent guideposts when paired with sound judgment and intuition. Just as I’m not comfortable with the automatic pilot flying a plane without a human in the cockpit, I’m also not comfortable with a human flying a plane without the help of technology. It has been said before: people make models better, and models make people better.

4. Instinct is not always irrational. Rules of thumb, otherwise known as heuristics, sometimes provide better results than more complicated analytical techniques. Gerd Gigerenzer is the thought leader here, and his book Risk Savvy: How to Make Good Decisions is worth reading. Much of the literature disparages heuristics, but he asserts that intuition can prove superior because optimization is sometimes mathematically impossible or exposed to sampling error. He often uses the example of Harry Markowitz, who won a Nobel Prize in Economics in 1990 for his work on Modern Portfolio Theory. Markowitz discovered a method for determining the “optimal” mix of assets. However, Markowitz himself did not follow his Nobel prize-winning mean-variance theory but instead used a 1/N heuristic, spreading his dollars equally across N investments. He concluded that the 1/N strategy would perform better than a mean-variance optimization unless the optimization model had 500 years of data to work with. Our intuition is more likely to be accurate if it is preceded by rigorous analysis and introspection. And simple rules are more effective at communicating winning strategies in complex environments. When coaching a child’s soccer team, it is far easier to teach a few basic principles than to articulate the nuances of every possible situation.

5. Decisions are not evaluated in ways that help us reduce mistakes in the future. Our tendency is to only critique decisions where the desired outcome was not achieved while uncritically accepting positive outcomes even if luck, or another factor, produced the desired result. At the end of the day I understand all we care about are results, but good processes are more indicative of future success than good results.

6. Success is ill-defined. In some cases this is relatively straightforward. If the outcome is binary, either it did, or did not happen, success is easy to identify. But this is more difficult in situations where the outcome can take a range of potential values, or when individuals differ on what the values should be.

7. We should care a lot more about calibration. Confidence, not just a decision, should be recorded (and to be clear, decisions should be recorded). Next time you have a major decision, ask yourself how confident you are that the desired outcome will be achieved. Are you 50% confident? 90%? Write it down. This helps with calibration. For all decisions in which you are 50% confident, half should be successes. And you should be right nine out of ten times for all decisions in which you are 90% confident. If you are 100% confident, you should never be wrong. If you don’t know anything about a specific subject then you should be no more confident than a coin flip. It’s amazing how we will assign high confidence to an event we know nothing about. Turns out this idea is pretty helpful. Let’s say someone brings an idea to you and you know nothing about it. Your default should be 50/50; you might as well flip a coin. Then you just need to worry about the costs/payouts.
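A calibration check is easy to automate once decisions and confidence levels are written down. The decision log below is entirely hypothetical; the mechanics of bucketing by stated confidence and comparing against the actual hit rate are the point.

```python
from collections import defaultdict

# Hypothetical log: (stated confidence, did the desired outcome happen?)
decisions = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True),
    (0.5, True), (0.5, False), (0.5, False), (0.5, True),
]

buckets = defaultdict(list)
for confidence, outcome in decisions:
    buckets[confidence].append(outcome)

for confidence, outcomes in sorted(buckets.items()):
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.0%} -> actual {hit_rate:.0%} over {len(outcomes)} decisions")
```

Well-calibrated forecasters show stated and actual rates that roughly match within each bucket.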

8. Probabilities are one thing, payouts are another. You might feel 50/50 about your chances, but you need to know your payouts if you are right. This is where expected value comes in handy. It’s the probability of being right multiplied by the payout if you are right, plus the probability of being wrong multiplied by the cost: E = 0.50(x) + 0.50(y). Say someone on your team has an idea for a project and you decide there is a 50% chance that it succeeds; if it does, you double your money, and if it doesn’t, you lose what you invested. If the project required $10mm, then the expected outcome is calculated as 0.50*20 + 0.50*0 = 10, or $10mm. If you repeat this process a number of times, approving only projects with a 2:1 payout and a 50% probability of success, you would likely end up with the same amount you started with. Binary outcomes that have a 50/50 probability should have a double-or-nothing payout. This is even more helpful given #7 above. If you were tracking this employee’s calibration, you would have a sense as to whether their forecasts are accurate. As a team member or manager, you would want to know if a specific employee is 90% confident all the time but only 50% accurate. More importantly, you would want to know if a certain team member is usually right when they express 90% or 100% confidence. Use a Brier score to track colleagues, but provide an environment that encourages discussion and openness.
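A quick sketch of both calculations mentioned above: the expected value of the $10mm project, using the numbers from the text, and a Brier score computed on made-up forecasts.

```python
# Expected value of the project: 50% chance of doubling $10mm, 50% chance of losing it.
p_success = 0.5
value_if_success = 20  # $mm returned if it works
value_if_failure = 0   # $mm returned if it fails
expected_value = p_success * value_if_success + (1 - p_success) * value_if_failure
print(expected_value)  # 10.0, exactly what was invested

# Brier score: mean squared error between stated probabilities and outcomes
# (1 = happened, 0 = didn't). Lower is better; these forecasts are hypothetical.
forecasts = [(0.9, 1), (0.9, 0), (0.6, 1), (0.2, 0)]
brier = sum((prob - outcome) ** 2 for prob, outcome in forecasts) / len(forecasts)
print(brier)  # 0.255
```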

9. We really are overconfident. Starting from the assumption that we are probably only 50% accurate is not a bad idea. Phil Tetlock, a professor at UPenn, Team Leader for the Good Judgment Project and author of Expert Political Judgment: How Good Is It? How Can We Know?, suggested political pundits are about 53% accurate regarding political forecasts while CXO Advisory tracks investment gurus and finds they are, in aggregate, about 48% accurate. These are experts making predictions about their core area of expertise. Consider the rate of divorce in the U.S., currently around 40%-50%, as additional evidence that sometimes we don’t know as much as we think. Experts are helpful in explaining a specific discipline but they are less helpful in dynamic environments. If you need something fixed, like a car, a clock or an appliance then experts can be very helpful. Same for tax and accounting advice. It’s not because this stuff is simple, it’s because the environment is static.

10. Improving estimations of probabilities and payouts is about polishing our 1) subject matter expertise and 2) cognitive processing abilities. Learning more about a given subject reduces uncertainty and allows us to move from the lazy 50/50 forecast. Say you travel to Arizona and get stung by a scorpion. Rather than assume a 50% probability of death, you can do a quick internet search and learn that no one has died from a scorpion sting in Arizona since the 1960s. Overly simplistic, but you get the picture. Second, data needs to be interpreted in a cogent way. Let’s say you work in asset management and one of your portfolio managers has made three investments that returned -5%, -12% and 22%. What can you say about the manager (other than that two of the three investments lost money)? Does the information allow you to claim the portfolio manager is a bad manager? Does the information allow you to claim you can confidently predict his/her average rate of return? Unless you’ve had some statistics, it might not be entirely clear what conclusions you can draw. What if you flipped a coin three times and came up with tails on two of them? That wouldn’t seem so strange. Two-thirds is about 66%. If you tossed the coin one hundred times and got 66 tails, that would be a little more interesting. The more observations, the higher our confidence should be. A 95% confidence interval for the portfolio manager’s average return would be a range between -43% and 45%. Is that enough to take action?
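For the three returns above, a standard t-based confidence interval reproduces a range close to the one quoted (small differences come from rounding); the t value for two degrees of freedom is hard-coded below to keep the sketch dependency-free.

```python
import statistics
from math import sqrt

returns = [-5.0, -12.0, 22.0]    # the three investments, in percent
n = len(returns)
mean = statistics.mean(returns)  # about 1.7%
sd = statistics.stdev(returns)   # sample standard deviation, about 18%
t_crit = 4.303                   # 95% two-sided t value with n - 1 = 2 degrees of freedom

margin = t_crit * sd / sqrt(n)
print(f"95% CI: {mean - margin:.0f}% to {mean + margin:.0f}%")  # roughly -43% to 46%
```

Three observations leave so much uncertainty that the interval spans disastrous and spectacular performance alike.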

11. Bayesian analysis is more useful than we think. Bayesian updating helps us reason through false/true positives and false/true negatives. It’s the probability of a hypothesis given some observed data. For example, what’s the likelihood of X (this new hire will place in the top 10% of the firm) given Y (they graduated from an Ivy League school)? A certain percentage of employees are top-performing employees, some Ivy League grads will be top performers (others not), and some non-Ivy League grads will be top performers (others not). If I’m staring at a random employee trying to guess whether they are a top-performing employee, all I have are the starting odds, and, if only the top 10% qualify, I know my chances are 1 in 10. But I can update my odds if supplied information regarding their education. Here’s another example: What is the likelihood a project will be successful (X) given it missed one of the first two milestones (Y)? There are lots of helpful resources online if you want to learn more, but think of it this way (hat tip to Kalid Azad at Better Explained): original odds x the evidence adjustment = your new odds. The actual equation is more complicated, but that is the intuition behind it. Bayesian analysis has its naysayers. In the examples provided, the prior odds of success are known, or could easily be obtained, but this isn’t always true. Most of the time subjective prior probabilities are required, and this type of tomfoolery is generally discouraged. There are ways around that, but no time to explain it here.
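The “original odds x evidence adjustment = new odds” intuition can be written out directly. The hiring numbers below are hypothetical: a 10% base rate of top performers and an assumed likelihood ratio of 3 (an Ivy League degree being three times as common among top performers as among everyone else).

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """New odds = prior odds * evidence adjustment, converted back to a probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Start from the 1-in-10 base rate, then update on the (hypothetical) education signal.
print(posterior_probability(prior=0.10, likelihood_ratio=3.0))  # 0.25
```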

12. A word about crowds. Is there a wisdom of crowds? Some say yes, others say no. My view is that crowds can be very useful if individual members of the crowd are able to vote independently or if the environment is such that there are few repercussions for voicing disagreement. Otherwise, I think signaling effects from seeing how others are “voting” is too much evolutionary force to overcome with sheer rational willpower. Our earliest ancestors ran when the rest of the tribe ran. Not doing so might have resulted in an untimely demise.

13. Analyze your own motives. Jonathan Haidt, author of The Righteous Mind: Why Good People Are Divided by Politics and Religion, is credited with teaching that logic isn’t used to find truth, it’s used to win arguments. Logic may not be the only source of truth (and I have no basis for that claim). Keep this in mind as it has to do with the role of intuition in decision making.

Just a few closing thoughts.

We are pretty hard on ourselves. My process is to make the best decisions I can, realizing not all of them will be optimal. I have a method to track my decisions and to score how accurate I am. Sometimes I use heuristics, but I try to keep those to within my area of competency, as Munger says. I don’t do lists of pros and cons because I feel like I’m just trying to convince myself, either way.

If I have to make a big decision, in an unfamiliar area, I try to learn as much as I can about the issue on my own and from experts, assess how much randomness could be present, formulate my thesis, look for contradictory information, try and build downside protection (risking as little as possible) and watch for signals that may indicate a likely outcome. Many of my decisions have not worked out, but most of them have. As the world changes, so will my process, and I look forward to that.


The Lucretius Problem: How History Blinds Us

The Lucretius Problem is a mental defect where we assume the worst case event that has happened is the worst case event that can happen. In so doing, we fail to understand that the worst event that has happened in the past surpassed the worst event that came before it. Only the fool believes all he can see is all there is to see.


It’s always good to re-read books and to dip back into them periodically. When reading a new book, I often miss out on crucial information (especially books that are hard to categorize with one descriptive sentence). When you come back to a book after reading hundreds of others you can’t help but make new connections with the old book and see it anew. The book hasn’t changed but you have.

It has been a while since I read Antifragile. In the past, I’ve talked about an Antifragile Way of Life, Learning to Love Volatility, the Definition of Antifragility, and the Noise and the Signal.

But upon re-reading Antifragile I came across the Lucretius Problem and I thought I’d share an excerpt. (Titus Lucretius Carus was a Roman poet and philosopher, best-known for his poem On the Nature of Things).

In Antifragile, Nassim Taleb writes:

Indeed, our bodies discover probabilities in a very sophisticated manner and assess risks much better than our intellects do. To take one example, risk management professionals look in the past for information on the so-called worst-case scenario and use it to estimate future risks – this method is called “stress testing.” They take the worst historical recession, the worst war, the worst historical move in interest rates, or the worst point in unemployment as an exact estimate for the worst future outcome​. But they never notice the following inconsistency: this so-called worst-case event, when it happened, exceeded the worst [known] case at the time.

I have called this mental defect the Lucretius problem, after the Latin poetic philosopher who wrote that the fool believes that the tallest mountain in the world will be equal to the tallest one he has observed. We consider the biggest object of any kind that we have seen in our lives or hear about as the largest item that can possibly exist. And we have been doing this for millennia.

Taleb brings up an interesting point, which is that our documented history can blind us. All we know is what we have been able to record. There is an uncertainty that we don’t seem to grasp.

We think that because we have sophisticated data-collecting techniques, we can capture all the data necessary to make decisions. We think we can use our current statistical techniques to draw historical trends from historical data without acknowledging that past record-keepers had fewer tools to capture the dark figure of unreported events. We also overestimate the validity of what has been recorded before, and thus the trends we draw might tell a different story if that dark figure of unreported data were included.

Taleb continues:

The same can be seen in the Fukushima nuclear reactor, which experienced a catastrophic failure in 2011 when a tsunami struck. It had been built to withstand the worst past historical earthquake, with the builders not imagining much worse— and not thinking that the worst past event had to be a surprise, as it had no precedent. Likewise, the former chairman of the Federal Reserve, Fragilista Doctor Alan Greenspan, in his apology to Congress offered the classic “It never happened before.” Well, nature, unlike Fragilista Greenspan, prepares for what has not happened before, assuming worse harm is possible.

Dealing with Uncertainty

Taleb provides an answer: develop layers of redundancy, that is, a margin of safety, to act as a buffer against our own blind spots. We overvalue what we have recorded and assume it tells us the worst and best possible outcomes. Redundant layers are a buffer against our tendency to think that what has been recorded is a map of the whole terrain. An example of a redundant feature is a rainy-day fund, which acts as an insurance policy against something catastrophic, such as a job loss, and allows you to survive and fight another day.

Antifragile is a great book to read, and you might learn something about yourself and the world you live in by reading it or, in my case, re-reading it.

