Tag: Mental Model

How to Use Occam’s Razor Without Getting Cut

Occam’s razor is one of the most useful (yet misunderstood) models in your mental toolbox for solving problems more quickly and efficiently. Here’s how to use it.

***

Occam’s razor (also known as the “law of parsimony”) is a problem-solving principle which serves as a useful mental model. A philosophical razor is a tool used to eliminate improbable options in a given situation. Occam’s is the best-known example.

Occam’s razor can be summarized as follows:

Among competing hypotheses, the one with the fewest assumptions should be selected.

The Basics

In simpler language, Occam’s razor states that the simplest explanation is preferable to one that is more complex. Simple theories are easier to verify. Simple solutions are easier to execute.

In other words, we should avoid looking for excessively complex solutions to a problem and focus on what works given the circumstances. Occam’s razor can be used in a wide range of situations as a means of making rapid decisions and establishing truths without empirical evidence. It works best as a mental model for forming initial conclusions before the full scope of information can be obtained.

Science and math offer interesting lessons that demonstrate the value of simplicity. For example, the principle of minimum energy supports Occam’s razor. This facet of the second law of thermodynamics states that wherever possible, the use of energy is minimized. Physicists use Occam’s razor in the knowledge that they can rely on everything to use the minimum energy necessary to function. A ball at the top of a hill will roll down in order to be at the point of minimum potential energy. The same principle is present in biology. If a person repeats the same action on a regular basis in response to the same cue and reward, it will become a habit as the corresponding neural pathway is formed. From then on, their brain will use less energy to complete the same action.

The History of Occam’s Razor

The concept of Occam’s razor is credited to William of Ockham, a 14th-century friar, philosopher, and theologian. While he did not coin the term, his characteristic way of making deductions inspired other writers to develop the heuristic. Indeed, the concept of Occam’s razor is an ancient one. Aristotle produced the oldest known statement of the concept, saying, “We may assume the superiority, other things being equal, of the demonstration which derives from fewer postulates or hypotheses.”

Robert Grosseteste expanded on Aristotle’s writing in the 1200s, declaring

That is better and more valuable which requires fewer, other circumstances being equal…. For if one thing were demonstrated from many and another thing from fewer equally known premises, clearly that is better which is from fewer because it makes us know quickly, just as a universal demonstration is better than particular because it produces knowledge from fewer premises. Similarly, in natural science, in moral science, and in metaphysics the best is that which needs no premises and the better that which needs the fewer, other circumstances being equal.

Nowadays, Occam’s razor is an established mental model which can form a useful part of a latticework of knowledge.


Examples of the Use of Occam’s Razor

The Development of Scientific Theories

Occam’s razor is frequently used by scientists, in particular for theoretical matters. The simpler a hypothesis is, the more easily it can be proven or falsified. A complex explanation for a phenomenon involves many factors which can be difficult to test or which lead to issues with the repeatability of an experiment. As a consequence, the simplest solution which is consistent with the existing data is preferred. However, it is common for new data to allow hypotheses to become more complex over time. Scientists opt for the simplest solution the current data permits, while remaining open to the possibility that future research will allow for greater complexity.

The version used by scientists can best be summarized as:

When you have two competing theories that make exactly the same predictions, the simpler one is better.

The use of Occam’s razor in science is also a matter of practicality. Obtaining funding for simpler hypotheses tends to be easier, as they are often cheaper to test.

Albert Einstein referred to Occam’s razor when developing his theory of special relativity. He formulated his own version: “It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” Or, “Everything should be made as simple as possible, but not simpler.”

The physicist Stephen Hawking advocates for Occam’s razor in A Brief History of Time:

We could still imagine that there is a set of laws that determines events completely for some supernatural being, who could observe the present state of the universe without disturbing it. However, such models of the universe are not of much interest to us mortals. It seems better to employ the principle known as Occam’s razor and cut out all the features of the theory that cannot be observed.

Isaac Newton used Occam’s razor too when developing his theories. Newton stated: “We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.” He sought to make his theories, including the three laws of motion, as simple as possible, with only the necessary minimum of underlying assumptions.

Medicine

Modern doctors use a version of Occam’s razor, stating that they should look for the fewest possible causes to explain their patient’s multiple symptoms, and give preference to the most likely causes. A doctor we know often repeats the aphorism that “common things are common.” Interns are instructed, “when you hear hoofbeats, think horses, not zebras.” For example, a person displaying influenza-like symptoms during an epidemic would be considered more likely to be suffering from influenza than an alternative, rarer disease. Making minimal diagnoses reduces the risk of over-treating a patient, causing panic, or causing dangerous interactions between different treatments. This is of particular importance within the current medical model, where patients are likely to see numerous health specialists and communication between them can be poor.

Prison Abolition and Fair Punishment

Occam’s razor has long played a role in attitudes towards the punishment of crimes. In this context, it refers to the idea that people should be given the least punishment necessary for their crimes. This is to avoid the excessive penal practices which were popular in the past. For example, a 19th-century English convict could receive five years of hard labor for stealing a piece of food.

The concept of penal parsimony was pioneered by Jeremy Bentham, the founder of utilitarianism. He held that punishments should not cause more pain than they prevent. Life imprisonment for murder could be seen as justified in that it might prevent a great deal of potential pain, should the perpetrator offend again. On the other hand, long-term imprisonment of an impoverished person for stealing food causes substantial suffering without preventing any.

Bentham’s writings on the application of Occam’s razor to punishment led to the prison abolition movement and many modern ideas related to rehabilitation.

Exceptions and Issues

It is important to note that, like any mental model, Occam’s razor is not foolproof. Use it with care, lest you cut yourself. This is especially crucial when it comes to important or risky decisions. There are exceptions to any rule, and we should never blindly follow the results of applying a mental model which logic, experience, or empirical evidence contradict. When you hear hoofbeats behind you, in most cases you should think horses, not zebras—unless you are out on the African savannah.

Furthermore, simple is as simple does. A conclusion can’t rely just on its simplicity. It must be backed by empirical evidence. And when using Occam’s razor to make deductions, we must avoid falling prey to confirmation bias. In the case of the NASA moon landing conspiracy theory, for example, some people consider it simpler for the moon landing to have been faked, others for it to have been real. Lisa Randall best expressed the issues with the narrow application of Occam’s razor in her book, Dark Matter and the Dinosaurs: The Astounding Interconnectedness of the Universe:

Another concern about Occam’s Razor is just a matter of fact. The world is more complicated than any of us would have been likely to conceive. Some particles and properties don’t seem necessary to any physical processes that matter—at least according to what we’ve deduced so far. Yet they exist. Sometimes the simplest model just isn’t the correct one.

This is why it’s important to remember that opting for simpler explanations still requires work. They may be easier to falsify, but falsifying them still requires effort. And the simpler explanation, although it has a higher chance of being correct, is not always true.

Occam’s razor is not intended to be a substitute for critical thinking. It is merely a tool to help make that thinking more efficient. Harlan Coben has disputed many criticisms of Occam’s razor by stating that people fail to understand its exact purpose:

Most people oversimplify Occam’s razor to mean the simplest answer is usually correct. But the real meaning, what the Franciscan friar William of Ockham really wanted to emphasize, is that you shouldn’t complicate, that you shouldn’t “stack” a theory if a simpler explanation was at the ready. Pare it down. Prune the excess.

Remember, Occam’s razor is complemented by other mental models, including the fundamental attribution error, Hanlon’s razor, confirmation bias, the availability heuristic, and hindsight bias. The nature of mental models is that they interlock and work best in conjunction.

Reciprocation Bias

“There are slavish souls who carry their appreciation for favors done them so far that they strangle themselves with the rope of gratitude.”

—Friedrich Nietzsche

***

If you are like me, whenever you receive a favor, you too feel an immense need, almost an obligation, to pay it back in kind.

If a friend invites you over for dinner, you are almost sure to invite them over to your place for dinner as well. It almost seems as if we were meant to do each other favors and, more important, return them.

Have you ever wondered why?

A large part of the reason is that this behavior seems to have strong evolutionary benefits. It’s so pervasive in human culture, it’s believed that there is no society that does not feel reciprocation’s pull. The archaeologist Richard Leakey believes reciprocation is the foundation on which we have evolved: “We are human because our ancestors learned to share their food and their skills in an honored network of obligation.”

The web of indebtedness created by reciprocation allows for the division of tasks, eases the exchange of goods and services, and helps create interdependencies that bind us into units that are more productive than each of us is on our own. Reciprocation allows one person to give something to another with the expectation that the favor will be returned and the giver will not be taken advantage of.

Throughout human history, reciprocation lowered the cost of transactions, as almost everything begins with one person trusting another. Land could be farmed with one person lending seeds to another. Gifts could be given. Currency could be lent. Aid could be given to the weak. Moreover, reciprocation is not a uniquely human concept — it exists in the physical world. Newton’s third law states that for every action there is an equal and opposite reaction. You might push on a wall, but the wall pushes back on you.

There is such an advantage to be gained from reciprocation that it’s become imprinted onto our subconscious. For example, we teach our kids to invite others they may not like to their birthday parties because our kids were invited to those kids’ parties. Deeper still, we negatively label people who violate the rule: untrustworthy, moocher, welsher. Because social sanctions can be tough on those who fail to cooperate, the rule of reciprocity often evokes guilt.

As with most things, however, reciprocation has a darker side. Just as we tend to reciprocate good behavior, sometimes we also pay back bad deeds. One of the most effective game-theory strategies is tit for tat.
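To see why tit for tat performs so well, here is a minimal sketch of the strategy in an iterated prisoner’s dilemma (my own illustration; the payoff matrix and the opponents are standard textbook assumptions, not from this text):

```python
# Tit for tat: cooperate first, then mirror whatever the opponent
# did on the previous round.
PAYOFFS = {  # (my move, their move) -> my points; standard textbook values
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(their_history):
    return "C" if not their_history else their_history[-1]

def play(opponent, rounds=10):
    my_score, my_moves, their_moves = 0, [], []
    for _ in range(rounds):
        mine = tit_for_tat(their_moves)
        theirs = opponent(my_moves)
        my_moves.append(mine)
        their_moves.append(theirs)
        my_score += PAYOFFS[(mine, theirs)]
    return my_score

print(play(lambda history: "C"))  # 30: mutual cooperation every round
print(play(lambda history: "D"))  # 9: exploited once, then retaliation
```

The strategy reciprocates in both directions: it never defects first, but it punishes defection immediately, which is precisely the mix of returning good and bad behavior described here.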

“Pay every debt, as if God wrote the bill.”

— Ralph Waldo Emerson

Hate

The reciprocation of bad behavior is best evidenced in wars. Brutality escalates as each side feels obliged to return the violence it experienced from its counterpart. This spiral can lead to ever more mindlessly destructive behavior, including torture and mass deaths. There are plenty of examples of this negative reciprocation; consider World War II, the Crusades, and the Mongol invasions led by Genghis Khan.

It might seem that humans alone are responsible for so much suffering in the world in a relatively short period of time. However, the reciprocation rule is overarching — the human species is not the only one capable of extreme cruelty. Charlie Munger believes that reciprocal aggression appears to be more the rule than the exception among other species, too:

One interesting mental exercise is to compare Genghis Khan, who exercised extreme, lethal hostility toward other men, with ants that display extreme, lethal hostility toward members of their own species that are not part of their breeding colony. Genghis looks sweetly lovable when compared to the ants. The ants are more disposed to fight and fight with more extreme cruelty.

If the reciprocation rule is so overpowering, the natural question here would be, is there a way we can still control our response to it?

Munger advises us to train our patience.

The standard antidote to one’s overactive hostility is to train oneself to defer reaction. As my smart friend Tom Murphy so frequently says, “You can always tell the man off tomorrow if it is such a good idea.”

There’s also another way. Because the reciprocation tendency is so extreme, we can reverse the course of events by doing good rather than harm to the other party.

In World War I in particular, the fighting sometimes paused after a feedback loop of restraint took hold, with each side reciprocating the other’s reduced aggression. Here is how a British staff officer described his surprise at the degree of trust between the British and German soldiers:

[I was] astonished to observe German soldiers walking about within rifle range behind their own line. Our men appeared to take no notice. I privately made up my mind to do away with that sort of thing when we took over; such things should not be allowed. These people evidently did not know there was a war on. Both sides apparently believed in the policy of “live and let live.” (Dugdale 1932, p. 94)

Such behavior was not restricted to this one case, but was rather common in trench warfare during the later stages of the war.

And this makes me think that if such things could happen even during a war, there is little doubt that we could improve our relationships by doing a little undeserved good for the other person.

Love

Reciprocation is just as important in breeding love as it is in breeding hate.

Andy Warhol said, in The Philosophy of Andy Warhol (From A to B and Back Again):

Love affairs get too involved, and they’re not really worth it. But if, for some reason, you feel that they are, you should put in exactly as much time and energy as the other person. In other words, “I’ll pay you if you pay me.”

This is the reciprocation tendency at its finest. Truth is, love and marriage would lose much of their allure if there were no reciprocation tendency among partners. By loving, we literally may become loved.

As lovers and spouses, we promise loyalty to our partners and we expect it to be returned. We are encouraged to practice the virtues of marriage in front of not only our partners, but also society. These effects reinforcing each other can be thought of as the fabric of many of today’s relationships.

Furthermore, reciprocation not only holds us together, but can also bring us together in the first place. Displaying generosity can be a powerful way to advance a relationship by setting up implicit expectations of compliance from the other person.

Women, in particular, often report on the pressure they feel after receiving expensive gifts or dinners. In Influence, professor of psychology Robert Cialdini quotes the words of one of his (female) students:

After learning the hard way, I no longer let a guy I meet in a club buy me a drink because I don’t want either of us to feel that I am obligated sexually.

Perhaps the key to genuine relationships lies at least partially in each party being kind without expectations. Indeed, in communal relationships like marriage, friendship, and the parent-child relationship, the accounting is unnecessary, and if you think about it, you’ll see that it is hardly ever practiced.

What is exchanged reciprocally instead is the near-unconditional willingness to provide what the other side needs, when it is needed. Still, some symmetry seems to be best; even in close friendships, strong inequalities will eventually make themselves noticed.

Abusing Reciprocity

As with any human tendency, reciprocity holds a great potential for abuse. Charlie Munger recalls how the eccentric hedge-fund manager Victor Niederhoffer managed to get good grades with an impressive course load when he was an undergraduate student at Harvard.

Contrary to what one may expect, Niederhoffer was not a very hard-working student. Instead of studying, he liked spending his time playing world-class checkers, gambling in high-stakes card games, and playing amateur-level tennis and professional-level squash. So how did he manage to get those good grades?

Munger explains:

He thought he was up to outsmarting the Harvard Economics Department. And he was. He noticed that the graduate students did most of the boring work that would otherwise go to the professors, and he noticed that because it was so hard to get to be a graduate student at Harvard, they were all very brilliant and organized and hard working, as well as much needed by grateful professors.

And therefore, by custom, and as would be predicted from the psychological force called reciprocity tendency, in a really advanced graduate course, the professors always gave an A. So Victor Niederhoffer signed up for nothing but the most advanced graduate courses in the Harvard Economics Department, and of course, he got A, after A, after A, after A, and was hardly ever near a class. And for a while, some people at Harvard may have thought it had a new prodigy on its hands. That’s a ridiculous story, but the scheme will work still. And Niederhoffer is famous: they call his style “Niederhoffering the curriculum.”

There are cases that are less innocent than Niederhoffer’s gaming the system. For example, when a salesman offers us a cup of coffee with cookies, we are likely to be subconsciously tricked into compliance by even such a minor favor, which combines reciprocity and association. Buying can be just as much about the actual experience as it is about acquiring goods at an optimal price, and salesmen know this.

Your Costs Are My Benefits

In our personal expenses, we are the ones suffering from our follies, but an important problem arises when we buy on someone else’s behalf. Imagine that you are the purchasing agent for an employer. Now the extra costs that are paid in return for the minor favor you receive are incurred not by you but by your employer.

Gifts and favors tend to create perverse incentives on the purchaser’s part and allow the seller to maximize his advantage. Smart employers know this and therefore do not allow their purchasing personnel to accept gifts. Sam Walton is one notable example; he wouldn’t let Walmart’s purchasing agents accept even a hot dog from a vendor.

The exchange of favors at another’s expense is not restricted to purchasing on someone’s behalf.

Munger notes that the reciprocation tendency can also be held responsible for some wicked pay dynamics in the boardroom of public companies:

It’s incredible the reciprocity that happens when CEOs keep recommending that directors get paid more, and then the directors raise the CEO’s pay — it’s a big game of pitty pat. And then they hire compensation consultants to make sure no-one else is getting paid more. This is true even if the CEO is a klutz and a little dishonorable. I think the existing system is very bad and my system would work better, but it’s not going to happen.

In order to prevent these dynamics, he suggests that boards of directors not be paid at all.

I think tons of eminent people would serve on boards of companies like Exxon without being paid. The lower courts in England are run by unpaid magistrates. And Harvard is run by boards of people who don’t get paid — in fact, they have to pay [in the form of donations to the school]. I think boards would be better if they were run like Berkshire Hathaway’s.

For these same reasons, Munger believes that the reciprocity tendency should be part of the compulsory law curriculum; otherwise, students may unknowingly steer away from representing their clients’ best interests. Ignorance of the reciprocation rule may explain why malpractice still occurs even among lawyers with the best intentions. The law schools simply don’t know, or care to teach, what Sam Walton knew so well.

The Concession

Besides the obvious doing of favors, there is a more subtle technique that may lure us into reciprocal and cooperative behavior. Robert Cialdini recalls an incident that made him aware of it:

I was walking down the street when I was approached by an 11- or 12-year-old boy. He introduced himself and said he was selling tickets to the annual Boy Scouts Circus to be held on the upcoming Saturday night. He asked if I wished to buy any tickets at $5 apiece. Since one of the last places I wanted to spend Saturday evening was with the Boy Scouts, I declined. “Well,” he said, “if you don’t want to buy any tickets, how about buying some of our chocolate bars? They’re only $1 each.”

Cialdini automatically bought two chocolates and immediately realized that something was wrong:

I knew that to be the case because (a) I do not like chocolate bars; (b) I do like dollars; (c) I was standing there with two of his chocolate bars; and (d) he was walking away with two of my dollars.

After meeting with his research assistants and conducting experiments with a similar setup on his students, Cialdini arrived at a rule that explains this behavior:

The person who acts in a certain way toward us is entitled to a similar return action.

This rule has two consequences:

  1. We feel obliged to repay favors we have received.
  2. We feel obliged to make a concession to someone who has made a concession to us.

As Cialdini and his research group reflected, they increasingly saw that the Boy Scout had brought him under the rule. The request to purchase the chocolates was introduced as a concession — a retreat from the request that Cialdini buy some $5 tickets.

If Cialdini was to live up to the dictates of the reciprocation rule, there had to be a concession on his part. And there was — after all, Cialdini moved from rejection to compliance after the boy had moved from a larger to a smaller request. The remarkable thing, and this is where bias comes in, was that Cialdini was not at all interested in either of the things the boy had offered.

Why would this rule be so important? Because it can lead to a lot of unnecessary trouble.

Both Cialdini and Munger believe that a subconscious reciprocation tendency was an important lever that allowed Watergate, one of the biggest political scandals in history, to occur.

Breaking into the Watergate offices of the Democratic party was a plan that was conceived by G. Gordon Liddy, an aggressive subordinate with a questionable reputation. Liddy pulled the same trick on his superiors that the twelve-year-old boy did on Cialdini. The $250,000 break-in plan was not the first that Liddy proposed — it was a significant concession from the previous two. The first of these plans, for $1 million, entailed a program that included a specially equipped “chase plane,” break-ins, kidnapping and mugging squads, and a yacht featuring “high-class call girls,” all meant to blackmail the Democratic politicians.

The second plan was a little more modest, at half the initial price and with reductions in the program. After the two initial plans were rejected by his superiors, Liddy submitted the third, “bare bones” plan, which was a little less stupid and cost “a mere” quarter of the initial price.

Do you see what Liddy did there?

Unsurprisingly, his superiors gave in; eventually, the plan was approved and it started the snowball that caused Nixon to resign. As the Watergate example illustrates, an unwatched reciprocation tendency may subtly cause mindless behavior with many extreme or dangerous consequences.

***

One of the reasons reciprocation can be used so effectively as a device for gaining another’s compliance is that it combines power and subtlety. Especially in its concessionary form, the reciprocation rule often produces a yes response to a request that otherwise would surely have been refused.

I hope that the next time you come across a situation where you feel the need to return a favor, you will think twice about the possible consequences of accepting it in the first place. You may think, for example, that someone offering you a free pen will not influence you at all, but there is an entire human history arguing otherwise. Perhaps Sam Walton’s policy, of not accepting favors at all in matters where impartiality is preferred, is best.

Yet there is some truth to saying that reciprocal behavior also represents the best part of human nature. There are times when successful trade, good friendships, and even romantic relationships develop out of the need to feel symmetrical in our relationships. Indeed, it could well be that the very best parts of our lives lie in relationships of affection in which both we and the other party want to please each other.

Bias from Association: Why We Shoot the Messenger

We automatically connect a stimulus (a thing or person) with pain (fear) or pleasure (hope). As pleasure-seeking animals, we seek out positive associations and attempt to remove negative ones. This happens when we experience the positive or negative consequences of a stimulus. The more vivid the event, the easier it is to remember. Brands (including people) attempt to influence our behavior by associating themselves with positive things.

***

Bias from Association

Our lives and memories revolve around associations. The smell of a good lunch makes our stomachs growl, the songs we hear remind us of special times we have had, and horror movies leave us with goosebumps.

These natural, uncontrolled responses to a specific signal are examples of classical conditioning. Classical conditioning (in simple terms, learning by association) was discovered by the Russian scientist Ivan Petrovich Pavlov. Pavlov was a physiologist whose work on digestion in dogs won him a Nobel Prize in 1904.

In the course of his work in physiology, Pavlov made an accidental observation that dogs started salivating even before their food was presented to them.

With repeated testing, he noticed that the dogs began to salivate in anticipation of a specific signal, such as the footsteps of their feeder or, if conditioned that way, even the sound of a tone.

Pavlov’s genius lay in his ability to understand the implications of his discovery. He knew that dogs have a natural reflex of salivating to food but not to footsteps or tones. He was on to something. Pavlov realized that if coupling the two signals together induced the same reflexive response in dogs, then other physical reactions might be inducible via similar associations.

In effect, with Pavlovian association, we respond to a stimulus because we anticipate what comes next: the reality that would make our response correct.

Now things get interesting.

Rules of Conditioning

Suppose we want to condition a dog to salivate to a tone. If we sound the tone without having taught the dog to specifically respond, the ears of the dog might move, but the dog will not salivate. The tone is just a neutral stimulus, at this point. On the other hand, food for the dog is an unconditioned stimulus, because it always makes the dog salivate.

If we now pair the arrival of food and the sound of the tone, we elicit a learning trial for the dog. After several such trials, the association develops and is strong enough to make the dog salivate even though there is no food. The tone, at this point, has become a conditioned stimulus. This is learned hope. Learned fear is more easily acquired.

The speed and degree to which the dog learns to display the response will depend on several factors.

The best results come when the conditioned stimulus is paired with the unconditioned one several times. This develops a strong association. It takes time for our brains to detect specific patterns.
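One standard way to make this concrete (my addition; the textbook Rescorla-Wagner learning rule, which the article itself does not invoke) is to model the strength of the association as closing a fixed fraction of the remaining gap on each pairing:

```python
# A sketch of the Rescorla-Wagner rule: v is the strength of the
# tone-food association; alpha is the learning rate; lam is the
# maximum association strength the food supports.
def condition(trials, alpha=0.3, lam=1.0):
    v = 0.0
    history = []
    for _ in range(trials):
        v += alpha * (lam - v)  # each tone+food pairing closes part of the gap
        history.append(round(v, 3))
    return history

print(condition(8))
# [0.3, 0.51, 0.657, 0.76, 0.832, 0.882, 0.918, 0.942]
```

Early pairings produce large gains and later ones add less, which is why several trials beat a single one and why the association eventually levels off at a ceiling.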

Classical conditioning involves automatic or reflexive responses and not voluntary behavior.

There are also cases in which this principle does not apply. When we undergo high-impact events, such as a car crash, a robbery, or being fired from a job, a single event can be enough to create a strong association.

Why We Shoot The Messenger

One of our goals should be to understand how the world works. A necessary condition for this is understanding our problems. However, sometimes people are afraid to tell us about our problems.

This is also known as the Pavlovian Messenger Syndrome.

The original messenger wasn’t shot; he was beheaded. In Plutarch’s Lives, we find:

The first messenger that gave notice of Lucullus’ coming was so far from pleasing Tigranes that he had his head cut off for his pains; and no man dared to bring further information. Without any intelligence at all, Tigranes sat while war was already blazing around him, giving ear only to those who flattered him.

This happens countless times in organizations. A related sentiment exists in Antigone by Sophocles: “No one loves the messenger who brings bad news.”

In a lesson on elementary, worldly wisdom, Charlie Munger said:

If people tell you what you really don’t want to hear — what’s unpleasant — there’s an almost automatic reaction of antipathy. You have to train yourself out of it. It isn’t foredestined that you have to be this way. But you will tend to be this way if you don’t think about it.

In Antony and Cleopatra, when told Antony has married another, Cleopatra threatens to treat the messenger poorly, eliciting the response “Gracious madam, I that do bring the news made not the match.”

And the advice “Don’t shoot the messenger” appears in Henry IV, Part 2.

If you happen to be the messenger, it might be best to deliver the news first in writing and to appear in person later, to minimize the negative feelings towards you.

If, on the other hand, you’re the receiver of bad news, it’s best to follow the advice of Warren Buffett, who comments on being informed of bad news:

We only give a couple of instructions to people when they go to work for us: One is to think like an owner. And the second is to tell us bad news immediately — because good news takes care of itself. We can take bad news, but we don’t like it late.

Pavlov showed that sequence matters: the association is most clear to us when the conditioned stimulus appears first and remains after the unconditioned stimulus is introduced.

Unsurprisingly, our learned responses become weaker if the two stimuli are introduced at the same time, and weaker still if they are presented in reverse order (unconditioned stimulus before conditioned stimulus).

Attraction and Repulsion

There’s no doubt that classical conditioning influences what attracts us and even what arouses us. Most of us will recognize that images and videos of kittens make our hearts softer, and that perfume or a look from our partner can make our hearts beat faster.

Charlie Munger explains the case of Coca-Cola, whose marketing and product strategy is built on strong foundations of conditioning.

Munger walks us through the creation of the brand by using conditioned reflexes:

The neural system of Pavlov’s dog causes it to salivate at the bell it can’t eat. And the brain of man yearns for the type of beverage held by the pretty woman he can’t have. And so, Glotz, we must use every sort of decent, honorable Pavlovian conditioning we can think of. For as long as we are in business, our beverage and its promotion must be associated in consumer minds with all other things consumers like or admire.

By repeatedly pairing a product or brand with a favorable impression, we can turn it into a conditioned stimulus that makes us buy.

This goes even beyond advertising — conditioned reflexes are also embedded in Coca-Cola’s name. Munger continues:

Considering Pavlovian effects, we will have wisely chosen the exotic and expensive-sounding name “Coca-Cola,” instead of a pedestrian name like “Glotz’s Sugared, Caffeinated Water.”

And even texture and taste:

And we will carbonate our water, making our product seem like champagne, or some other expensive beverage, while also making its flavor better and imitation harder to arrange for competing products.

Combining these and other clever, non-Pavlovian techniques leads to what Charlie Munger calls the lollapalooza effect, which has caused so many consumers to buy that Coca-Cola has been an excellent business for over a century.

While Coca-Cola has some of its advantages rooted in positive Pavlovian associations, there are cases where associations do no good. In childhood, many of us were afraid of doctors or dentists because we quickly learned to associate these visits with pain.

While we may have lost our fear of dentists by now, many of us experience similarly unpleasant feelings when opening a letter from the police or anticipating a negative performance review.

Constructive criticism can be one of life’s great gifts and an engine for improvement; however, before we can benefit from it, we must be prepared for some of it to hurt. If we are not at least implicitly aware of the conditioning phenomenon when people tell us what we don’t want to hear, we may develop a certain dislike of those delivering the news.

The number of people in leadership positions unable to detach the information from the messenger can be truly surprising. In The Psychology of Human Misjudgment, Munger tells of William Paley, the former CEO of CBS, who had a blind spot for ideas that did not align with his views.

Television was dominated by one network, CBS, in its early days. And Paley was a god. But he didn’t like to hear what he didn’t like to hear, and people soon learned that. So they told Paley only what he liked to hear. Therefore, he was soon living in a little cocoon of unreality and everything else was corrupt, although it was a great business.

In the case of Paley, his inability to take criticism and recognize incentives was soon noticed by those around him, and it resulted in sub-optimal outcomes.

… If you take all the acquisitions that CBS made under Paley after the acquisition of the network itself, with all his dumb advisors (his investment bankers, management consultants, and so forth, who were getting paid very handsomely), it was absolutely terrible.

Paley is by no means the only example of such dysfunction in the high ranks of business. In fact, the higher up you are in an organization, the more people fear telling you the truth. Providing sycophants with positive reinforcement will only encourage this behavior and ensure you’re insulated from reality.

To make matters worse, as we move up in seniority, we also tend to become more confident about our own judgments being correct. This is a dangerous tendency, but we need not be bound by it.

We can train ourselves out of it with reflection and effort.

Escaping Associations

There is no doubt that learning via associations is crucial for our survival — it alerts us to the arrival of an important event and gives us time to prepare the appropriate response.

Sometimes, however, learned associations do not serve us and our relationships well. We find that we have become subject to negative responses in others or recognize unreasonable responses in ourselves.

Awareness and understanding may serve as good first steps. Yet, even when taken together, they may not be sufficient to unlearn some of the more stubborn associations. In such cases, we may want to try several known techniques to desensitize them or reverse their negative effects.

One way to proceed is via habituation.

When we habituate someone, we blunt their conditioned response by exposing them to the stimulus continuously. After a while, they simply stop responding. This loss of interest is a natural learning response that allows us to conserve energy for stimuli that are unfamiliar and therefore draw the mind’s attention.

Continuous exposure can be powerful enough to make us fully indifferent even to stimuli as strong as violence and death.

In Man’s Search for Meaning, Viktor Frankl, a Holocaust survivor, tells of experiencing absolute desensitization to the most horrific events imaginable:

Disgust, horror and pity are emotions that our spectator [Frankl] could not really feel any more. The sufferers, the dying and the dead, became such commonplace sights to him after a few weeks of camp life that they could not move him any more.

Of course, habituation can also serve good motives, such as getting ourselves over fear, overcoming trauma, or harmonizing relationships by making each side less sensitive to the other side’s vices. However, as powerful as habituation is, we must recognize its limitations.

If we want someone to respond differently rather than become indifferent, flooding them with stimuli will not help us achieve our aims.

Consider the case of teaching children – the last thing we would want is to make them indifferent to what we say. Therefore, instead of habituation, we should employ another strategy.

A technique frequently used in coaching, exposure therapy, involves cutting back our criticism for a while and then reintroducing it gradually, raising the person’s threshold for becoming defensive.

The key difference between exposure therapy and habituation lies in being subtle rather than blunt.

If we want to avoid forming negative associations while still achieving behavioral change, we will want the ratio of positive to negative feedback to favor the positive. This is why we so often provide feedback in a “sandwich,” where a positive remark is followed by what must be improved and then finished with another positive remark.

Aversion therapy is the exact opposite of exposure therapy.

Aversion therapy aims to replace a positive association with a negative one within a few high-impact events. For example, some parents try to cure a child’s sweet tooth by having the child consume an enormous amount of sweets in one sitting under their supervision.

While ethically questionable, this idea is not completely unfounded.

If the experience is traumatic enough, the positive associations of, for example, a sugar high, will be replaced by the negative association of nausea and sickness.

This controversial technique was used in experiments with alcoholics. While effective in theory, it yielded only mixed results in practice, with patients often relapsing over time.

This is also why there are gross and terrifying pictures on cigarette packages in many countries.

Overall, creating habits that last or permanently breaking them can be a tough mission to embark upon.

In the case of feedback, we may try to associate our presence with positive stimuli, which is why building great first impressions and appearing friendly matters.

Keep in Mind

When thinking about this bias, it’s important to keep in mind that:

  1. People are neither good nor bad simply because we associate something positive or negative with them.
  2. Bad news should be sought immediately, and your reaction to it will dictate how much of it you hear.
  3. To end a certain behavior or habit, you can create an association with a negative emotion.

***


Mental Model: Misconceptions of Chance


We expect the immediate outcome of events to represent the broader outcomes expected from a large number of trials. We believe that chance events will immediately self-correct and that small sample sizes are representative of the populations from which they are drawn. All of these beliefs lead us astray.

***

 

Our understanding of the world around us is imperfect, and when dealing with chance, our brains tend to come up with ways to cope with the unpredictable nature of our world.

“We tend,” writes Peter Bevelin in Seeking Wisdom, “to believe that the probability of an independent event is lowered when it has happened recently or that the probability is increased when it hasn’t happened recently.”

In short, we believe an outcome is due and that chance will self-correct.

The problem with this view is that nature has neither a sense of fairness nor a memory. We only fool ourselves when we believe that past independent events influence, or meaningfully predict, future events.

Furthermore, we also mistakenly believe that we can control chance events. This applies to both risky and uncertain events.

Chance events coupled with positive or negative reinforcement can be a dangerous thing. Sometimes we become optimistic and think our luck will change; sometimes we become overly pessimistic or risk-averse.

How do you know if you’re dealing with chance? A good heuristic is to ask yourself if you can lose on purpose. If you can’t, you’re likely far into the chance side of the skill vs. luck continuum. No matter how hard you practice, the probability of chance events won’t change.

“We tend,” writes Nassim Taleb in The Black Swan, “to underestimate the role of luck in life in general (and) overestimate it in games of chance.”

We are only discussing independent events. If events are dependent, where the outcome depends on the outcome of some other event, all bets are off.

Misconceptions of Chance

Daniel Kahneman coined the term misconceptions of chance to describe the phenomenon of people extrapolating large-scale patterns to samples of a much smaller size. Our trouble navigating the sometimes counterintuitive laws of probability, randomness, and statistics leads to misconceptions of chance.

Kahneman found that “people expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short.”

In the paper Belief in the Law of Small Numbers, Kahneman and Tversky reflect on the results of an experiment, where subjects were instructed to generate a random sequence of hypothetical tosses of a fair coin.

They [the subjects] produce sequences where the proportion of heads in any short segment stays far closer to .50 than the laws of chance would predict. Thus, each segment of the response sequence is highly representative of the “fairness” of the coin.

Unsurprisingly, the same kinds of errors occurred when the subjects, instead of being asked to generate sequences themselves, were simply asked to distinguish between random and human-generated sequences. It turns out that when considering tosses of a coin for heads or tails, people regard the sequence H-T-H-T-T-H as more likely than the sequence H-H-H-T-T-T, which does not appear random, and also more likely than the sequence H-H-H-H-T-H. In reality, each of those sequences has exactly the same probability of occurring. This is a misconception of chance.

The aspect that most of us find so hard to grasp about this case is that any specific sequence of the same length is just as likely to occur in a random process. For example, the probability of getting 5 tails in a row is 0.03125: 0.5 (the probability of a specific outcome on each trial) raised to the power of 5 (the number of trials).

The same rule applies to specific sequences such as H-H-T-H-T or T-H-T-H-T – each probability is again obtained by taking 0.5 (the probability of a specific outcome on each trial) to the power of 5 (the number of trials), which equals 0.03125.
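A few lines of code confirm the arithmetic (a minimal sketch, enumerating all 32 possible five-toss sequences):

```python
from itertools import product

# Every specific five-toss sequence of a fair coin is equally likely.
sequences = ["".join(seq) for seq in product("HT", repeat=5)]

p_each = 0.5 ** 5                # probability of any one specific sequence
print(p_each)                    # 0.03125
print(len(sequences) * p_each)   # 1.0 -- the 32 sequences cover all outcomes

# All-tails and a perfectly alternating run are equally rare.
print(sequences.count("TTTTT"), sequences.count("HTHTH"))  # 1 1
```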

This holds for any specific sequence – but it implies nothing about whether a short sequence will reflect the true proportion of heads and tails.

Yet it still surprises us. This is because people expect the single-event odds to be reflected not only in the proportion of events as a whole but also in each specific short sequence we encounter. But this is not the case. A perfectly alternating sequence is just as extraordinary as a sequence of all tails or all heads.

In comparison, “a locally representative sequence,” Kahneman writes, in Thinking, Fast and Slow, “deviates systematically from chance expectation: it contains too many alternations and too few runs. Another consequence of the belief in local representativeness is the well-known gambler’s fallacy.”

Gambler’s Fallacy

There is a specific variation of the misconceptions of chance that Kahneman calls the gambler’s fallacy (elsewhere also called the Monte Carlo fallacy).

The gambler’s fallacy implies that when we come across a local imbalance, we expect future events to smooth it out. We act as if every segment of the random sequence must reflect the true proportion and, if the sequence has deviated from the population proportion, we expect the imbalance to soon be corrected.

Kahneman explains that this is unreasonable – coins, unlike people, have no sense of equality and proportion:

The heart of the gambler’s fallacy is a misconception of the fairness of the laws of chance. The gambler feels that the fairness of the coin entitles him to expect that any deviation in one direction will soon be cancelled by a corresponding deviation in the other. Even the fairest of coins, however, given the limitations of its memory and moral sense, cannot be as fair as the gambler expects it to be.

He illustrates this with an example of the roulette wheel and our expectations when a reasonably long sequence of repetition occurs.

After observing a long run of red on the roulette wheel, most people erroneously believe that black is now due, presumably because the occurrence of black will result in a more representative sequence than the occurrence of an additional red.

In reality, of course, roulette is a random, memoryless process, in which the chance of getting a red or a black never depends on the past sequence. The probabilities reset with each spin, yet we still seem to take the past spins into account.

Contrary to our expectations, the universe keeps no accounting of a random process, so streaks are not necessarily tilted towards the true proportion. Your chance of getting a red after a series of blacks will always be equal to that of getting another black, as long as the wheel is fair.
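If this still feels counterintuitive, a quick simulation may help. The sketch below (illustrative, assuming a two-color wheel with no green zero) checks what happens immediately after a run of five reds on a fair wheel:

```python
import random

random.seed(42)
spins = [random.choice("RB") for _ in range(1_000_000)]

# Collect the spin that immediately follows every run of five reds.
followers = [spins[i + 5] for i in range(len(spins) - 5)
             if spins[i:i + 5] == ["R"] * 5]

# The share of reds after five reds stays near 0.5 -- black is never "due".
print(sum(s == "R" for s in followers) / len(followers))  # ~0.5
```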

The gambler’s fallacy need not be committed only inside the casino. Many of us commit it frequently by thinking that a small, random sample will tend to correct itself.

For example, assume that the average IQ in a specific country is known to be 100, and that for the purposes of assessing intelligence in a specific district, we draw a random sample of 50 people. The first person in our sample happens to have an IQ of 150. What would you expect the mean IQ to be for the whole sample?

The correct answer is (100*49 + 150*1)/50 = 101. Yet without knowing the correct answer, it is tempting to say it is still 100 – the same as in the country as a whole.
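The expectation can be checked in one line (a sketch of the arithmetic above):

```python
# One observed IQ of 150, plus 49 remaining draws whose expected
# value is the population mean of 100.
print((150 * 1 + 100 * 49) / 50)  # 101.0
```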

According to Kahneman and Tversky, such an expectation could only be justified by the belief that a random process is self-correcting and that sample variation is always proportional. They explain:

Idioms such as “errors cancel each other out” reflect the image of an active self-correcting process. Some familiar processes in nature obey such laws: a deviation from a stable equilibrium produces a force that restores the equilibrium.

Indeed, this may be true in thermodynamics, chemistry, and arguably also economics. These, however, are false analogies. It is important to realize that processes governed by chance are not guided by principles of equilibrium, and random outcomes in a sequence do not balance each other out.

“Chance,” Kahneman writes in Thinking, Fast and Slow, “is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium. In fact, deviations are not “corrected” as a chance process unfolds, they are merely diluted.”

The Law of Small Numbers

Misconceptions of chance are not limited to gambling. In fact, most of us fall for them all the time because we intuitively believe (and there is a whole best-seller section at the bookstore to prove it) that inferences drawn from small sample sizes are highly representative of the populations from which they are drawn.

By illustrating people’s expectations of random heads-and-tails sequences, we have already established that we have preconceived notions of what randomness looks like. This, coupled with the unfortunate tendency to believe in a self-correcting process within a random sample, generates expectations about sample characteristics and representativeness which are not necessarily true. The expectation that the patterns and characteristics of a small sample will be representative of the population as a whole is called the law of small numbers.

Consider the sequence:

1, 2, 3, _, _, _

What do you think are the next three digits?

The task almost seems laughable, because the pattern is so familiar and obvious: 4, 5, 6. However, there is an endless variety of algorithms that would still fit the first three numbers, such as the Fibonacci sequence (5, 8, 13), a repeating sequence (1, 2, 3), a random sequence (5, 8, 2), and many others. The truth is, in this case there simply is not enough information to say with any reliability what rules govern this specific sequence.

The same rule applies to sampling problems – sometimes we feel we have gathered enough data to tell a real pattern from an illusion. Let me illustrate this fallacy with yet another example.

Imagine that you face a tough decision between investing in the development of two different product opportunities. Let’s call them Product A and Product B. You are interested in which product would appeal to the majority of the market, so you decide to conduct customer interviews. Out of the first five pilot interviews, four customers show a preference for Product A. While the sample size is quite small, given the time pressure involved, many of us would already feel some confidence in concluding that the majority of customers would prefer Product A.

However, a quick statistical test will tell you that the probability of a result at least this extreme is in fact 3/8, assuming that customers have no preference at all. In simple terms, this means that if customers had no preference between Products A and B, you would still expect about 3 out of every 8 such five-person samples to be at least this lopsided, with four or more customers vouching for one of the two products.
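Here is the calculation behind that 3/8 (a sketch under the stated null hypothesis that customers have no preference, so each of the five interviewees picks A or B with probability 0.5):

```python
from math import comb

# Probability of a split at least as lopsided as 4-of-5,
# counting both directions (favoring A or favoring B).
n = 5
p_lopsided = 2 * sum(comb(n, k) for k in (4, 5)) * 0.5 ** n

print(p_lopsided)  # 0.375, i.e. 3/8
```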

Basically, a study of this size has little to no predictive validity – these results could easily be obtained from a population with no preference for one product or the other. This, of course, does not mean that talking to customers is of no value. Quite the contrary – the more random cases we examine, the more reliable and accurate our estimate of the true proportion will be. If we want absolute certainty, we must be prepared for a lot of work.

There will always be cases where a guesstimate based on a small sample will be enough because we have other critical information guiding the decision-making process or we simply do not need a high degree of confidence. Yet rather than assuming that the samples we come across are always perfectly representative, we must treat random selection with the suspicion it deserves. Accepting the role imperfect information and randomness play in our lives and being actively aware of what we don’t know already makes us better decision makers.

Bias from Overconfidence: A Mental Model


“What a Man wishes, he will also believe” – Demosthenes

Bias from overconfidence is a natural human state. All of us believe good things about ourselves and our skills. As Peter Bevelin writes in Seeking Wisdom:

Most of us believe we are better performers, more honest and intelligent, have a better future, have a happier marriage, are less vulnerable than the average person, etc. But we can’t all be better than average.

This inherent base rate of overconfidence is especially strong when projecting our beliefs about our future. Over-optimism is a form of overconfidence. Bevelin again:

We tend to overestimate our ability to predict the future. People tend to put a higher probability on desired events than undesired events.

The bias from overconfidence is insidious because of how many factors can create and inflate it. Emotional, cognitive, and social factors all influence it. Emotional, because it is painful to believe bad things about ourselves or about our lives.

The emotional and cognitive distortion that creates overconfidence is a dangerous and almost unavoidable accompaniment to any form of success.

Roger Lowenstein writes in When Genius Failed, “there is nothing like success to blind one to the possibility of failure.”

In Seeking Wisdom Bevelin writes:

What tends to inflate the price that CEOs pay for acquisitions? Studies found evidence of infection through three sources of hubris: 1) overconfidence after recent success, 2) a sense of self-importance; the belief that a high salary compared to other senior-ranking executives implies skill, and 3) the CEO’s belief in their own press coverage. The media tend to glorify the CEO and over-attribute business success to the role of the CEO rather than to other factors and people. This makes CEOs more likely to become both more overconfident about their abilities and more committed to the actions that made them media celebrities.

This isn’t an effect confined to CEOs and large transactions. This feedback loop happens every day between employees and their managers. Or between students and professors, even peers and spouses.

Perhaps the most surprising, pervasive, and dangerous reinforcer of overconfidence is social incentives. Take a look at this example of social pressures on doctors, from Kahneman in Thinking, Fast and Slow:

Generally, it is considered a weakness and a sign of vulnerability for clinicians to appear unsure. Confidence is valued over uncertainty and there is a prevailing censure against disclosing uncertainty to patients.

An unbiased appreciation of uncertainty is a cornerstone of rationality—but that is not what people and organizations want. Extreme uncertainty is paralyzing under dangerous circumstances, and the admission that one is merely guessing is especially unacceptable when the stakes are high. Acting on pretended knowledge is often the preferred solution.

And what about those who don’t succumb to this social pressure and refuse to let the overconfidence bias run wild?

Kahneman writes, “Experts who acknowledge the full extent of their ignorance may expect to be replaced by more confident competitors, who are better able to gain the trust of the clients.”

It’s important to structure environments that allow for uncertainty, or the system will reward the most overconfident, not the most rational, of the decision-makers.

Making perfect forecasts isn’t the goal; self-awareness is, in the form of wide confidence intervals. Kahneman again, in Thinking, Fast and Slow:

For a number of years, professors at Duke University conducted a survey in which the chief financial officers of large corporations estimated the returns of the S&P index over the following year. The Duke scholars collected 11,600 such forecasts and examined their accuracy. The conclusion was straightforward: financial officers of large corporations had no clue about the short-term future of the stock market; the correlation between their estimates and the true value was slightly less than zero! When they said the market would go down, it was slightly more likely than not that it would go up. These findings are not surprising. The truly bad news is that the CFOs did not appear to know that their forecasts were worthless.

You don’t have to be right. You just have to know that you’re not very likely to be right.

As always with the lollapalooza effect of overlapping, combining, and compounding psychological effects, this one has powerful partners in some of our other mental models. Overconfidence bias is often caused or exacerbated by: doubt-avoidance, inconsistency-avoidance, incentives, denial, believing-first-and-doubting-later, and the endowment effect.

So what are the ways of restraining overconfidence bias?

One is the discipline to apply basic math, as prescribed by Munger: “One standard antidote to foolish optimism is trained, habitual use of the simple probability math of Fermat and Pascal, taught in my youth to high school sophomores.” (Pair with Fooled by Randomness).
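
As a minimal sketch of what that “simple probability math” looks like in practice (the scenario and numbers below are hypothetical), consider a plan whose success requires several independent steps to all go right:

```python
# Hypothetical scenario: a plan that requires several independent steps,
# each of which we are "90% sure" will succeed.
p_step = 0.90
steps = 7

# Probabilities of independent events multiply (Fermat and Pascal's insight).
p_all_succeed = p_step ** steps
print(f"P(every step succeeds) = {p_all_succeed:.2f}")      # about 0.48
print(f"P(at least one fails)  = {1 - p_all_succeed:.2f}")  # about 0.52
```

Seven steps at 90% each leave less than an even chance that everything goes right, which is exactly the kind of arithmetic that foolish optimism skips.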

And in Seeking Wisdom, Bevelin reminds us that “Overconfidence can cause unreal expectations and make us more vulnerable to disappointment.” A few sentences later he advises us to “focus on what can go wrong and the consequences.”

Build in some margin of safety in decisions. Know how you will handle things if they go wrong. Surprises occur in many unlikely ways. Ask: How can I be wrong? Who can tell me if I’m wrong?

Bias from Overconfidence is a Farnam Street mental model.

An Introduction to Complex Adaptive Systems

Let’s explore the concept of complex adaptive systems and see how this model might apply in various walks of life.

To illustrate what a complex adaptive system is, and just as importantly, what it is not, let’s take the example of a “driving system” – or as we usually refer to it, a car. (I have cribbed some parts of this example from the excellent book by John Miller and Scott Page.)

The interior of a car is, at first glance, complicated. There are seats, belts, buttons, levers, knobs, a wheel, and so on. Removing the passenger seats would make the system less complicated; however, it would remain essentially functional. Thus, we would not call the car interior complex.

The mechanical workings of a car, however, are complex. The system has interdependent components that must all simultaneously serve their function in order for the system to work. The higher-order function, driving, derives from the interaction of the parts in a very specific way.

Let’s say instead of the passenger seats, we remove the timing belt. Unlike the seats, the timing belt is a necessary node for the system to function properly. Our “driving system” is now useless. The system has complexities, but they are not what we would call adaptive.

To understand complex adaptive systems, let’s put hundreds of “driving systems” on the same road, each with the goal of reaching its destination within an expected amount of time. We call this traffic. Traffic is a complex system whose participants adapt to one another’s actions. Let’s see it in action.

***

On a popular route into a major city, we observe a car in flames on the side of the road, with firefighters working to put out the fire. Naturally, cars will slow to observe the wreck. As the first cars slow, the cars behind them slow in turn. The cars behind them must slow as well. With everyone becoming increasingly agitated, we’ve got a traffic jam. The jam emerges from the interaction of the parts of the system.

With the traffic jam formed, potential entrants to the jam—let’s call them Group #2—get on their smartphones and learn that there is an accident ahead which may take hours to clear. Upon learning of the accident, they predictably begin to adapt by finding another route. Suppose there is only one alternate route into the city. What happens now? The alternate route forms a second jam! (I’m stressed out just writing about this.)

Now let’s introduce a third group of participants, which must choose between jams. Predicting the actions of this third group is very hard to do. Perhaps so many people in group #2 have altered their route that the second jam is worse than the first, causing the majority of the third group to choose jam #1. Perhaps, anticipating that others will follow that same line of reasoning, they instead choose jam #2. Perhaps they stay at home!

What we see here are the emergent properties of a complex adaptive system called traffic. By the time we hit this third layer of participants, predicting the behavior of the system has become extremely difficult, if not impossible.
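
A toy simulation makes this instability concrete. In the sketch below (all names and numbers are illustrative, not a calibrated traffic model), each driver prefers whichever of two routes was less crowded yesterday, with a bit of noise:

```python
import random

random.seed(1)
drivers = 1000
last_counts = {"route_1": 500, "route_2": 500}  # yesterday's congestion

for day in range(6):
    counts = {"route_1": 0, "route_2": 0}
    for _ in range(drivers):
        # Each driver adapts to yesterday's congestion, with some noise.
        less_crowded = min(last_counts, key=last_counts.get)
        if random.random() < 0.8:
            choice = less_crowded
        else:
            choice = random.choice(list(counts))
        counts[choice] += 1
    print(f"day {day}: {counts}")
    last_counts = counts
```

Because everyone reacts to the same signal, yesterday’s empty route becomes today’s jam, and the system oscillates rather than settling. Each added layer of “perhaps they anticipate that I anticipate…” only makes prediction harder.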

The key element to complex adaptive systems is the social element. The belts and pulleys inside a car do not communicate with one another and adapt their behavior to the behavior of the other parts in an infinite loop. Drivers, on the other hand, do exactly that.

***

Where else do we see this phenomenon? The stock market is a great example. Instead of describing it myself, let’s use the words of John Maynard Keynes, who brilliantly related the nature of the market’s complex adaptive parts to that of a beauty contest in chapter 12 of The General Theory.

Or, to change the metaphor slightly, professional investment may be likened to those newspaper competitions in which the competitors have to pick out the six prettiest faces from a hundred photographs, the prize being awarded to the competitor whose choice most nearly corresponds to the average preferences of the competitors as a whole; so that each competitor has to pick, not those faces which he himself finds prettiest, but those which he thinks likeliest to catch the fancy of the other competitors, all of whom are looking at the problem from the same point of view. It is not a case of choosing those which, to the best of one’s judgment, are really the prettiest, nor even those which average opinion genuinely thinks the prettiest. We have reached the third degree where we devote our intelligences to anticipating what average opinion expects the average opinion to be. And there are some, I believe, who practice the fourth, fifth and higher degrees.

Like traffic, the complex, adaptive nature of the market is very clear. The participants in the market are interacting with one another constantly and adapting their behavior to what they know about others’ behavior. Stock prices jiggle all day long in this fashion. Forecasting outcomes in this system is extremely challenging.
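
Game theorists often compress Keynes’s contest into the “guess 2/3 of the average” game. A few lines of Python (a sketch, with the usual stylized assumptions) show what each extra “degree” of reasoning does:

```python
# Each player guesses a number in [0, 100]; the winner is closest to 2/3 of
# the average guess. A level-0 player anchors on the midpoint; each higher
# level assumes everyone else reasons exactly one level below.
guess = 50.0
for level in range(6):
    print(f"level {level}: guess {guess:.1f}")
    guess *= 2 / 3  # one more degree of reasoning shrinks the guess

# Fully iterated, guesses converge to 0; in real play most people stop at
# the second or third degree, just as Keynes observed.
```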

To illustrate, suppose that a very skilled, influential, and perhaps lucky market forecaster successfully calls a market crash. (There were a few in 2008, for example.) Five years later, he publicly calls for a second crash. Given his prescience in the prior crash, market participants might decide to sell their stocks rapidly, causing a crash for no other reason than that it was predicted! Like traffic reports on the radio, the very act of observing and predicting has a crucial impact on the behavior of the system.

Thus, although we know that over the long term, stock prices roughly track the value of their underlying businesses, in the short run almost anything can occur due to the highly adaptive nature of market participants.

***

This understanding also helps us identify things that are not complex adaptive systems. Take the local weather. If the Doppler 3000 forecast on the local news predicts rain on Thursday, is the rain any less likely to occur? No. The act of predicting has not influenced the outcome. Although near-term weather is extremely complex, with many interacting parts leading to higher-order outcomes, it does have an element of predictability.

On the other hand, we might call the Earth’s climate partially adaptive, due to the influence of human beings. (Have the cries of global warming and predictions of it worsening not begun affecting the very behavior causing the warming?)

Thus, behavioral dynamics indicate a key difference between weather and climate, and between systems that are simply complex and those that are also adaptive. Failure to use higher-order thinking when considering outcomes in complex adaptive systems is a common cause of overconfidence in prediction making.

***

Complex Adaptive Systems are part of the Farnam Street latticework of Mental Models.