
The Value of Probabilistic Thinking: Spies, Crime, and Lightning Strikes


Probabilistic thinking is essentially trying to estimate, using some tools of math and logic, the likelihood of any specific outcome coming to pass. It is one of the best tools we have to improve the accuracy of our decisions. In a world where each moment is determined by an infinitely complex set of factors, probabilistic thinking helps us identify the most likely outcomes. When we know these our decisions can be more precise and effective.

Are you going to get hit by lightning or not?

Why we need the concept of probabilities at all is worth thinking about. Things either are or are not, right? We either will get hit by lightning today or we won’t. The problem is, we just don’t know until we live out the day, which doesn’t help us at all when we make our decisions in the morning. The future is far from determined and we can better navigate it by understanding the likelihood of events that could impact us.

Our lack of perfect information about the world gives rise to all of probability theory, and its usefulness. We know now that the future is inherently unpredictable because not all variables can be known and even the smallest error imaginable in our data very quickly throws off our predictions. The best we can do is estimate the future by generating realistic, useful probabilities. So how do we do that?

Probability is everywhere, down to the very bones of the world. The probabilistic machinery in our minds—the cut-to-the-quick heuristics made so famous by the psychologists Daniel Kahneman and Amos Tversky—was evolved by the human species in a time before computers, factories, traffic, middle managers, and the stock market. It served us in a time when human life was about survival, and still serves us well in that capacity.

But what about today—a time when, for most of us, survival is not so much the issue? We want to thrive. We want to compete, and win. Mostly, we want to make good decisions in complex social systems that were not part of the world in which our brains evolved their (quite rational) heuristics.

For this, we need to consciously add in a needed layer of probability awareness. What is it and how can I use it to my advantage?

There are three important aspects of probability that we need to explain so you can integrate them into your thinking to get into the ballpark and improve your chances of catching the ball:

  1. Bayesian thinking
  2. Fat-tailed curves
  3. Asymmetries

Thomas Bayes and Bayesian thinking: Bayes was an English minister in the first half of the 18th century, whose most famous work, “An Essay Toward Solving a Problem in the Doctrine of Chances” was brought to the attention of the Royal Society by his friend Richard Price in 1763—two years after his death. The essay, the key to what we now know as Bayes’s Theorem, concerned how we should adjust probabilities when we encounter new data.

The core of Bayesian thinking (or Bayesian updating, as it can be called) is this: given that we have limited but useful information about the world, and are constantly encountering new information, we should probably take into account what we already know when we learn something new. As much of it as possible. Bayesian thinking allows us to use all relevant prior information in making decisions. Statisticians might call it a base rate, taking in outside information about past situations like the one you’re in.

Consider the headline “Violent Stabbings on the Rise.” Without Bayesian thinking, you might become genuinely afraid because your chances of being a victim of assault or murder are higher than they were a few months ago. But a Bayesian approach will have you putting this information into the context of what you already know about violent crime.

You know that violent crime has been declining to its lowest rates in decades. Your city is safer now than it has been since this measurement started. Let’s say your chance of being a victim of a stabbing last year was one in 10,000, or 0.01%. The article states, with accuracy, that violent crime has doubled. It is now two in 10,000, or 0.02%. Is that worth being terribly worried about? The prior information here is key. When we factor it in, we realize that our safety has not really been compromised.

Conversely, if we look at the diabetes statistics in the United States, our application of prior knowledge would lead us to a different conclusion. Here, a Bayesian analysis indicates you should be concerned. In 1958, 0.93% of the population was diagnosed with diabetes. In 2015 it was 7.4%. When you look at the intervening years, the climb in diabetes diagnosis is steady, not a spike. So the prior relevant data, or priors, indicate a trend that is worrisome.

It is important to remember that priors themselves are probability estimates. For each bit of prior knowledge, you are not putting it in a binary structure, saying it is true or not. You’re assigning it a probability of being true. Therefore, you can’t let your priors get in the way of processing new knowledge. In Bayesian terms, this is called the likelihood ratio or the Bayes factor. Any new information you encounter that challenges a prior simply means that the probability of that prior being true may be reduced. Eventually, some priors are replaced completely. This is an ongoing cycle of challenging and validating what you believe you know. When making uncertain decisions, it’s nearly always a mistake not to ask: What are the relevant priors? What might I already know that I can use to better understand the reality of the situation?
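To make the updating mechanics concrete, here is a minimal sketch in Python of a single Bayesian update in odds form; the 30% prior and the likelihood ratio of 3 are invented purely for illustration.

    # A minimal sketch of one Bayesian update, using made-up numbers.
    # Odds form of Bayes's Theorem: prior_odds * likelihood_ratio = posterior_odds.

    def bayes_update(prior, likelihood_ratio):
        """Return the posterior probability after one piece of evidence.

        prior            -- probability the belief is true before the new evidence
        likelihood_ratio -- how much more likely the evidence is if the belief is
                            true than if it is false (the Bayes factor)
        """
        prior_odds = prior / (1 - prior)
        posterior_odds = prior_odds * likelihood_ratio
        return posterior_odds / (1 + posterior_odds)

    # Hypothetical example: you are 30% sure a claim is true, then see evidence
    # that is three times as likely if the claim is true than if it is false.
    print(bayes_update(0.30, 3.0))  # ~0.56: the prior is revised, not discarded

The shape of the calculation is the point: new evidence scales what you already believed rather than replacing it.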

Now we need to look at fat-tailed curves: Many of us are familiar with the bell curve, that nice, symmetrical wave that captures the relative frequency of so many things from height to exam scores. The bell curve is great because it’s easy to understand and easy to use. Its technical name is “normal distribution.” If we know we are in a bell curve situation, we can quickly identify our parameters and plan for the most likely outcomes.

Fat-tailed curves are different. Take a look.

[Figure: a normal (bell curve) distribution compared with a fat-tailed distribution.]

At first glance they seem similar enough. Common outcomes cluster together, creating a wave. The difference is in the tails. In a bell curve the extremes are predictable. There can only be so much deviation from the mean. In a fat-tailed curve there is no real cap on extreme events.

The more extreme events that are possible, the longer the tails of the curve get. Any one extreme event is still unlikely, but the sheer number of options means that we can’t rely on the most common outcomes as representing the average. The more extreme events that are possible, the higher the probability that one of them will occur. Crazy things are definitely going to happen, and we have no way of identifying when.

Think of it this way. In a bell curve type of situation, like displaying the distribution of height or weight in a human population, there are outliers on the spectrum of possibility, but the outliers have a fairly well defined scope. You’ll never meet a man who is ten times the size of an average man. But in a curve with fat tails, like wealth, the central tendency does not work the same way. You may regularly meet people who are ten, 100, or 10,000 times wealthier than the average person. That is a very different type of world.
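A rough simulation makes the difference in the tails visible. In the Python sketch below, the distributions and parameters are arbitrary stand-ins, not empirical models: a normal distribution plays the role of “height-like” data and a Pareto distribution the role of “wealth-like” data.

    import random

    # Thin tails vs. fat tails; the parameters are arbitrary illustrations.
    random.seed(42)
    N = 100_000

    # "Height-like": normally distributed around 175 cm with a 7 cm spread.
    heights = [random.gauss(175, 7) for _ in range(N)]

    # "Wealth-like": Pareto-distributed (shape 1.2), a classic fat-tailed model.
    wealth = [random.paretovariate(1.2) for _ in range(N)]

    for name, xs in [("height", heights), ("wealth", wealth)]:
        mean = sum(xs) / len(xs)
        print(f"{name}: max is {max(xs) / mean:.1f}x the mean")

    # Typical output: the tallest "person" is only about 1.2x the mean height,
    # while the largest "wealth" draw is hundreds or thousands of times the mean.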

Let’s re-approach the example of the risks of violence we discussed in relation to Bayesian thinking. Suppose you hear that you have a greater risk of slipping on the stairs and cracking your head open than of being killed by a terrorist. The statistics, the priors, seem to back it up: 1,000 people slipped on the stairs and died last year in your country and only 500 died of terrorism. Should you be more worried about stairs or terror events?

Some use examples like these to prove that terror risk is low—since the recent past shows very few deaths, why worry?[1] The problem is in the fat tails: The risk of terror violence is more like wealth, while stair-slipping deaths are more like height and weight. In the next ten years, how many events are possible? How fat is the tail?

The important thing is not to sit down and imagine every possible scenario in the tail (by definition, it is impossible) but to deal with fat-tailed domains in the correct way: by positioning ourselves to survive or even benefit from the wildly unpredictable future, by being the only ones thinking correctly and planning for a world we don’t fully understand.

Asymmetries: Finally, you need to think about something we might call “metaprobability”—the probability that your probability estimates themselves are any good.

This massively misunderstood concept has to do with asymmetries. If you look at nicely polished stock pitches made by professional investors, nearly every time an idea is presented, the investor looks their audience in the eye and states they think they’re going to achieve a rate of return of 20% to 40% per annum, if not higher. Yet exceedingly few of them ever attain that mark, and it’s not because they don’t have any winners. It’s because they get so many so wrong. They consistently overestimate their confidence in their probabilistic estimates. (For reference, the general stock market has returned no more than 7% to 8% per annum in the United States over a long period, before fees.)

Another common asymmetry is people’s ability to estimate the effect of traffic on travel time. How often do you leave “on time” and arrive 20% early? Almost never? How often do you leave “on time” and arrive 20% late? All the time? Exactly. Your estimation errors are asymmetric, skewing in a single direction. This is often the case with probabilistic decision-making.[2]
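A toy simulation, with invented numbers, shows where the skew comes from: delays can pile up without bound, while the time you can save by good luck is capped. The 30-minute plan, the lognormal delay, and the capped saving below are all assumptions made purely for illustration.

    import random

    # A toy model of travel time; every number here is invented.
    # Planned trip: 30 minutes. Delays (traffic, lights) are lognormally
    # distributed, so they are usually a few minutes but occasionally much
    # longer; savings from good luck are capped at a few minutes.
    random.seed(1)
    PLAN = 30.0
    trips = []
    for _ in range(10_000):
        delay = random.lognormvariate(1, 1)   # usually small, sometimes large
        saving = random.uniform(0, 3)         # you can only gain a little
        trips.append(PLAN + delay - saving)

    early = sum(t < PLAN * 0.8 for t in trips) / len(trips)  # 20%+ early
    late = sum(t > PLAN * 1.2 for t in trips) / len(trips)   # 20%+ late
    print(f"20% early: {early:.1%}, 20% late: {late:.1%}")
    # Typical output: essentially never 20% early, but 20% late a good
    # fraction of the time -- the estimation error skews one way.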

Far more probability estimates are wrong on the “over-optimistic” side than the “under-optimistic” side. You’ll rarely read about an investor who aimed for 25% annual return rates who subsequently earned 40% over a long period of time. You can throw a dart at the Wall Street Journal and hit the names of lots of investors who aim for 25% per annum with each investment and end up closer to 10%.

The spy world

Successful spies are very good at probabilistic thinking. High-stakes survival situations tend to make us evaluate our environment with as little bias as possible.

When Vera Atkins was second in command of the French unit of the Special Operations Executive (SOE), a British intelligence organization reporting directly to Winston Churchill during World War II[3], she had to make hundreds of decisions by figuring out the probable accuracy of inherently unreliable information.

Atkins was responsible for the recruitment and deployment of British agents into occupied France. She had to decide who could do the job, and where the best sources of intelligence were. These were literal life-and-death decisions, and all were based in probabilistic thinking.

First, how do you choose a spy? Not everyone can go undercover in high-stress situations and make the contacts necessary to gather intelligence. The result of failure in France in WWII was not getting fired; it was death. What factors of personality and experience show that a person is right for the job? Even today, with advancements in psychology, interrogation, and polygraphs, it’s still a judgment call.

For Vera Atkins in the 1940s, it was very much a process of assigning weight to the various factors and coming up with a probabilistic assessment of who had a decent chance of success. Who spoke French? Who had the confidence? Who was too tied to family? Who had the problem-solving capabilities? From recruitment to deployment, her development of each spy was a series of continually updated, educated estimates.

Getting an intelligence officer ready to go is only half the battle. Where do you send them? If your information was so great that you knew exactly where to go, you probably wouldn’t need an intelligence mission. Choosing a target is another exercise in probabilistic thinking. You need to evaluate the reliability of the information you have and the networks you have set up. Intelligence is not evidence. There is no chain of command or guarantee of authenticity.

The stuff coming out of German-occupied France was at the level of grainy photographs, handwritten notes that passed through many hands on the way back to HQ, and unverifiable wireless messages sent quickly, sometimes sporadically, and with the operator under incredible stress. When deciding what to use, Atkins had to consider the relevancy, quality, and timeliness of the information she had.

She also had to make decisions based not only on what had happened, but what possibly could. Trying to prepare for every eventuality means that spies would never leave home, but they must somehow prepare for a good deal of the unexpected. After all, their jobs are often executed in highly volatile, dynamic environments.

The women and men Atkins sent over to France worked in three primary occupations: organizers were responsible for recruiting locals, developing the network, and identifying sabotage targets; couriers moved information all around the country, connecting people and networks to coordinate activities; and wireless operators had to set up heavy communications equipment, disguise it, get information out of the country, and be ready to move at a moment’s notice. All of these jobs were dangerous. The full scope of the threats was never completely identifiable. There were so many things that could go wrong, so many possibilities for discovery or betrayal, that it was impossible to plan for them all. The average life expectancy in France for one of Atkins’ wireless operators was six weeks.

Finally, the numbers suggest an asymmetry in the estimation of the probability of success of each individual agent. Of the 400 agents that Atkins sent over to France, 100 were captured and killed. This is not meant to pass judgment on her skills or smarts. Probabilistic thinking can only get you in the ballpark. It doesn’t guarantee 100% success.

There is no doubt that Atkins relied heavily on probabilistic thinking to guide her decisions in the challenging quest to disrupt German operations in France during World War II. It is hard to evaluate the success of an espionage career, because it is a job that comes with a lot of loss. Atkins was extremely successful in that her network conducted valuable sabotage to support the allied cause during the war, but the loss of life was significant.

Conclusion

Successfully thinking in shades of probability means roughly identifying what matters, coming up with a sense of the odds, doing a check on our assumptions, and then making a decision. We can act with a higher level of certainty in complex, unpredictable situations. We can never know the future with exact precision. Probabilistic thinking is an extremely useful tool to evaluate how the world will most likely look so that we can effectively strategize.


References:

[1] Taleb, Nassim Nicholas. Antifragile. New York: Random House, 2012.

[2] Bernstein, Peter L. Against the Gods: The Remarkable Story of Risk. New York: John Wiley and Sons, 1996. (This book includes an excellent discussion in Chapter 13 on the idea of the scope of events in the past as relevant to figuring out the probability of events in the future, drawing on the work of Frank Knight and John Maynard Keynes.)

[3] Helm, Sarah. A Life in Secrets: The Story of Vera Atkins and the Lost Agents of SOE. London: Abacus, 2005.

Zero — Invented or Discovered?

It seems almost a bizarre question. Who thinks about whether zero was invented or discovered? And why is it important?

Answering this question, however, can tell you a lot about yourself and how you see the world.

Let’s break it down.

“Invented” implies that humans created the zero and that without us, the zero and its properties would cease to exist.

“Discovered” means that although the symbol is a human creation, what it represents would exist independently of any human ability to label it.

So do you think of the zero as a purely mathematical function, and by extension think of all math as a human construct like, say, cheese or self-driving cars? Or is math, and the zero, a symbolic language that describes the world, the content of which exists completely independently of our descriptions?

The zero is now a ubiquitous component of our understanding.

The concept is so basic it is routinely mastered by the pre-kindergarten set. Consider the equation 3-3=0. Nothing complicated about that. It is second nature to us that we can represent “nothing” with a symbol. It makes perfect sense now, in 2017, and it’s so common that we forget that zero was a relatively late addition to the number scale.

Here’s a fact that’s amazing to most people: the zero is actually younger than mathematics. Pythagoras’s famous conclusion — that in a right-angled triangle, the square of the hypotenuse is equal to the sum of the squares of the other two sides — was achieved without a zero. As was Euclid’s entire Elements.

How could this be? It seems surreal, given the importance the zero now has to mathematics, computing, language, and life. How could someone figure out the complex geometry of triangles, yet not realize that nothing was also a number?

Tobias Dantzig, in Number: The Language of Science, offers this as a possible explanation: “The concrete mind of the ancient Greeks could not conceive the void as a number, let alone endow the void with a symbol.” This gives us a good direction for finding the answer to the original question because it hints that you must first understand the concept of the void before you can name it. You need to see that nothingness still takes up space.

It was thought, and sometimes still is, that the number zero was invented in the pursuit of ancient commerce. Something was needed as a placeholder; otherwise, 65 would be indistinguishable from 605 or 6050. The zero represents “no units” of the particular place that it holds. So for that last number, we have six thousands, no hundreds, five tens, and no singles.
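Written out as a place-value expansion (in LaTeX notation), the role of each zero is explicit:

    6050 = 6 \times 10^3 + 0 \times 10^2 + 5 \times 10^1 + 0 \times 10^0

Drop the zeros and the remaining digits 6 and 5 could just as well mean 65 or 650.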

A happy accident of no great original insight, zero then made its way around the world. In addition to being convenient for keeping track of how many bags of grain you were owed, or how many soldiers were in your army, it turned our number scale into an extremely efficient decimal system. More so than any numbering system that preceded it (and there were many), the zero transformed the power of our other numerals, propelling mathematics into fantastic equations that can explain our world and fuel incredible scientific and technological advances.

But there is, if you look closely, a missing link in this story.

What changed in humanity that made us comfortable with confronting the void and giving it a symbol? And is it reasonable to imagine creating the number without understanding what it represented? Given its properties, can we really think that it started as a placeholder? Or did it contain within it, right from the beginning, the notion of defining the void, of giving it space?

In Finding Zero, Amir Aczel offers some insight. Basically, he claims that the people who discovered the zero must have had an appreciation of the emptiness that it represented. They were labeling a concept with which they were already familiar.

He rediscovered the oldest known zero, on a stone tablet dating from 683 CE in what is now Cambodia.

On his quest to find this zero, Aczel realized that it was far more natural for the zero to first appear in the Far East, rather than in Western or Arab cultures, due to the philosophical and religious understandings prevalent in the region.

Western society was, and still is in many ways, a binary culture. Good and evil. Mind and body. You’re either with us or against us. A patriot or a terrorist. Many of us naturally try to fit our world into these binary understandings. If something is “A,” then it cannot be “not A.” The very definition of “A” is that it is not “not A.” Something cannot be both.

Aczel writes that this duality is not at all reflected in much Eastern thought. He describes the catuskoti, found in early Buddhist logic, that presents four possibilities, instead of two, for any state: that something is, is not, is both, or is neither.

At first, a typical Western mind might rebel against this kind of logic. My father is either bald or not bald. He cannot be both and he cannot be neither, so what is the use of these two other almost nonsensical options?

A closer examination of our language, though, reveals that the expression of the non-binary is understood, and therefore perhaps more relevant than we think. Take, for example, “you’re either with us or against us.” Is it possible to say “I’m both with you and against you”? Yes. It could mean that you are for the principles but against the tactics. Or that you are supportive even though it runs contrary to your values. And to say “I’m neither with you nor against you” could mean that you aren’t supportive of the tactic in question, but won’t do anything to stop it. Or that you just don’t care.

Feelings, in particular, are a realm where the binary is often insufficient. Watching my children, I know that it’s possible to be both happy and sad, a traditional binary, at the same time. And the zero itself defies binary categorization. It is something and nothing simultaneously.

Aczel reflects on a conversation he had with a Buddhist monk. “Everything is not everything — there is always something that lies outside of what you may think covers all creation. It could be a thought, or a kind of void, or a divine aspect. Nothing contains everything inside it.”

He goes on to conclude that “Here was the intellectual source of the number zero. It came from Buddhist meditation. Only this deep introspection could equate absolute nothingness with a number that had not existed until the emergence of this idea.”

Which is to say, certain properties of the zero likely were understood conceptually before the symbol came about — nothingness was a thing that could be represented. This idea fits with how we treat the zero today; it may represent nothing, but that nothing still has properties. And investigating those properties demonstrates that there is power in the void — it has something to teach us about how our universe operates.

Further contemplation might illuminate that the zero has something to teach us about existence as well. If we accept zero, the symbol, as being discovered as part of our realization about the existence of nothingness, then trying to understand the zero can teach us a lot about moving beyond the binary of alive/not alive to explore other ways of conceptualizing what it means to be.

In Pursuit of the Unknown: 17 Equations That Changed the World


Equations are the lifeblood of mathematics, science, and technology. Without them, our world would not exist in its present form. However, equations have a reputation for being scary: Stephen Hawking’s publishers told him that every equation would halve the sales of A Brief History of Time.

Ignoring the advice, Hawking included E = mc² even ‘when cutting it out would have sold another 10 million copies.’ This captures our aversion to equations well. Yet, mathematician Ian Stewart argues in his book In Pursuit of the Unknown: 17 Equations That Changed the World, “[e]quations are too important to be hidden away.”

Equations are a vital part of this world. And you don’t need to be a rocket scientist to appreciate them.

There are two kinds of equations in mathematics. Stewart writes:

One kind presents relations between various mathematical quantities: the task is to prove the equation is true. The other kind provides information about an unknown quantity, and the mathematician’s task is to solve it – to make the unknown known. The distinction is not clear-cut, because sometimes the same equation can be used in both ways, but it’s a useful guideline.


An example of the first kind of equation is Pythagoras’s theorem, which is “an equation expressed in the language of geometry.” If you accept Euclid’s basic geometric assumptions, then Pythagoras’s theorem must be true.

In the famous translation by Sir Thomas Heath, proposition 47 (Pythagoras’s theorem) of Book I reads:

In right-angled triangles the square on the side subtending the right angle is equal to the squares on the sides containing the right angle.

Many triangles in real life are not right-angled. But this does not limit the use of the equation a² + b² = c², because any triangle can be cut into two right-angled ones.


So understanding right-angled triangles is the key because “they prove that there is a useful relation between the shape of a triangle and the lengths of its sides.”
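One way to make that cut concrete: take a general triangle with sides a, b, and c (this time not assumed right-angled) and drop a perpendicular of height h onto side c, splitting it into segments p and q. Pythagoras’s theorem then applies to each half; in LaTeX notation, the standard construction reads:

    a^2 = h^2 + p^2, \qquad b^2 = h^2 + q^2, \qquad c = p + q

Everything about the original triangle can then be worked out from the two right-angled pieces, which is why the right-angled case carries all the information.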

A good example of the second kind of equation is Newton’s law of gravity. “It tells us how the attractive force between two bodies depends on their masses,” Stewart writes, “and how far apart they are. Solving the resulting equations tells us how the planets orbit the Sun, or how to design a trajectory for a space probe.” This isn’t a mathematical theorem; rather, it’s true for physical reasons: it fits the observations.
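In the usual modern notation, the law reads:

    F = G \, \frac{m_1 m_2}{r^2}

where m_1 and m_2 are the two masses, r is the distance between them, and G is the gravitational constant.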

Einstein’s general theory of relativity improves on Newton by fitting some observations better, while not messing up those where we already know Newton’s law does a good job.

Equations, as simple as they appear, have redirected human history time and time again. “An equation derives its power from a simple source,” Stewart writes, “it tells us that two calculations, which appear different, have the same answer.”

The power of equations lies in the philosophically difficult correspondence between mathematics, a collective creation of human minds, and an external physical reality. Equations model deep patterns in the outside world. By learning to value equations, and to read the stories they tell, we can uncover vital features of the world around us. This is the story of the ascent of humanity, told in 17 equations.

Pythagoras’s theorem

a² + b² = c²

The first equation presented in the book.

What does it tell us?
How the three sides of a right-angled triangle are related.
Why is that important?
It provides a vital link between geometry and algebra, allowing us to calculate distances in terms of coordinates. It also inspired trigonometry.
What did it lead to?
Surveying, navigation, and more recently special and general relativity – the best current theories of space, time, and gravity.

History.

The Greeks did not express Pythagoras’s theorem as an equation in the modern symbolic sense. That came later with the development of algebra. In ancient times, the theorem was expressed verbally and geometrically. It attained its most polished form, and its first recorded proof, in the writings of Euclid of Alexandria. Around 250 BC Euclid became the first modern mathematician when he wrote his famous Elements, the most influential mathematical textbook ever. Euclid turned geometry into logic by making his basic assumptions explicit and invoking them to give systematic proofs for all of his theorems. He built a conceptual tower whose foundations were points, lines, and circles, and whose pinnacle was the existence of precisely five regular solids.

For the purposes of higher mathematics, the Greeks worked with lines and areas instead of numbers. So Pythagoras and his Greek successors would decode the theorem as an equality of areas: ‘The area of a square constructed using the longest side of a right-angled triangle is the sum of the areas of the squares formed from the other two sides.’

Maps

Surveying began to take off in 1533 when the Dutch mapmaker Gemma Frisius explained how to use trigonometry to produce accurate maps, in Libellus de Locorum Describendorum Ratione (‘Booklet Concerning a Way of Describing Places’). Word of the method spread across Europe, reaching the ears of the Danish nobleman and astronomer Tycho Brahe. In 1579 Tycho used it to make an accurate map of Hven, the island where his observatory was located.

By 1615 the Dutch mathematician Willebrord Snellius (Snel van Royen) had developed the method into essentially its modern form: triangulation. The area being surveyed is covered with a network of triangles. By measuring one initial length very carefully, and many angles, the locations of the corners of the triangles, and hence any interesting features within them, can be calculated. Snellius worked out the distance between two Dutch towns, Alkmaar and Bergen op Zoom, using a network of 33 triangles. He chose these towns because they lay on the same line of longitude and were exactly one degree of arc apart. Knowing the distance between them, he could work out the size of the Earth, which he published in his Eratosthenes Batavus (‘The Dutch Eratosthenes’) in 1617. His result is accurate to within 4%. He also modified the equations of trigonometry to reflect the spherical nature of the Earth’s surface, an important step towards effective navigation.

Triangulation is an indirect way of determining distance by employing angles.

When surveying a stretch of land, be it a building site or a country, the main practical consideration is that it is much easier to measure angles than it is to measure distances. Triangulation lets us measure a few distances and lots of angles; then everything else follows from the trigonometric equations. The method begins by setting out one line between two points, called the baseline, and measuring its length directly to very high accuracy. Then we choose a prominent point in the landscape that is visible from both ends of the baseline, and measure the angle from the baseline to that point, at both ends of the baseline. Now we have a triangle, and we know one side of it and two angles, which fix its shape and size. We can then use trigonometry to work out the other two sides.

In effect, we now have two more baselines: the newly calculated sides of the triangle. From those, we can measure angles to other, more distant points. Continue this process to create a network of triangles that covers the area being surveyed. Within each triangle, observe the angles to all noteworthy features – church towers, crossroads, and so on. The same trigonometric trick pinpoints their precise locations. As a final twist, the accuracy of the entire survey can be checked by measuring one of the final sides directly.
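Here is a minimal sketch in Python of a single triangulation step using the law of sines; the baseline length and the two measured angles are invented for illustration.

    import math

    def locate(baseline, angle_a_deg, angle_b_deg):
        """One triangulation step with invented numbers.

        baseline     -- measured distance between points A and B
                        (A at the origin, B at (baseline, 0))
        angle_a_deg  -- angle at A between the baseline and the landmark
        angle_b_deg  -- angle at B between the baseline and the landmark
        Returns the (x, y) coordinates of the landmark.
        """
        A = math.radians(angle_a_deg)
        B = math.radians(angle_b_deg)
        C = math.pi - A - B        # the three angles of a triangle sum to 180 degrees
        # Law of sines: each side divided by the sine of the opposite angle is equal.
        side_from_a = baseline * math.sin(B) / math.sin(C)   # distance A -> landmark
        x = side_from_a * math.cos(A)
        y = side_from_a * math.sin(A)
        return x, y

    # Hypothetical survey: a 1000 m baseline, angles of 58 and 71 degrees.
    print(locate(1000.0, 58.0, 71.0))

Only the baseline is measured as a distance; everything else comes from angles, which is exactly why the method suited surveyors.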

Surveys routinely employed triangulation by the late 18th century. And while we don’t explicitly use it today, it is still there in how we deduce locations from satellite data.

In Pursuit of the Unknown: 17 Equations That Changed the World is an elegant argument for why equations matter.

