
12 Life Lessons From Mathematician and Philosopher Gian-Carlo Rota

The mathematician and philosopher Gian-Carlo Rota spent much of his career at MIT, where students adored him for his engaging, passionate lectures. In 1996, Rota gave a talk entitled “Ten Lessons I Wish I Had Been Taught,” which contains valuable advice for making people pay attention to your ideas.

Many mathematicians regard Rota as single-handedly responsible for turning combinatorics into a significant field of study. He specialized in functional analysis, probability theory, phenomenology, and combinatorics. The talk was later printed in his book, Indiscrete Thoughts.

Rota began by explaining that the advice we give others is always the advice we need to follow most. Seeing as it was too late for him to follow certain lessons, he decided he would share them with the audience. Here, we summarize twelve insights from Rota’s talk—which are fascinating and practical, even if you’re not a mathematician.

***

Every lecture should make only one point

“Every lecture should state one main point and repeat it over and over, like a theme with variations. An audience is like a herd of cows, moving slowly in the direction they are being driven towards.”

When we wish to communicate with people—in an article, an email to a coworker, a presentation, a text to a partner, and so on—it’s often best to stick to making one point at a time. This matters all the more so if we’re trying to get our ideas across to a large audience.

If we make one point well enough, we can be optimistic about people understanding and remembering it. But if we try to fit too much in, “the cows will scatter all over the field. The audience will lose interest and everyone will go back to the thoughts they interrupted in order to come to our lecture.”

***

Never run over time

“After fifty minutes (one microcentury as von Neumann used to say), everybody’s attention will turn elsewhere even if we are trying to prove the Riemann hypothesis. One minute over time can destroy the best of lectures.”

Rota considered running over the allotted time slot to be the worst thing a lecturer could do. Our attention spans are finite. After a certain point, we stop taking in new information.

In your work, it’s important to respect the time and attention of others. Put in the extra work required for brevity and clarity. Don’t expect them to find what you have to say as interesting as you do. Condensing and compressing your ideas both ensures you truly understand them and makes them easier for others to remember.

***

Relate to your audience

“As you enter the lecture hall, try to spot someone in the audience whose work you have some familiarity with. Quickly rearrange your presentation so as to manage to mention some of that person’s work.”

Reciprocity is remarkably persuasive. Sometimes, how people respond to your work has as much to do with how you respond to theirs as it does with the work itself. If you want people to pay attention to your work, always give before you take and pay attention to theirs first. Show that you see them and appreciate them. Rota explains that “everyone in the audience has come to listen to your lecture with the secret hope of hearing their work mentioned.”

The less acknowledgment someone’s work has received, the more of an impact your attention is likely to have. A small act of encouragement can be enough to deter someone from quitting. With characteristic humor, Rota recounts:

“I have always felt miffed after reading a paper in which I felt I was not being given proper credit, and it is safe to conjecture that the same happens to everyone else. One day I tried an experiment. After writing a rather long paper, I began to draft a thorough bibliography. On the spur of the moment I decided to cite a few papers which had nothing whatsoever to do with the content of my paper to see what might happen.

Somewhat to my surprise, I received letters from two of the authors whose papers I believed were irrelevant to my article. Both letters were written in an emotionally charged tone. Each of the authors warmly congratulated me for being the first to acknowledge their contribution to the field.”

***

Give people something to take home

“I often meet, in airports, in the street, and occasionally in embarrassing situations, MIT alumni who have taken one or more courses from me. Most of the time they admit that they have forgotten the subject of the course and all the mathematics I thought I had taught them. However, they will gladly recall some joke, some anecdote, some quirk, some side remark, or some mistake I made.”

When we have a conversation, read a book, or listen to a talk, the sad fact is that we are unlikely to remember much of it even a few hours later, let alone years after the event. Even if we enjoyed and valued it, only a small part will stick in our memory.

So when you’re communicating with people, try to be conscious about giving them something to take home. Choose a memorable line or idea, create a visual image, or use humor in your work.

For example, in The Righteous Mind, Jonathan Haidt repeats many times that the mind is like a tiny rider on a gigantic elephant. The rider represents controlled mental processes, while the elephant represents automatic ones. It’s a distinctive image, one readers are quite likely to take home with them.

***

Make sure the blackboard is spotless

“By starting with a spotless blackboard, you will subtly convey the impression that the lecture they are about to hear is equally spotless.”

Presentation matters. The way our work looks influences how people perceive it. Taking the time to clean our equivalent of a blackboard signals that we care about what we’re doing and consider it important.

In “How To Spot Bad Science,” we noted that one possible sign of bad science is that the research is presented in a thoughtless, messy way. Most researchers who take their work seriously will put in the extra effort to ensure it’s well presented.

***

Make it easy for people to take notes

“What we write on the blackboard should correspond to what we want an attentive listener to take down in his notebook. It is preferable to write slowly and in a large handwriting, with no abbreviations. Those members of the audience who are taking notes are doing us a favor, and it is up to us to help them with their copying.”

If a lecturer is using slides with writing on them instead of a blackboard, Rota adds that they should give people time to take notes. This might mean repeating themselves in a few different ways so each slide takes longer to explain (which ties in with the idea that every lecture should make only one point). Moving too fast with the expectation that people will look at the slides again later is “wishful thinking.”

When we present our work to people, we should make it simple for them to understand our ideas on the spot. We shouldn’t expect them to revisit it later. They might forget. And even if they don’t, we won’t be there to answer questions, take feedback, and clear up any misunderstandings.

***

Share the same work multiple times

Rota learned this lesson when he bought Collected Papers, a volume compiling the publications of mathematician Frederic Riesz. He noted that “the editors had gone out of their way to publish every little scrap Riesz had ever published.” Putting them all in one place revealed that he had published the same ideas multiple times:

“Riesz would publish the first rough version of an idea in some obscure Hungarian journal. A few years later, he would send a series of notes to the French Academy’s Comptes Rendus in which the same material was further elaborated. A few more years would pass, and he would publish the definitive paper, either in French or in English.”

Riesz would also develop his ideas while lecturing. Explaining the same subject again and again for years allowed him to keep improving it until he was ready to publish. Rota notes, “No wonder the final version was perfect.”

In our work, we might feel as if we need to have fresh ideas all of the time and that anything we share with others needs to be a finished product. But sometimes we can do our best work through an iterative process.

For example, a writer might start by sharing an idea as a tweet. This gets a good response, and the replies help them expand it into a blog post. From there they keep reworking the post over several years, making it longer and more definite each time. They give a talk on the topic. Eventually, it becomes a book.

Award-winning comedian Chris Rock prepares for global tours by performing dozens of times in small venues for a handful of people. Each performance is an experiment to see which jokes land, which ones don’t, and which need tweaking. By the time he’s performed a routine forty or fifty times, making it better and better, he’s ready to share it with huge audiences.

Another reason to share the same work multiple times is that different people will see it each time and understand it in different ways:

“The mathematical community is split into small groups, each one with its own customs, notation, and terminology. It may soon be indispensable to present the same result in several versions, each one accessible to a specific group; the price one might have to pay otherwise is to have our work rediscovered by someone who uses a different language and notation, and who will rightly claim it as his own.”

Sharing your work multiple times thus has two benefits. The first is that the feedback allows you to improve and refine your work. The second is that you increase the chance of your work being definitively associated with you. If the core ideas are strong enough, they’ll shine through even in the initial incomplete versions.

***

You are more likely to be remembered for your expository work

“Allow me to digress with a personal reminiscence. I sometimes publish in a branch of philosophy called phenomenology. . . . It so happens that the fundamental treatises of phenomenology are written in thick, heavy philosophical German. Tradition demands that no examples ever be given of what one is talking about. One day I decided, not without serious misgivings, to publish a paper that was essentially an updating of some paragraphs from a book by Edmund Husserl, with a few examples added. While I was waiting for the worst at the next meeting of the Society for Phenomenology and Existential Philosophy, a prominent phenomenologist rushed towards me with a smile on his face. He was full of praise for my paper, and he strongly encouraged me to further develop the novel and original ideas presented in it.”

Rota realized that many of the mathematicians he admired the most were known more for their work explaining and building upon existing knowledge, as opposed to their entirely original work. Their extensive knowledge of their domain meant they could expand a little beyond their core specialization and synthesize charted territory.

For example, David Hilbert was best known for a textbook on integral equations which was “in large part expository, leaning on the work of Hellinger and several other mathematicians whose names are now forgotten.” William Feller was known for an influential treatise on probability, with few recalling his original work in convex geometry.

One of our core goals at Farnam Street is to share the best of what other people have already figured out. We all want to make original and creative contributions to the world. But the best ideas that are already out there are quite often much more useful than what we can contribute from scratch.

We should never be afraid to stand on the shoulders of giants.

***

Every mathematician has only a few tricks

“. . . mathematicians, even the very best, also rely on a few tricks which they use over and over.”

Upon reading the complete works of certain influential mathematicians, such as David Hilbert, Rota realized that they always used the same tricks again and again.

We don’t need to be amazing at everything to do high-quality work. The smartest and most successful people are often only good at a few things—or even one thing. Their secret is that they maximize those strengths and don’t get distracted. They define their circle of competence and don’t attempt things they’re not good at if there’s any room to double down further on what’s already going well.

It might seem as if this lesson contradicts the previous one (you are more likely to be remembered for your expository work), but there’s a key difference. If you’ve hit diminishing returns with improvements to what’s already inside your circle of competence, it makes sense to experiment with things you already have an aptitude for (or strongly suspect you might) but haven’t yet made your focus.

***

Don’t worry about small mistakes

“Once more let me begin with Hilbert. When the Germans were planning to publish Hilbert’s collected papers and to present him with a set on the occasion of one of his later birthdays, they realized that they could not publish the papers in their original versions because they were full of errors, some of them quite serious. Thereupon they hired a young unemployed mathematician, Olga Taussky-Todd, to go over Hilbert’s papers and correct all mistakes. Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. . . . At last, on Hilbert’s birthday, a freshly printed set of Hilbert’s collected papers was presented to the Geheimrat. Hilbert leafed through them carefully and did not notice anything.”

Rota goes on to say: “There are two kinds of mistakes. There are fatal mistakes that destroy a theory; but there are also contingent ones, which are useful in testing the stability of a theory.”

Mistakes are either contingent or fatal. Contingent mistakes don’t completely ruin what you’re working on; fatal ones do. Building in a margin of safety (such as having a bit more time or funding than you expect to need) turns many fatal mistakes into contingent ones.

Contingent mistakes can even be useful. When details change, but the underlying theory is still sound, you know which details not to sweat.

***

Use Feynman’s method for solving problems

“Richard Feynman was fond of giving the following advice on how to be a genius. You have to keep a dozen of your favorite problems constantly present in your mind, although by and large they will lay in a dormant state. Every time you hear or read a new trick or a new result, test it against each of your twelve problems to see whether it helps. Every once in a while there will be a hit, and people will say: ‘How did he do it? He must be a genius!’”

***

Write informative introductions

“Nowadays, reading a mathematics paper from top to bottom is a rare event. If we wish our paper to be read, we had better provide our prospective readers with strong motivation to do so. A lengthy introduction, summarizing the history of the subject, giving everybody his due, and perhaps enticingly outlining the content of the paper in a discursive manner, will go some of the way towards getting us a couple of readers.”

As with the lesson about never running over time, respect that people have limited time and attention. Introductions are all about explaining what a piece of work is going to be about, what its purpose is, and why someone should be interested in it.

A job posting is an introduction to a company. The description on a calendar invite to a meeting is an introduction to that meeting. An about page is an introduction to an author. The subject line on a cold email is an introduction to that message. A course curriculum is an introduction to a class.

Putting extra effort into our introductions will help other people make an accurate assessment of whether they want to engage with the full thing. It will prime their minds for what to expect and answer some of their questions.

***

If you’re interested in learning more, check out Rota’s “10 Lessons of an MIT Education.”

Common Probability Errors to Avoid

If you’re trying to gain a rapid understanding of a new area, one of the most important things you can do is to identify common mistakes people make, then avoid them. Here are some of the most predictable errors we tend to make when thinking about statistics.

Amateurs tend to focus on seeking brilliance. Professionals often know that it’s far more effective to avoid stupidity. Side-stepping typical blunders is the simplest way to get ahead of the crowd.

Gaining a better understanding of probability will give you a more accurate picture of the world and help you make better decisions. However, many people fall prey to the same handful of issues because aspects of probability go against what we think is intuitive. Even if you haven’t studied the topic since high school, you likely use probability assessments every single day in your work and life.

In Naked Statistics, Charles Wheelan takes the reader on a whistle-stop tour of the basics of statistics. In one chapter, he offers pointers for avoiding some of the “most common probability-related errors, misunderstandings, and ethical dilemmas.” Whether you’re somewhat new to the topic or just want a refresher, here’s a summary of Wheelan’s lessons and how you can apply them.

***

Assuming events are independent when they are not

“The probability of flipping heads with a fair coin is 1/2. The probability of flipping two heads in a row is (1/2)^2 or 1/4 since the likelihood of two independent events both happening is the product of their individual probabilities.”

When an event is interconnected with another event, the former happening increases or decreases the probability of the latter happening. Your car insurance gets more expensive after an accident because car accidents are not independent events. A person who gets in one is more likely to get into another in the future. Maybe they’re not such a good driver, maybe they tend to drive after a drink, or maybe their eyesight is imperfect. Whatever the explanation, insurance companies know to revise their risk assessment.

Sometimes though, an event happening might lead to changes that make it less probable in the future. If you spilled coffee on your shirt this morning, you might be less likely to do the same this afternoon because you’ll exercise more caution. If an airline had a crash last year, you may well be safer flying with them because they will have made extensive improvements to their safety procedures to prevent another disaster.

One place we should pay extra attention to the independence or dependence of events is when making plans. Most of our plans don’t go as we’d like. We get delayed, we have to backtrack, we have to make unexpected changes. Sometimes we think we can compensate for a delay in one part of a plan by moving faster later on. But the parts of a plan are not independent. A delay in one area makes delays elsewhere more likely as problems compound and accumulate.

Any time you think about the probability of sequences of events, be sure to identify whether they’re independent or not.
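As a rough sketch of the difference, the snippet below multiplies probabilities for the independent coin flips from the quote, then redoes the arithmetic for a two-step plan whose delays are linked. The plan and all of its probabilities are hypothetical numbers chosen only for illustration; they do not come from Wheelan.

```python
from fractions import Fraction

# Independent events: multiply the individual probabilities.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads  # (1/2)^2

# Dependent events: a hypothetical two-step plan (illustrative numbers only).
p_step1_late = Fraction(1, 5)
p_step2_late_given_step1_late = Fraction(1, 2)      # a delay makes another likelier
p_step2_late_given_step1_on_time = Fraction(1, 10)

# Overall chance that step 2 is late (law of total probability).
p_step2_late = (
    p_step1_late * p_step2_late_given_step1_late
    + (1 - p_step1_late) * p_step2_late_given_step1_on_time
)

# Correct: P(both late) = P(step 1 late) * P(step 2 late | step 1 late).
p_both_late = p_step1_late * p_step2_late_given_step1_late

# Mistaken: treating the two steps as if they were independent.
p_both_late_if_independent = p_step1_late * p_step2_late

print(p_two_heads)                 # 1/4
print(p_both_late)                 # 1/10
print(p_both_late_if_independent)  # 9/250
```

Under these made-up numbers, assuming independence understates the chance of a double delay by almost a factor of three.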

***

Not understanding when events are independent

“A different kind of mistake occurs when events that are independent are not treated as such . . . If you flip a fair coin 1,000,000 times and get 1,000,000 heads in a row, the probability of getting heads on the next flip is still 1/2. The very definition of statistical independence between two events is that the outcome of one has no effect on the outcome of another.”

Imagine you’re grabbing a breakfast sandwich at a local cafe when someone rudely barges into line in front of you and ignores your protestations. Later that day, as you’re waiting your turn to order a latte in a different cafe, the same thing happens: a random stranger pushes in front of you. By the time you go to pick up some pastries for your kids at a different place before heading home that evening, you’re so annoyed by all the rudeness you’ve encountered that you angrily eye every person to enter the shop, on guard for any attempts to take your place. But of course, the two rude strangers were independent events. It’s unlikely they were working together to annoy you. The fact it happened twice in one day doesn’t make it happening a third time more probable.

The most important thing to remember here is that the probability of conjunctive events happening is never higher than the probability of each occurring.
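In symbols, for any two events A and B (the notation is ours, not Wheelan’s), the conjunction rule follows because a conditional probability can never exceed 1:

```latex
P(A \cap B) = P(A)\,P(B \mid A) \le P(A),
\qquad
P(A \cap B) = P(B)\,P(A \mid B) \le P(B)
```

When A and B are independent, P(B | A) = P(B), and the first expression reduces to the product P(A)P(B) from the coin example.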

***

Clusters happen

“You’ve likely read the story in the newspaper or perhaps seen the news exposé: Some statistically unlikely number of people in a particular area have contracted a rare form of cancer. It must be the water, or the local power plant, or the cell phone tower.

. . . But this cluster of cases may also be the product of pure chance, even when the number of cases appears highly improbable. Yes, the probability that five people in the same school or church or workplace will contract the same rare form of leukemia may be one in a million, but there are millions of schools and churches and workplaces. It’s not highly improbable that five people might get the same rare form of leukemia in one of those places.”

An important lesson of probability is that while any particular improbable event is, well, improbable, the chance that some improbable event happens somewhere is high. Your chances of winning the lottery are almost zero. But someone has to win it. Your chances of getting struck by lightning are almost zero. But with so many people walking around and so many storms, it has to happen to someone sooner or later.

The same is true for clusters of improbable events. The chance of any individual winning the lottery multiple times or getting struck by lightning more than once is even closer to zero than the chance of it happening once. Yet when we look at all the people in the world, it’s certain to happen to someone.

We’re all pattern-matching creatures. We find randomness hard to process and look for meaning in chaotic events. So it’s no surprise that clusters often fool us. If you encounter one, it’s wise to keep in mind the possibility that it’s a product of chance, not anything more meaningful. Sure, it might be jarring to be involved in three car crashes in a year or to run into two college roommates at the same conference. Is it all that improbable that it would happen to someone, though?
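The arithmetic behind Wheelan’s leukemia example can be sketched as follows. The one-in-a-million figure comes from the quote; the counts of places are hypothetical, and the function name `prob_at_least_one` is ours. The sketch also assumes the places are independent of one another, which is a simplification.

```python
def prob_at_least_one(p_single: float, n_places: int) -> float:
    """Chance that an event with probability p_single happens in at least
    one of n_places independent places."""
    return 1.0 - (1.0 - p_single) ** n_places


p_cluster = 1e-6  # "one in a million", taken from the quote

# Hypothetical numbers of schools, churches, and workplaces.
for n_places in (1, 1_000_000, 5_000_000):
    print(n_places, round(prob_at_least_one(p_cluster, n_places), 3))
```

With a single place the chance is negligible; with a million places it is already roughly 63 percent, and with five million it is above 99 percent.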

***

The prosecutor’s fallacy

“The prosecutor’s fallacy occurs when the context surrounding statistical evidence is neglected . . . the chances of finding a coincidental one in a million match are relatively high if you run the sample through a database with samples from a million people.”
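Wheelan’s numbers can be checked directly, assuming each of the million database entries is an independent one-in-a-million chance of a coincidental match (the independence assumption is ours):

```latex
\mathbb{E}[\text{coincidental matches}] = 10^{6} \times 10^{-6} = 1,
\qquad
P(\text{at least one}) = 1 - \left(1 - 10^{-6}\right)^{10^{6}} \approx 1 - e^{-1} \approx 0.63
```

So a match that sounds like overwhelming evidence in isolation is more likely than not to turn up by chance in a database of that size.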

It’s important to look at the context surrounding statistics. Let’s say you’re evaluating whether to take a medication your doctor suggests. A quick glance at the information leaflet tells you that it carries a 1 in 10,000 risk of blood clots. Should you be concerned? Well, that depends on context. The 1 in 10,000 figure takes into account the wide spectrum of people with different genes and different lifestyles who might take the medication. If you’re an overweight chain-smoker with a family history of blood clots who takes twelve-hour flights twice a month, you might want to have a more serious discussion with your doctor than an active non-smoker with no relevant family history.

Statistics give us a simple snapshot, but if we want a finer-grained picture, we need to think about context.

***

Reversion to the mean (or regression to the mean)

“Probability tells us that any outlier—an observation that is particularly far from the mean in one direction or the other—is likely to be followed by outcomes that are most consistent with the long-term average.

. . . One way to think about this mean reversion is that performance—both mental and physical—consists of underlying talent-related effort plus an element of luck, good or bad. (Statisticians would call this random error.) In any case, those individuals who perform far above the mean for some stretch are likely to have had luck on their side; those who perform far below the mean are likely to have had bad luck. . . . When a spell of very good luck or very bad luck ends—as it inevitably will—the resulting performance will be closer to the mean.”

Moderate events tend to follow extreme ones. One area where regression to the mean often misleads us is in judging how people perform in fields like sports or management. We may think a single extraordinary success is predictive of future successes. Yet from one result, we can’t know whether it’s an outcome of talent or luck, and if it was luck, the next result may be average. Failure or success is usually followed by an event closer to the mean, not the other extreme.

Regression to the mean teaches us that the way to differentiate between skill and luck is to look at someone’s track record. The more information you have, the better. Even if past performance is not always predictive of future performance, a track record of consistent high performance is a far better indicator than a single highlight.

***

If you want an accessible tour of basic statistics, check out Naked Statistics by Charles Wheelan.

Regression Toward the Mean: An Introduction with Examples

Regression to the mean is a common statistical phenomenon that can mislead us when we observe the world. Learning to recognize when regression to the mean is at play can help us avoid misinterpreting data and seeing patterns that don’t exist.

***

It is important to minimize instances of bad judgment and address the weak spots in our reasoning. Learning about regression to the mean can help us.

Nobel Prize-winning psychologist Daniel Kahneman wrote a book about biases that cloud our reasoning and distort our perception of reality. It turns out there is a whole set of logical errors that we commit because our intuition and brains do not deal well with simple statistics. One of the errors that he examines in Thinking, Fast and Slow is the infamous regression toward the mean.

The notion of regression to the mean was first worked out by Sir Francis Galton. The rule goes that, in any series with complex phenomena that are dependent on many variables, where chance is involved, extreme outcomes tend to be followed by more moderate ones.

In Seeking Wisdom, Peter Bevelin offers the example of John, who was dissatisfied with the performance of his new employees and put them into a skill-enhancing program in which he measured their skill:

Their scores are now higher than they were on the first test. John’s conclusion: “The skill-enhancing program caused the improvement in skill.” This isn’t necessarily true. Their higher scores could be the result of regression to the mean. Since these individuals were measured as being on the low end of the scale of skill, they would have shown an improvement even if they hadn’t taken the skill-enhancing program. And there could be many reasons for their earlier performance — stress, fatigue, sickness, distraction, etc. Their true ability perhaps hasn’t changed.

Our performance always varies around some average true performance. Extreme performance tends to get less extreme the next time. Why? Testing measurements can never be exact. All measurements are made up of one true part and one random error part. When the measurements are extreme, they are likely to be partly caused by chance. Chance is likely to contribute less on the second time we measure performance.

If we switch from one way of doing something to another merely because we are unsuccessful, it’s very likely that we do better the next time even if the new way of doing something is equal or worse.

This is one of the reasons it’s dangerous to extrapolate from small sample sizes, as the data might not be representative of the distribution. It’s also why James March argues that the longer someone stays in their job, “the less the probable difference between the observed record of performance and actual ability.” Anything can happen in the short run, especially in any effort that involves a combination of skill and luck. (The ratio of skill to luck also impacts regression to the mean.)
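The “one true part and one random error part” picture above can be sketched with a small simulation. Everything here is an assumption made for illustration: skill and measurement error are both drawn from normal distributions with invented parameters, and nobody receives any training between the two tests.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

N_EMPLOYEES = 10_000
# Assumed model: observed score = true skill + random error.
true_skill = [rng.gauss(100, 10) for _ in range(N_EMPLOYEES)]


def take_test(skill: float) -> float:
    """One noisy measurement of an unchanged true skill."""
    return skill + rng.gauss(0, 10)


test1 = [take_test(s) for s in true_skill]
test2 = [take_test(s) for s in true_skill]  # no program in between

# Select the 10% who scored lowest on the first test, as John did.
cutoff = sorted(test1)[N_EMPLOYEES // 10]
low_group = [i for i in range(N_EMPLOYEES) if test1[i] < cutoff]

low_mean_test1 = mean(test1[i] for i in low_group)
low_mean_test2 = mean(test2[i] for i in low_group)

print(round(low_mean_test1, 1), round(low_mean_test2, 1))
```

The low scorers’ average rises noticeably on the second test even though their true skill never changed, which is the improvement John would have credited to his program.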

“Regression to the mean is not a natural law. Merely a statistical tendency. And it may take a long time before it happens.”

— Peter Bevelin

Regression to the Mean

The effects of regression to the mean can frequently be observed in sports, where they prompt plenty of unjustified speculation.

In Thinking, Fast and Slow, Kahneman recalls watching the men’s ski jump, a discipline where the final score is a combination of two separate jumps. Aware of regression to the mean, Kahneman was startled to hear the commentator’s predictions about the second jump. He writes:

“Norway had a great first jump; he will be tense, hoping to protect his lead and will probably do worse” or “Sweden had a bad first jump and now he knows he has nothing to lose and will be relaxed, which should help him do better.”

Kahneman points out that the commentator had noticed the regression to the mean and come up with a story for which there was no causal evidence (see narrative fallacy). This is not to say that his story could not be true. Maybe, if we measured the jumpers’ heart rates before each jump, we would see that they are more relaxed if the first jump was bad. However, that’s not the point. The point is, regression to the mean happens when luck plays a role, as it did in the outcome of the first jump.

The lesson from sports applies to any activity where chance plays a role. We often explain progress, or the lack of it, by our own influence over a process.

In reality, the science of performance is complex and situation-dependent, and much of what we think is within our control is truly random.

In the case of ski jumps, a strong wind against the jumper can leave even the best athlete with mediocre results. Similarly, wind and ski conditions that favor a mediocre jumper may lead to a considerable but temporary bump in his results. These effects, however, will disappear once the conditions change, and the results will regress back to normal.

This can have serious implications for coaching and performance tracking. The rules of regression suggest that when evaluating performance or hiring, we must rely on track records more than outcomes of specific situations. Otherwise, we are prone to be disappointed.

When Kahneman was giving a lecture to the Israeli Air Force about the psychology of effective training, one of the officers shared his experience that extending praise to his subordinates led to worse performance, whereas scolding led to an improvement in subsequent efforts. As a consequence, he had grown to be generous with negative feedback and had become rather wary of giving too much praise.

Kahneman immediately spotted that it was regression to the mean at work. He illustrated the misconception with a simple exercise you may want to try yourself. He drew a circle on a blackboard and then asked the officers one by one to throw a piece of chalk at the center of the circle with their backs facing the blackboard. He then repeated the experiment and recorded each officer’s performance in the first and second trial.

Naturally, those who did incredibly well on the first try tended to do worse on their second try, and vice versa. The fallacy immediately became clear: the change in performance occurs naturally. That again is not to say that feedback does not matter at all. Maybe it does, but the officer had no evidence to conclude it did.
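A minimal sketch of why the officer’s experience was misleading: the simulation below assumes every attempt is pure noise around the same ability, so feedback has no effect by construction. The thresholds for a “poor” and a “good” first attempt are arbitrary choices of ours.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Assumed model: each attempt is independent noise; feedback changes nothing.
N_TRIALS = 100_000
first = [rng.gauss(0, 1) for _ in range(N_TRIALS)]
second = [rng.gauss(0, 1) for _ in range(N_TRIALS)]

# "Scolded" after a poor first attempt, "praised" after a good one.
change_after_scolding = mean(s - f for f, s in zip(first, second) if f < -1)
change_after_praise = mean(s - f for f, s in zip(first, second) if f > 1)

print(round(change_after_scolding, 2))  # positive: performance "improved"
print(round(change_after_praise, 2))    # negative: performance "declined"
```

Even though praise and scolding do nothing in this model, the scolded group improves and the praised group gets worse, exactly the pattern the officer attributed to his feedback.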

The Imperfect Correlation and Chance

At this point, you might be wondering why regression to the mean happens and how we can make sure we are aware of it when it occurs.

In order to understand regression to the mean, we must first understand correlation.

The correlation coefficient between two measures, which varies between -1 and 1, is a measure of the relative weight of the factors they share. For example, two phenomena with few shared factors, such as bottled water consumption and suicide rate, should have a correlation coefficient close to 0. That is to say, if we looked at all countries in the world and plotted suicide rates for a specific year against per capita consumption of bottled water, the plot would show no pattern at all.

[Figure: No Correlation]

At the other extreme, there are measures that depend solely on the same factor. A good example of this is temperature. The only factor determining temperature, the velocity of molecules, is shared by all scales; hence each degree in Celsius will have exactly one corresponding value in Fahrenheit. Therefore, temperature in Celsius and temperature in Fahrenheit will have a correlation coefficient of 1, and the plot will be a straight line.

[Figure: Perfect Correlation]
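The two extremes can be checked numerically. The sketch below uses a hand-written Pearson coefficient; the temperature readings are arbitrary illustrative values, and the "unrelated" pair is random noise standing in for two measures with no shared factors, not real country data.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Same quantity on two scales: every Celsius value maps to exactly one Fahrenheit value.
celsius = [-40, -10, 0, 15, 25, 37, 100]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
r_temperature = pearson(celsius, fahrenheit)

# Two made-up measures with nothing in common (independent random noise).
rng = random.Random(0)
x_unrelated = [rng.gauss(0, 1) for _ in range(5_000)]
y_unrelated = [rng.gauss(0, 1) for _ in range(5_000)]
r_unrelated = pearson(x_unrelated, y_unrelated)

print(f"Celsius vs Fahrenheit: {r_temperature:.3f}")
print(f"Unrelated measures:    {r_unrelated:.3f}")
```

The first coefficient comes out at 1 up to rounding error, and the second lands close to 0.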

There are few if any phenomena in human sciences that have a correlation coefficient of 1. There are, however, plenty where the association is weak to moderate and there is some explanatory power between the two phenomena. Consider the correlation between height and weight, which would land somewhere between 0 and 1. While virtually every three-year-old will be lighter and shorter than every grown man, not all grown men or three-year-olds of the same height will weigh the same.

[Figure: Weak to Moderate Correlation]

This variation and the correspondingly lower degree of correlation imply that, while height is generally a good predictor of weight, there clearly are factors other than height at play. When the correlation between two measures is less than perfect, we must watch out for the effects of regression to the mean.

Kahneman observed a general rule: Whenever the correlation between two scores is imperfect, there will be regression to the mean.
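One common way to write this rule down, under assumptions the text does not state (both scores are standardized to mean 0 and standard deviation 1, and the second score is predicted from the first with a least-squares line), is:

```latex
\hat{z}_Y = \rho \, z_X,
\qquad
|\rho| < 1 \;\Longrightarrow\; |\hat{z}_Y| < |z_X| \quad \text{for } z_X \neq 0 .
```

Here $z_X$ is the first score, $\hat{z}_Y$ is the predicted second score, and $\rho$ is the correlation coefficient between the two. With $\rho = 1$ the prediction equals the first score and nothing regresses; with $\rho = 0$ the best prediction is simply the mean.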

This at first might seem confusing and not very intuitive, but the degree of regression to the mean is directly related to the degree of correlation of the variables. This effect can be illustrated with a simple example.

Assume you are at a party and ask why it is that highly intelligent women tend to marry men who are less intelligent than they are. Most people, even those with some training in statistics, will quickly jump in with a variety of causal explanations, ranging from avoidance of competition to the fear of loneliness that these women face. A topic of such controversy is likely to stir up a great debate.

Now, what if we asked why the correlation between the intelligence scores of spouses is less than perfect? This question is hardly as interesting and there is little to guess – we all know this to be true. The paradox lies in the fact that the two questions happen to be algebraically equivalent. Kahneman explains:

[…] If the correlation between the intelligence of spouses is less than perfect (and if men and women on average do not differ in intelligence), then it is a mathematical inevitability that highly intelligent women will be married to husbands who are on average less intelligent than they are (and vice versa, of course). The observed regression to the mean cannot be more interesting or more explainable than the imperfect correlation.

Assuming that the correlation is imperfect, the chance of both partners being in the top 1% on any characteristic is far smaller than the chance of one partner being in the top 1% and the other falling in the bottom 99%.
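A rough simulation makes the point concrete. Everything specific here is an assumption made for illustration: scores are standardized and normally distributed, the spousal correlation `rho` is set to a hypothetical 0.4, and the function names are invented.

```python
import random

def simulate_couples(n=100_000, rho=0.4, seed=1):
    """Draw (wife, husband) score pairs with correlation rho (hypothetical value)."""
    rng = random.Random(seed)
    couples = []
    for _ in range(n):
        wife = rng.gauss(0, 1)
        # Standard construction of a second score correlated at rho with the first.
        husband = rho * wife + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        couples.append((wife, husband))
    return couples

def top_percent_summary(couples, top=0.01):
    """Describe the husbands of the women who score in the top `top` share."""
    k = int(len(couples) * top)
    wife_cut = sorted(w for w, _ in couples)[-k]
    husband_cut = sorted(h for _, h in couples)[-k]
    top_wives = [(w, h) for w, h in couples if w >= wife_cut]
    mean_wife = sum(w for w, _ in top_wives) / len(top_wives)
    mean_husband = sum(h for _, h in top_wives) / len(top_wives)
    both_top = sum(1 for _, h in top_wives if h >= husband_cut) / len(top_wives)
    return mean_wife, mean_husband, both_top

mean_wife, mean_husband, both_top = top_percent_summary(simulate_couples())
print(f"mean score of top-1% wives:    {mean_wife:.2f}")
print(f"mean score of their husbands:  {mean_husband:.2f}")
print(f"share of husbands also top 1%: {both_top:.1%}")
```

Under these assumptions the husbands of top-1% women are still above average, but their mean sits at roughly `rho` times their wives' mean, and only a small minority of them are in the top 1% themselves. No causal story about competition or loneliness is built into the model.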

The Cause, Effect, and Treatment

We should be especially wary of the regression to the mean phenomenon when trying to establish causality between two factors. Whenever correlation is imperfect, the best will always appear to get worse and the worst will appear to get better over time, regardless of any additional treatment. This is something that the general media and sometimes even trained scientists fail to recognize.

Consider the example Kahneman gives:

Depressed children treated with an energy drink improve significantly over a three-month period. I made up this newspaper headline, but the fact it reports is true: if you treated a group of depressed children for some time with an energy drink, they would show a clinically significant improvement. It is also the case that depressed children who spend some time standing on their head or hug a cat for twenty minutes a day will also show improvement.

Whenever we come across such headlines, it is very tempting to jump to the conclusion that energy drinks, standing on one’s head, or hugging cats are all perfectly viable cures for depression. These cases, however, once again embody regression to the mean:

Depressed children are an extreme group, they are more depressed than most other children—and extreme groups regress to the mean over time. The correlation between depression scores on successive occasions of testing is less than perfect, so there will be regression to the mean: depressed children will get somewhat better over time even if they hug no cats and drink no Red Bull.

We often mistakenly attribute an effect to a specific policy or treatment when the change in the extreme groups would have happened anyway. This presents a fundamental problem: how can we know if the effects are real or simply due to variability?

Luckily, there is a way to tell a real improvement from regression to the mean: the introduction of a so-called control group, which is expected to improve by regression alone. The aim of the research is then to determine whether the treated group improves more than regression can explain.
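The logic of a control group can be sketched in a few lines. The model is hypothetical: a score is a stable trait plus occasion-specific noise, a higher score means more depressed, the most extreme 10% are selected and split at random, and `true_effect` is an invented parameter for how much the treatment lowers the follow-up score.

```python
import random

def run_trial(true_effect, n=20_000, seed=2):
    """Return (treated_improvement, control_improvement) for the extreme group."""
    rng = random.Random(seed)
    trait = [rng.gauss(0, 1) for _ in range(n)]
    before = [t + rng.gauss(0, 1) for t in trait]

    # Select the 10% with the highest (worst) scores on the first measurement.
    cutoff = sorted(before)[int(0.9 * n)]
    selected = [i for i in range(n) if before[i] >= cutoff]

    # Random assignment to treatment and control.
    rng.shuffle(selected)
    half = len(selected) // 2
    treated, control = selected[:half], selected[half:]

    def improvement(group, effect):
        # Follow-up score: same trait, fresh noise, minus any real treatment effect.
        after = [trait[i] + rng.gauss(0, 1) - effect for i in group]
        mean_before = sum(before[i] for i in group) / len(group)
        return mean_before - sum(after) / len(after)

    return improvement(treated, true_effect), improvement(control, 0.0)

placebo_treated, placebo_control = run_trial(true_effect=0.0)
real_treated, real_control = run_trial(true_effect=1.0)
print(f"useless treatment: treated {placebo_treated:.2f}, control {placebo_control:.2f}")
print(f"real treatment:    treated {real_treated:.2f}, control {real_control:.2f}")
```

With a useless treatment both groups improve by about the same amount, which is regression alone. Only when the treated group improves clearly more than the control group is there evidence of a real effect.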

In real-life situations involving the performance of specific individuals or teams, where the only real benchmark is past performance and no control group can be introduced, the effects of regression can be difficult if not impossible to disentangle. We can compare against the industry average, peers in the same cohort, or historical rates of improvement, but none of these is a perfect measure.

***

Luckily, awareness of regression to the mean is itself a great first step towards a more careful approach to understanding luck and performance.

If there is anything to be learned from regression to the mean, it is the importance of relying on track records rather than one-time success stories. I hope that the next time you come across an extreme quality governed in part by chance, you will realize that the effects are likely to regress over time and will adjust your expectations accordingly.

Edward Frenkel: Love and Math —The Heart of Hidden Reality

“The laws of Nature are written in the language of mathematics.”
Galileo

***

Most of us are unaware of the hidden world of mathematics. Actually, we’d rather avoid the subject entirely. It’s difficult and inaccessible.

A lot of that has to do with the way we’re introduced to mathematics as taught in school and university.

Math, however, can be “full of infinite possibilities as well as elegance and beauty,” writes mathematician Edward Frenkel in Love and Math: The Heart of Hidden Reality. “Mathematics,” he goes on, “is as much part of our cultural heritage as art, literature, and music.”

Mathematics directs the flow of the universe, lurks behind its shapes and curves, holds the reins of everything from tiny atoms to the biggest stars.

Frenkel, who became a professor at Harvard at twenty-one, now teaches at Berkeley. He “hated math” when he was in school. “What really excited me was physics—especially quantum physics.”

A reader sent me a pointer to Frenkel’s book after reading 17 equations that changed the world. And I’m glad they did.

Math is a way to describe reality and figure out how the world works, a universal language that has become the gold standard of truth. In our world, increasingly driven by science and technology, mathematics is becoming, ever more, the source of power, wealth, and progress.

Frenkel argues that mathematical knowledge can be an equalizer.

Mathematical knowledge is unlike any other knowledge. While our perception of the physical world can always be distorted, our perception of mathematical truths can’t be. They are objective, persistent, necessary truths. A mathematical formula or theorem means the same thing to anyone anywhere – no matter what gender, religion, or skin color; it will mean the same thing to anyone a thousand years from now. And what’s also amazing is that we own all of them. No one can patent a mathematical formula, it’s ours to share. There is nothing in this world that is so deep and exquisite and yet so readily available to all. That such a reservoir of knowledge really exists is nearly unbelievable. It’s too precious to be given away to the “initiated few.” It belongs to all of us.

One of the key functions of mathematics is the ordering of information.

This is what distinguishes the brush strokes of Van Gogh from a mere blob of paint. With the advent of 3D printing, the reality we are used to is undergoing a radical transformation: everything is migrating from the sphere of physical objects to the sphere of information and data. We will soon be able to convert information into matter on demand by using 3D printers just as easily as we now convert a PDF file into a book or an MP3 file into a piece of music.

In our information-expanding world, the role of mathematics will become even more crucial as a means to organize and order information. (As equations take over, we need to be mindful of what is being filtered.)

Frenkel beautifully explains our cultural aversion to math.

What if at school you had to take an “art class” in which you were only taught how to paint a fence? What if you were never shown the paintings of Leonardo da Vinci and Picasso? Would that make you appreciate art? Would you want to learn more about it? I doubt it. You would probably say something like this: “Learning art at school was a waste of my time. If I ever need to have my fence painted, I’ll just hire people to do this for me.” Of course, this sounds ridiculous, but this is how math is taught, and so in the eyes of most of us it becomes the equivalent of watching paint dry. While the paintings of the great masters are readily available, the math of the great masters is locked away.

You can appreciate math without studying it.

[M]ost of us have heard of and have at least a rudimentary understanding of such concepts as the solar system, atoms and elementary particles, the double helix of DNA, and much more, without taking courses in physics and biology. And nobody is surprised that these sophisticated ideas are part of our culture, our collective consciousness. Likewise, everybody can grasp key mathematical concepts and ideas, if they are explained in the right way. To do this, it is not necessary to study math for years; in many cases, we can cut right to the point and jump over tedious steps.

The problem is: while the world at large is always talking about planets, atoms, and DNA, chances are no one has ever talked to you about the fascinating ideas of modern math, such as symmetry groups, novel numerical systems in which 2 and 2 isn’t always 4, and beautiful geometric shapes like Riemann surfaces. It’s like they keep showing you a little cat and telling you that this is what a tiger looks like.

***

“People think they don’t understand math, but it’s all about how you explain it to them.
If you ask a drunkard what number is larger, 2/3 or 3/5, he won’t be able to tell you.
But if you rephrase the question: what is better,
2 bottles of vodka for 3 people or 3 bottles of vodka for 5 people,
he will tell you right away: 2 bottles for 3 people, of course.”

Israel Gelfand

***

Perhaps offering some prescient advice to coming generations, Charles Darwin wrote in his autobiography:

“I have deeply regretted that I did not proceed far enough at least to understand something of the great leading principles of mathematics, for men thus endowed seem to have an extra sense.”

“Mathematics is the source of timeless profound knowledge,” Frenkel writes, “which goes to the heart of all matter and unites us across cultures, continents, and centuries.”

My dream is that all of us will be able to see, appreciate, and marvel at the magic beauty and exquisite harmony of these ideas, formulas, and equations, for this will give so much more meaning to our love for this world and for each other.

Love and Math is a book about mathematical love. Frenkel offers the reader a glimpse into the beauty of mathematics with the Langlands Program, “one of the biggest ideas to come out of mathematics in the last fifty years.” In so doing, he exposes us to the sides of math we don’t get to see often: inspiration, profound ideas, and beautiful revelations.

Benoit Mandelbrot — The Fractalist: Memoir of a Scientific Maverick

“I have never done anything like others,” Benoit Mandelbrot (1924-2010) once said.

That statement is proven time and time again in his autobiography, The Fractalist.

Mandelbrot is independent almost to a fault, and his book is an interesting memoir from the man who revitalized visual geometry and whose ideas about fractals have changed how we look at physics, engineering, arts, medicine, finance, and biology.

Nearly all common patterns in nature are rough. They have aspects that are exquisitely irregular and fragmented—not merely more elaborate than the marvelous ancient geometry of Euclid but of massively greater complexity. For centuries, the very idea of measuring roughness was an idle dream. This is one of the dreams to which I have devoted my entire scientific life.

Let me introduce myself. A scientific warrior of sorts, and an old man now, I have written a great deal but never acquired a predictable audience. So, in this memoir, please allow me to tell you who I think I am and how I came to labor for so many years on the first-ever theory of roughness and was rewarded by watching it transform itself into an aspect of a theory of beauty.

Mandelbrot was full of insight.

What shape is a mountain, a coastline, a river or a dividing line between two river watersheds? … Clouds are not spheres, mountains are not cones, coastlines are not circles, and bark is not smooth, nor does lightning travel in a straight line.

While that sounds obvious, it wasn’t at the time. In showing that triangles, squares, and circles are more prevalent in textbooks than in reality, he brought to life the discipline now known as fractal geometry, a general theory of “roughness.”

Mandelbrot was fascinating, in part, because he never stayed in one place very long.

An acquaintance of mine was a forceful dean at a major university. One day, as our paths crossed in a busy corridor, he stopped to make a comment I never forgot: “You are doing very well, yet you are taking a lonely and hard path. You keep running from field to field, leading an unpredictable life, never settling down to enjoy what you have accomplished. A rolling stone gathers no moss, and—behind your back—people call you completely crazy. But I don’t think you are crazy at all, and you must continue what you are doing. For a thinking person, the most serious mental illness is not being sure of who you are. This is a problem you do not suffer from. You never need to reinvent yourself to fit changes in circumstances; you just move on. In that respect, you are the sanest person among us.”

Quietly, I responded that I was not running from field to field, but rather working on a theory of roughness. I was not a man with a big hammer to whom every problem looked like a nail. Were his words meant to compliment or merely to reassure? I soon found out: he was promoting me for a major award.

Is mental health compatible with being possessed by barely contained restlessness? In Dante’s Divine Comedy, the deceased sentenced to eternal searching are pushed to the deepest level of the Inferno. But for me, an eternal search across countless scientific fields beyond obvious connection managed to add up to a happy life. A rolling stone perhaps, but not an unresponsive one. Overactive and self-motivated, I loved to roll along, stopping to listen and preach in lay monasteries of all kinds—some splendid and proud, others forsaken and out of the way.

He had a different way of looking at things. For example, he saw math problems as geometry.

I would raise my hand and describe my findings: “Monsieur, I see an obvious geometric solution.” I quickly grasped the most abstract problem that the teacher could contrive. And then — with no effort, conscious search, or delay — I continued along a path that somehow avoided every difficulty…. I managed to be examined on the basis of speed and good taste in, first, translating algebra back into geometry, and then thinking in terms of geometric shapes. My analytic skills remained so-so, but that did not matter — the hard work was done by geometry, and it sufficed to fill in short calculations that even I could manage.

Ultimately, The Fractalist is proof that “force of character and independence” can take some to great heights.