
Gates’ Law: How Progress Compounds and Why It Matters

“Most people overestimate what they can achieve in a year and underestimate what they can achieve in ten years.”

It’s unclear exactly who first made that statement, when they said it, or how it was phrased. The most probable source is Roy Amara, a Stanford computer scientist. In the 1960s, Amara told colleagues he believed that “we overestimate the impact of technology in the short-term and underestimate the effect in the long run.” For this reason, variations on that phrase are often known as Amara’s Law. However, Bill Gates made a similar statement (possibly paraphrasing Amara), so it’s also known as Gates’ Law.

You may have seen the same phrase attributed to Arthur C. Clarke, Tony Robbins, or Peter Drucker. There’s a good reason why Amara’s words have been appropriated by so many thinkers—they apply to so much more than technology. Almost universally, we tend to overestimate what can happen in the short term and underestimate what can happen in the long term.

Thinking about the future does not require endless hyperbole or even forecasting, which is usually pointless anyway. Instead, there are patterns we can identify if we take a long-term perspective.

Let’s look at what Bill Gates meant and why it matters.

Moore’s Law

Gates’ Law is often mentioned in conjunction with Moore’s Law. This is generally quoted as some variant of “the number of transistors on an inch of silicon doubles every eighteen months.” However, calling it Moore’s Law is misleading—at least if you think of laws as invariant. It’s more of an observation of a historical trend.

When Gordon Moore, co-founder of Fairchild Semiconductor and Intel, noticed in 1965 that the number of transistors on a chip doubled every year, he was not predicting that would continue in perpetuity. Indeed, Moore revised the doubling time to two years a decade later. But the world latched onto his words. Moore’s Law has been variously treated as a target, a limit, a self-fulfilling prophecy, and a physical law as certain as the laws of thermodynamics.

Moore’s Law is now considered outdated, after holding true for several decades. That doesn’t mean the concept has gone anywhere. Moore’s Law is often regarded as a general principle of technological development: certain performance metrics have a predictable doubling time, the opposite of a half-life.

Why is Moore’s Law related to Amara’s Law?

Exponential growth is a concept we struggle to grasp. As University of Colorado physics professor Albert Allen Bartlett famously put it, “The greatest shortcoming of the human race is our inability to understand the exponential function.”

When we talk about Moore’s Law, we easily underestimate what happens when a value keeps doubling. Sure, it’s not that hard to imagine your laptop getting twice as fast in a year, for instance. Where it gets tricky is when we try to imagine what that means on a longer timescale. What does that mean for your laptop in 10 years? There is a reason your iPhone has more processing power than the first space shuttle.
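To put rough numbers on it, here is a quick sketch (an illustration only, using the popular eighteen-month doubling figure rather than anything from Amara or Moore):

```python
# What a fixed 18-month doubling time implies over a decade.
months = 10 * 12
doublings = months / 18        # about 6.7 doublings in ten years
speedup = 2 ** doublings       # roughly a 100-fold improvement
print(f"{doublings:.1f} doublings -> about {speedup:.0f}x faster")
```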

One of the best illustrations of exponential growth is the legend about a peasant and the emperor of China. In the story, the peasant (sometimes said to be the inventor of chess) visited the emperor with a seemingly modest request: a chessboard with one grain of rice on the first square, then two on the second, four on the third, and so on, doubling each time. The emperor agreed to this idiosyncratic request and ordered his men to start counting out rice grains.

“Every fact of science was once damned. Every invention was considered impossible. Every discovery was a nervous shock to some orthodoxy. Every artistic innovation was denounced as fraud and folly. We would own no more, know no more, and be no more than the first apelike hominids if it were not for the rebellious, the recalcitrant, and the intransigent.”

— Robert Anton Wilson

If you haven’t heard this story before, it might seem like the peasant would end up with, at best, enough rice to feed their family that evening. In reality, the request was impossible to fulfill. Doubling one grain 63 times (the number of squares on a chessboard, minus the first one that only held one grain) would mean the emperor had to give the peasant over 18 million trillion grains of rice. To grow just half of that amount, he would have needed to drain the oceans and convert every bit of land on this planet into rice fields. And that’s for half.
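The arithmetic is easy to check. Here is a minimal sketch of the emperor’s predicament:

```python
# Total grains on a 64-square chessboard, starting with one grain and doubling each square.
total_grains = sum(2 ** square for square in range(64))  # 1 + 2 + 4 + ... + 2**63
print(f"{total_grains:,} grains")     # 18,446,744,073,709,551,615 -> over 18 million trillion
print(total_grains == 2 ** 64 - 1)    # the equivalent closed form
```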

In his essay “The Law of Accelerating Returns,” author and inventor Ray Kurzweil uses this story to show how we misunderstand the meaning of exponential growth in technology. For the first few squares, the growth was inconsequential, especially in the eyes of an emperor. It was only once they reached the halfway point that the rate began to snowball dramatically. (It’s no coincidence that Warren Buffett’s authorized biography is called The Snowball; few people understand exponential growth better than Buffett.) It just so happens that, by Kurzweil’s estimation, we’re at that inflection point in computing. Since the creation of the first computers, computing power has doubled roughly 32 times. We may underestimate the long-term impact because the idea of this continued doubling is so tricky to imagine.

The Technology Hype Cycle

To understand how this plays out, let’s take a look at the cycle innovations go through after their invention. Known as the Gartner hype cycle, it primarily concerns our perception of technology—not its actual value in our lives.

Hype cycles are obvious in hindsight, but fiendishly difficult to spot while they are happening. It’s important to bear in mind that this model is one way of looking at reality and is not a prediction or a template. Sometimes a step gets missed, sometimes there is a substantial gap between steps, sometimes a step is deceptive.

The hype cycle happens like this:

  • New technology: The media picks up on the existence of a new technology which may not exist in a usable form yet. Nonetheless, the publicity leads to significant interest. At this point, people working on research and development are probably not making any money from it. Lots of mistakes are made. In Everett Rogers’s diffusion of innovations theory, this is known as the innovation stage. If it seems like something new will have a dramatic payoff, it probably won’t last. If it seems we have found the perfect use for a brand-new technology, we may be wrong.
  • The peak of inflated expectations: A few well-publicized success stories lead to inflated expectations. Hype builds and new companies pop up to anticipate the demand. There may be a burst of funding for research and development. Scammers looking to make a quick buck may move into the area. Rogers calls this the syndication stage. It’s here that we overestimate the future applications and impact of the technology.
  • The trough of disillusionment: Prominent failures or a lack of progress break through the hype and lead to disillusionment. People become pessimistic about the technology’s potential and mostly lose interest. Reports of scams may contribute, with the media citing them as evidence that the technology is a fraud. If it seems like a new technology is dying, it may just be that its public perception has changed while the technology itself is still developing. Hype does not correlate directly with functionality.
  • The slope of enlightenment: As time passes, people continue to improve technology and find better uses for it. Eventually, it’s clear how it can improve our lives, and mainstream adoption begins. Mechanisms for preventing scams or lawbreaking emerge.
  • The plateau of productivity: The technology becomes mainstream. Development slows. It becomes part of our lives and ceases to seem novel. Those who move into the now saturated market tend to struggle, as a few dominant players take the lion’s share of the available profits. Rogers calls this the diffusion stage.

When we are cresting the peak of inflated expectations, we imagine that the new development will transform our lives within months. In the depths of the trough of disillusionment, we don’t expect it to get anywhere, even allowing years for it to improve. We typically fail to anticipate the significance of the plateau of productivity, even though it may end up exceeding our initial expectations.

Smart people can usually see through the initial hype. But only a handful of people can—through foresight, stubbornness or perhaps pure luck—see through the trough of disillusionment. Most of the initial skeptics feel vindicated by the dramatic drop in interest and expect the innovation to disappear. It takes far greater expertise to support an unpopular technology than to deride a popular one.

Correctly spotting the cycle as it unfolds can be immensely profitable. Misreading it can be devastating. First movers in a new area often struggle to survive the trough, even if they are the ones who do the essential research and development. We tend to assume current trends will continue, so we expect sustained growth during the peak and linear decline during the trough.

If we are trying to assess the future impact of a new technology, we need to separate its true value from its public perception. When something is new, the mainstream hype is likely to be more noise than signal. After all, the peak of inflated expectations often happens before the technology is available in a usable form. It’s almost always before the public has access to it. Hype serves a real purpose in the early days: it draws interest, secures funding, attracts people with the right talents to move things forward and generates new ideas. Not all hype is equally important, because not all opinions are equally important. If there’s intense interest within a niche group with relevant expertise, that’s more telling than a general enthusiasm.

The hype cycle doesn’t just happen with technology. It plays out all over the place, and we’re usually fooled by it. Discrepancies between our short- and long-term estimates of achievement are everywhere. Consider the following situations. They’re hypothetical, but similar situations are common.

  • A musician releases an acclaimed debut album which creates enormous interest in their work. When their second album proves disappointing (or never materializes), most people lose interest. Over time, the performer develops a loyal, sustained following of people who accurately assess the merits of their music, not the hype.
  • A promising new pharmaceutical receives considerable attention—until it becomes apparent that there are unexpected side effects, or it isn’t as powerful as expected. With time, clinical trials find alternate uses which may prove even more beneficial. For example, a side effect could be helpful for another use. It’s estimated that over 20% of pharmaceuticals are prescribed for a different purpose than they were initially approved for, with that figure rising as high as 60% in some areas.
  • A propitious start-up receives an inflated valuation after a run of positive media attention. Its founders are lauded and extensively profiled and investors race to get involved. Then there’s an obvious failure—perhaps due to the overconfidence caused by hype—or early products fall flat or take too long to create. Interest wanes. The media gleefully dissects the company’s apparent demise. But the product continues to improve and ultimately becomes a part of our everyday lives.

In the short run, the world is a voting machine affected by whims and marketing. In the long run, it’s a weighing machine where quality and product matter.

The Adjacent Possible

Now that we know how Amara’s Law plays out in real life, the next question is: why does this happen? Why does technology grow in complexity at an exponential rate? And why don’t we see it coming?

One explanation is what Stuart Kauffman describes as “the adjacent possible.” Each new innovation expands the set of innovations that are now within reach. It opens up adjacent possibilities which didn’t exist before, because better tools can be used to make even better tools.

Humanity is about expanding the realm of the possible. Discovering fire meant our ancestors could use the heat to soften or harden materials and make better tools. Inventing the wheel meant the ability to move resources around, which meant new possibilities such as the construction of more advanced buildings using materials from other areas. Domesticating animals meant a way to pull wheeled vehicles with less effort, meaning heavier loads, greater distances and more advanced construction. The invention of writing led to new ways of recording, sharing and developing knowledge which could then foster further innovation. The internet continues to give us countless new opportunities for innovation. Anyone with a new idea can access endless free information, find supporters, discuss their ideas and obtain resources. New doors to the adjacent possible open every day as we find different uses for technology.

“We like to think of our ideas as $40,000 incubators shipped directly from the factory, but in reality, they’ve been cobbled together with spare parts that happened to be sitting in the garage.”

— Steven Johnson, Where Good Ideas Come From

Take the case of GPS, an invention that was itself built out of the debris of its predecessors. In recent years, GPS has opened up new possibilities that didn’t exist before. The system was developed by the US government for military usage. In the 1980s, they decided to start allowing other organizations and individuals to use it. Civilian access to GPS gave us new options. Since then, it has led to numerous innovations that incorporate the system into old ideas: self-driving cars, mobile phone tracking (very useful for solving crime or finding people in emergency situations), tectonic plate trackers that help predict earthquakes, personal navigation systems, self-navigating robots, and many others. None of these would have been possible without some sort of global positioning system. With the invention of GPS, human innovation sped up a little more.

Steven Johnson gives one example of how this happens in Where Good Ideas Come From. In 2008, Timothy Prestero, founder of the design nonprofit Design that Matters, visited a hospital in Indonesia and found that all eight of the incubators for newborn babies were broken. The incubators had been donated to the hospital by relief organizations, but the staff didn’t know how to fix them. Plus, the incubators were poorly suited to the humid climate, and the repair instructions only came in English. Prestero realized that donating medical equipment was pointless if local people couldn’t fix it. He and his team began working on designing an incubator that would keep working, and keep saving babies’ lives, long after the first few months.

Instead of continuing to tweak existing designs, Prestero and his team devised a completely new incubator that used car parts. While the local people didn’t know how to fix an incubator, they were extremely adept at keeping their cars working no matter what. Named the NeoNurture, it used headlights for warmth, dashboard fans for ventilation, and a motorcycle battery for power. Hospital staff just needed to find someone who was good with cars to fix it—the principles were the same.

Even more telling is the origin of the incubators Prestero and his team reconceptualized. The first incubator for newborn babies was designed by Stéphane Tarnier in the late 19th century. While visiting a zoo on his day off, Tarnier noted that newborn chicks were kept in heated boxes. It’s not a big leap to imagine that the issue of infant mortality was permanently on his mind. Tarnier was an obstetrician, working at a time when the infant mortality rate for premature babies was about 66%. He must have been eager to try anything that could reduce that figure and its emotional toll. Tarnier’s rudimentary incubator immediately halved that mortality rate. The technology was right there, in the zoo. It just took someone to connect the dots and realize human babies aren’t that different from newborn chicks.

Johnson explains the significance of this: “Good ideas are like the NeoNurture device. They are, inevitably, constrained by the parts and skills that surround them…ideas are works of bricolage; they’re built out of that detritus.” Tarnier could invent the incubator only because someone else had already invented a similar device. Prestero and his team could only invent the NeoNurture because Tarnier had come up with the incubator in the first place.

This happens in our lives, as well. If you learn a new skill, the number of skills you could potentially learn increases because some elements may be transferable. If you are introduced to a new person, the number of people you could meet grows, because they may introduce you to others. If you start learning a language, native speakers may be more willing to have conversations with you in it, meaning you can get a broader understanding. If you read a new book, you may find it easier to read other books by linking together the information in them. The list is endless. We can’t imagine what we’re capable of achieving in ten years because we forget about the adjacent possibilities that will emerge.

Accelerating Change

The adjacent possible has been expanding ever since the first person picked up a stone and started shaping it into a tool. Just look at what written and oral forms of communication made possible—no longer did each generation have to learn everything from scratch. Suddenly we could build upon what had come before us.

Some (annoying) people claim that there’s nothing new left. There are no new ideas to be had, no new creations to invent, no new options to explore. In fact, the opposite is true. Innovation is a non-zero-sum game. A crowded market actually means more opportunities to create something new than a barren one. Technology is a feedback loop. The creation of something new begets the creation of something even newer and so on.

Progress is exponential, not linear. So we overestimate the impact of a new technology during the early days when it is just finding its feet, then underestimate its impact in a decade or so when its full uses are emerging. As old limits and constraints melt away, our options explode. The exponential growth of technology is known as accelerating change. It’s a common belief among experts that the rate of change is speeding up and society will change dramatically alongside it.

“Ideas borrow, blend, subvert, develop and bounce off other ideas.”

— John Hegarty, Hegarty On Creativity

In 1999, author and inventor Ray Kurzweil posited the Law of Accelerating Returns: the idea that evolutionary systems develop at an exponential rate. While this is most obvious for technology, Kurzweil hypothesized that the principle is relevant in numerous other areas. Moore’s Law, initially referring only to semiconductors, has wider implications.

In an essay on the topic, he writes:

An analysis of the history of technology shows that technological change is exponential, contrary to the common-sense “intuitive linear” view. So we won’t experience 100 years of progress in the 21st century—it will be more like 20,000 years of progress (at today’s rate). The “returns,” such as chip speed and cost-effectiveness, also increase exponentially. There’s even exponential growth in the rate of exponential growth.

Progress is tricky to predict or even to notice as it happens. It’s hard to notice things in a system that we are part of. And it’s hard to notice the incremental change because it lacks stark contrast. The current pace of change is our norm, and we adjust to it. In hindsight, we can see how Amara’s Law plays out.

Look at where the internet was just twenty years ago. A report from the Pew Research Center shows us how change compounds. In 1998, a mere 41% of Americans used the internet at all—and the report expresses surprise that the users were beginning to include “people without college training, those with modest incomes, and women.” Less than a third of users had bought something online, email was predominantly just for work, and only a third of users looked at online news at least once per week. That’s a third of the 41% using the internet, by the way, not of the general population. Wikipedia and Gmail didn’t exist. Internet users in the late nineties reported that their main problem was finding what they needed online.

That is perhaps the biggest change, and one we may not have anticipated: the move towards personalization. Finding what we need is no longer a problem. Most of us have the opposite problem and struggle with information overload. Twenty years ago, filter bubbles were barely a problem (at least, not online). Now, almost everything we encounter online is personalized to ensure it’s ridiculously easy to find what we want. Newsletters, websites, and apps greet us by name. Newsfeeds are organized by our interests. Shopping sites recommend other products we might like. This has increased the amount the internet does for us to a level that would have been hard to imagine in the late 90s. Kevin Kelly, writing in The Inevitable, describes filtering as one of the key forces that will shape the future.

History reveals an extraordinary acceleration of technological progress. To truly understand the principle of accelerating change, it helps to take a quick look at the history of technology. Establishing a precise history is problematic, as some inventions occurred in several places at varying times, archaeological records are inevitably incomplete, and dating methods are imperfect. Even so, accelerating change is a clear pattern.

Early innovations happened slowly. Counting from the emergence of our species, it took us about 30,000 years to invent clothing and about 120,000 years to invent jewelry. It took us about 130,000 years to invent art and about 136,000 years to come up with the bow and arrow. But things began to speed up in the Upper Paleolithic period. Between 50,000 and 10,000 years ago, we developed more sophisticated tools with specialized uses—think harpoons, darts, fishing tools, and needles—early musical instruments, pottery, and the first domesticated animals. Between roughly 11,000 years ago and the 18th century, the pace truly accelerated. That period essentially led to the creation of civilization, with the foundations of our current world.

More recently, the Industrial Revolution changed everything because it moved us significantly further away from relying on the strength of people and domesticated animals to power means of production. Steam engines and machinery replaced backbreaking labor, meaning more production at a lower cost. The number of adjacent possibilities began to snowball. Machinery enabled mass production and interchangeable parts. Steam-powered trains meant people could move around far more easily, allowing people from different areas to mix together and share ideas. Improved communications did the same. It’s pointless to even try listing the ways technology has changed since then. Regardless of age, we’ve all lived through it and seen the acceleration. Few people dispute that the change is snowballing. The only question is how far that will go.

As Stephen Hawking put it in 1993:

For millions of years, mankind lived just like the animals. Then something happened which unleashed the power of our imagination. We learned to talk and we learned to listen. Speech has allowed the communication of ideas, enabling human beings to work together to build the impossible. Mankind’s greatest achievements have come about by talking, and its greatest failures by not talking. It doesn’t have to be like this. Our greatest hopes could become reality in the future. With the technology at our disposal, the possibilities are unbounded. All we need to do is make sure we keep talking.

But, as we saw with Moore’s Law, exponential growth cannot continue forever. Eventually, we run into fundamental constraints. Hours in the day, people on the planet, availability of a resource, the smallest possible size of a transistor, attention—there’s always a bottleneck we can’t eliminate. We reach the point of diminishing returns. Growth slows or stops altogether. We must then either look at alternative routes to improvement or leave things as they are. In Everett Rogers’s diffusion of innovations theory, this is known as the substitution stage, when usage declines and we start looking for substitutes.

This process is not linear. We can’t predict the future because there’s no way to take into account the tiny factors that will have a disproportionate impact in the long run.


The Danger of Oversimplification: How to Use Occam’s Razor Without Getting Cut

Occam’s razor (also known as the ‘law of parsimony’) is a problem-solving principle which serves as a useful mental model. A philosophical razor is a tool used to eliminate improbable options in a given situation, of which Occam’s is the best-known example.

Occam’s razor can be summarized as such:

Among competing hypotheses, the one with the fewest assumptions should be selected.

The Basics

In simpler language, Occam’s razor suggests that the simplest solution is usually the best one. Another good explanation of Occam’s razor comes from the paranormal writer William J. Hall: ‘Occam’s razor is summarized for our purposes in this way: Extraordinary claims demand extraordinary proof.’

In other words, we should avoid looking for excessively complex solutions to a problem and focus on what works, given the circumstances. Occam’s razor is used in a wide range of situations as a means of making rapid decisions and establishing truths without empirical evidence. It works best as a mental model for forming initial conclusions before adequate information can be obtained.

A further literary summary comes from one of the best-loved fictional characters, Arthur Conan Doyle’s Sherlock Holmes. His classic aphorism is an expression of Occam’s razor: “If you eliminate the impossible, whatever remains, however improbable, must be the truth.”

A number of mathematical and scientific studies have backed up its validity and lasting relevance. In particular, the principle of minimum energy supports Occam’s razor. This facet of the second law of thermodynamics states that, wherever possible, the use of energy is minimized. In general, the universe tends towards simplicity. Physicists use Occam’s razor in the knowledge that they can rely on systems to use the minimum energy necessary to function. A ball at the top of a hill will roll down in order to reach the point of minimum potential energy. The same principle is present in biology. For example, if a person repeats the same action on a regular basis in response to the same cue and reward, it will become a habit as the corresponding neural pathway is formed. From then on, their brain will use less energy to complete the same action.

The History of Occam’s Razor

The concept of Occam’s razor is credited to William of Ockham, a friar, philosopher, and theologian who lived in the 13th and 14th centuries. While he did not coin the term, his characteristic way of making deductions inspired other writers to develop the heuristic. Indeed, the concept is an ancient one, stated as early as Aristotle, who wrote, “we may assume the superiority, other things being equal, of the demonstration which derives from fewer postulates or hypotheses.”

Robert Grosseteste expanded on Aristotle’s writing in the 1200s, declaring that:

That is better and more valuable which requires fewer, other circumstances being equal… For if one thing were demonstrated from many and another thing from fewer equally known premises, clearly that is better which is from fewer because it makes us know quickly, just as a universal demonstration is better than particular because it produces knowledge from fewer premises. Similarly, in natural science, in moral science, and in metaphysics the best is that which needs no premises and the better that which needs the fewer, other circumstances being equal.

Early writings such as this are believed to have led to the eventual (and ironic) simplification of the concept. Nowadays, Occam’s razor is an established mental model which can form a useful part of a latticework of knowledge.


Examples of the Use of Occam’s Razor

Theology

In theology, Occam’s razor has been used to argue both for and against the existence of God. William of Ockham, being a Christian friar, used his theory to defend religion. He regarded scripture as true in the literal sense and therefore saw it as simple proof. To him, the Bible was synonymous with reality, and to contradict it would conflict with established fact. Many religious people regard the existence of God as the simplest possible explanation for the creation of the universe.

The razor also appears on the other side of the debate. In his 13th-century Summa Theologica, Thomas Aquinas raised the objection that ‘it is superfluous to suppose that what can be accounted for by a few principles has been produced by many.’ The objection treats the existence of God as a hypothesis which makes a huge number of assumptions compared to naturalistic alternatives (Aquinas raised it in order to answer it, not to endorse it). Many modern atheists make the same argument, considering the existence of God to be an unnecessarily complex explanation, particularly given the lack of empirical evidence.

Taoist thinkers take Occam’s razor one step further by simplifying everything in existence to the most basic form. In Taoism, everything is an expression of a single ultimate reality (known as the Tao). This school of religious and philosophical thought holds that the most plausible explanation for the universe is the simplest: everything is both created and controlled by a single force. This can be seen as a profound example of the use of Occam’s razor within theology.

The Development of Scientific Theories

Occam’s razor is frequently used by scientists, in particular for theoretical matters. The simpler a hypothesis is, the more easily it can be proved or falsified. A complex explanation for a phenomenon involves many factors which can be difficult to test or lead to issues with the repeatability of an experiment. As a consequence, the simplest solution which is consistent with the existing data is preferred. However, it is common for new data to allow hypotheses to become more complex over time. Scientists choose the simplest solution the current data permits while remaining open to the possibility of future research allowing for greater complexity.

Failing to observe Occam’s razor is usually a sign of bad science and an attempt to cover up poor explanations. The version used by scientists can best be summarized as: ‘when you have two competing theories that make exactly the same predictions, the simpler one is the better.’

Obtaining funding for simpler hypotheses tends to be easier, as they are often cheaper to test. As a consequence, the use of Occam’s razor in science is a matter of practicality.

Albert Einstein referred to Occam’s razor when developing his theory of special relativity. He formulated his own version: ‘it can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.’ Or, more pithily: ‘everything should be made as simple as possible, but not simpler.’ This preference for simplicity can be seen in one of the most famous equations ever devised: E = mc². Rather than a lengthy equation requiring pages of writing, Einstein reduced the necessary factors to the bare minimum. The result is usable and perfectly parsimonious.

The physicist Stephen Hawking advocates for Occam’s razor in A Brief History of Time:

We could still imagine that there is a set of laws that determines events completely for some supernatural being, who could observe the present state of the universe without disturbing it. However, such models of the universe are not of much interest to us mortals. It seems better to employ the principle known as Occam’s razor and cut out all the features of the theory that cannot be observed.

Isaac Newton used Occam’s razor too when developing his theories. Newton stated: “we are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.” As a result, he sought to make his theories (including the three laws of motion) as simple as possible, with the fewest underlying assumptions necessary.

Medicine

Modern doctors use a version of Occam’s razor, stating that they should look for the fewest possible causes to explain their patient’s multiple symptoms, and give preference to the most likely causes. A doctor I know often repeats, “common things are common.” Interns are instructed, “when you hear hoofbeats, think horses, not zebras.” For example, a person displaying influenza-like symptoms during an epidemic would be considered more likely to be suffering from influenza than from an alternative, rarer disease. Making minimal diagnoses reduces the risk of overtreating a patient, or of causing dangerous interactions between different treatments. This is of particular importance within the current medical model, where patients are likely to see numerous different health specialists and communication between them can be poor.

Prison Abolition and Fair Punishment

Occam’s razor has long played a role in attitudes towards the punishment of crimes. In this context, it refers to the idea that people should be given the least punishment necessary for their crimes.

This is to avoid the excessive penal practices which were popular in the past (for example, a Victorian could receive five years of hard labour for stealing a piece of food). The concept of penal parsimony was pioneered by Jeremy Bentham, the founder of utilitarianism. He stated that punishments should not cause more pain than they prevent. Life imprisonment for murder could be seen as justified in that it may prevent a great deal of potential pain, should the perpetrator offend again. On the other hand, long-term imprisonment of an impoverished person for stealing food causes substantial suffering without preventing any.

Bentham’s writings on the application of Occam’s razor to punishment led to the prison abolition movement and our modern ideas of rehabilitation.

Crime Solving and Forensic Work

When it comes to solving a crime, Occam’s razor is used in conjunction with experience and statistical knowledge. A woman is statistically more likely to be killed by a male partner than by any other person. Should a woman be found murdered in her locked home, the first people police interview would be any male partners. The possibility of a stranger entering can be considered, but the solution with the fewest assumptions would be that the crime was perpetrated by her male partner.

By using Occam’s razor, police officers can solve crimes faster and at a lower cost.

Exceptions and Issues

It is important to note that, like any mental model, Occam’s razor is not failsafe and should be used with care, lest you cut yourself. This is especially crucial when it comes to important or risky decisions. There are exceptions to any rule, and we should never blindly follow a mental model which logic, experience, or empirical evidence contradicts. The smartest people are those who know the rules but also know when to ignore them. When you hear hoofbeats behind you, in most cases you should think horses, not zebras—unless you are out on the African savannah.

Simplicity is also subjective. In the example of the NASA moon landing conspiracy theory, some people consider it simpler for the landings to have been faked, others for them to have been real. When using Occam’s razor to make deductions, we must avoid falling prey to confirmation bias and merely using it to back up preexisting notions. The same goes for the theology example mentioned previously: some people consider the existence of God to be the simplest option, others consider the inverse to be true. Semantic simplicity must not be given undue importance when selecting the solution which Occam’s razor points to. A hypothesis can sound simple, yet involve more assumptions than a verbose alternative.

Occam’s razor should not be used in place of logic, scientific methods, and personal insights. In the long term, a judgment must be backed by empirical evidence, not just its simplicity. Lisa Randall best expressed the issues with Occam’s razor in her book Dark Matter and the Dinosaurs: The Astounding Interconnectedness of the Universe:

My second concern about Occam’s Razor is just a matter of fact. The world is more complicated than any of us would have been likely to conceive. Some particles and properties don’t seem necessary to any physical processes that matter—at least according to what we’ve deduced so far. Yet they exist. Sometimes the simplest model just isn’t the correct one.

Harlan Coben has disputed many criticisms of Occam’s razor by stating that people fail to understand its exact purpose:

Most people oversimplify Occam’s razor to mean the simplest answer is usually correct. But the real meaning, what the Franciscan friar William of Ockham really wanted to emphasize, is that you shouldn’t complicate, that you shouldn’t “stack” a theory if a simpler explanation was at the ready. Pare it down. Prune the excess.

I once again leave you with Einstein: “Everything should be made as simple as possible, but not simpler.”

Occam’s razor is complemented by other mental models, including the fundamental attribution error, Hanlon’s razor, confirmation bias, the availability heuristic, and hindsight bias. The nature of mental models is that they tend to interlock and work best in conjunction.

Stephen Hawking Explains The Origin of the Universe


The Origin of the Universe, a lecture by Stephen Hawking

According to the Boshongo people of central Africa, in the beginning, there was only darkness, water, and the great god Bumba. One day Bumba, in pain from a stomach ache, vomited up the sun. The sun dried up some of the water, leaving land. Still in pain, Bumba vomited up the moon, the stars, and then some animals. The leopard, the crocodile, the turtle, and finally, man.

This creation myth, like many others, tries to answer the questions we all ask. Why are we here? Where did we come from? The answer generally given was that humans were of comparatively recent origin, because it must have been obvious, even at early times, that the human race was improving in knowledge and technology. So it can’t have been around that long, or it would have progressed even more. For example, according to Bishop Usher, the Book of Genesis placed the creation of the world at 9 in the morning on October the 27th, 4,004 BC. On the other hand, the physical surroundings, like mountains and rivers, change very little in a human lifetime. They were therefore thought to be a constant background, and either to have existed forever as an empty landscape, or to have been created at the same time as the humans. Not everyone, however, was happy with the idea that the universe had a beginning.

For example, Aristotle, the most famous of the Greek philosophers, believed the universe had existed forever. Something eternal is more perfect than something created. He suggested the reason we see progress was that floods, or other natural disasters, had repeatedly set civilization back to the beginning. The motivation for believing in an eternal universe was the desire to avoid invoking divine intervention to create the universe and set it going. Conversely, those who believed the universe had a beginning, used it as an argument for the existence of God as the first cause, or prime mover, of the universe.

If one believed that the universe had a beginning, the obvious question was what happened before the beginning? What was God doing before He made the world? Was He preparing Hell for people who asked such questions? The problem of whether or not the universe had a beginning was a great concern to the German philosopher Immanuel Kant. He felt there were logical contradictions, or antinomies, either way. If the universe had a beginning, why did it wait an infinite time before it began? He called that the thesis. On the other hand, if the universe had existed for ever, why did it take an infinite time to reach the present stage? He called that the antithesis. Both the thesis and the antithesis depended on Kant’s assumption, shared by almost everyone else, that time was Absolute. That is to say, it went from the infinite past to the infinite future, independently of any universe that might or might not exist in this background. This is still the picture in the mind of many scientists today.

However in 1915, Einstein introduced his revolutionary General Theory of Relativity. In this, space and time were no longer Absolute, no longer a fixed background to events. Instead, they were dynamical quantities that were shaped by the matter and energy in the universe. They were defined only within the universe, so it made no sense to talk of a time before the universe began. It would be like asking for a point south of the South Pole. It is not defined. If the universe was essentially unchanging in time, as was generally assumed before the 1920s, there would be no reason that time should not be defined arbitrarily far back. Any so-called beginning of the universe would be artificial, in the sense that one could extend the history back to earlier times. Thus it might be that the universe was created last year, but with all the memories and physical evidence, to look like it was much older. This raises deep philosophical questions about the meaning of existence. I shall deal with these by adopting what is called, the positivist approach. In this, the idea is that we interpret the input from our senses in terms of a model we make of the world. One can not ask whether the model represents reality, only whether it works. A model is a good model if first it interprets a wide range of observations, in terms of a simple and elegant model. And second, if the model makes definite predictions that can be tested and possibly falsified by observation.

In terms of the positivist approach, one can compare two models of the universe. One in which the universe was created last year and one in which the universe existed much longer. The Model in which the universe existed for longer than a year can explain things like identical twins that have a common cause more than a year ago. On the other hand, the model in which the universe was created last year cannot explain such events. So the first model is better. One can not ask whether the universe really existed before a year ago or just appeared to. In the positivist approach, they are the same. In an unchanging universe, there would be no natural starting point. The situation changed radically however, when Edwin Hubble began to make observations with the hundred inch telescope on Mount Wilson, in the 1920s.

Hubble found that stars are not uniformly distributed throughout space, but are gathered together in vast collections called galaxies. By measuring the light from galaxies, Hubble could determine their velocities. He was expecting that as many galaxies would be moving towards us as were moving away. This is what one would have in a universe that was unchanging with time. But to his surprise, Hubble found that nearly all the galaxies were moving away from us. Moreover, the further galaxies were from us, the faster they were moving away. The universe was not unchanging with time as everyone had thought previously. It was expanding. The distance between distant galaxies was increasing with time.

The expansion of the universe was one of the most important intellectual discoveries of the 20th century, or of any century. It transformed the debate about whether the universe had a beginning. If galaxies are moving apart now, they must have been closer together in the past. If their speed had been constant, they would all have been on top of one another about 15 billion years ago. Was this the beginning of the universe? Many scientists were still unhappy with the universe having a beginning because it seemed to imply that physics broke down. One would have to invoke an outside agency, which for convenience, one can call God, to determine how the universe began. They therefore advanced theories in which the universe was expanding at the present time, but didn’t have a beginning. One was the Steady State theory, proposed by Bondi, Gold, and Hoyle in 1948.

In the Steady State theory, as galaxies moved apart, the idea was that new galaxies would form from matter that was supposed to be continually being created throughout space. The universe would have existed for ever and would have looked the same at all times. This last property had the great virtue, from a positivist point of view, of being a definite prediction that could be tested by observation. The Cambridge radio astronomy group, under Martin Ryle, did a survey of weak radio sources in the early 1960s. These were distributed fairly uniformly across the sky, indicating that most of the sources lay outside our galaxy. The weaker sources would be further away, on average. The Steady State theory predicted the shape of the graph of the number of sources against source strength. But the observations showed more faint sources than predicted, indicating that the density of sources was higher in the past. This was contrary to the basic assumption of the Steady State theory, that everything was constant in time. For this, and other reasons, the Steady State theory was abandoned.

Another attempt to avoid the universe having a beginning was the suggestion that there was a previous contracting phase, but because of rotation and local irregularities, the matter would not all fall to the same point. Instead, different parts of the matter would miss each other, and the universe would expand again with the density remaining finite. Two Russians, Lifshitz and Khalatnikov, actually claimed to have proved that a general contraction without exact symmetry would always lead to a bounce with the density remaining finite. This result was very convenient for Marxist-Leninist dialectical materialism, because it avoided awkward questions about the creation of the universe. It therefore became an article of faith for Soviet scientists.

When Lifshitz and Khalatnikov published their claim, I was a 21 year old research student looking for something to complete my PhD thesis. I didn’t believe their so-called proof, and set out with Roger Penrose to develop new mathematical techniques to study the question. We showed that the universe couldn’t bounce. If Einstein’s General Theory of Relativity is correct, there will be a singularity, a point of infinite density and spacetime curvature, where time has a beginning. Observational evidence to confirm the idea that the universe had a very dense beginning came in October 1965, a few months after my first singularity result, with the discovery of a faint background of microwaves throughout space. These microwaves are the same as those in your microwave oven, but very much less powerful. They would heat your pizza only to minus 271 point 3 degrees centigrade, not much good for defrosting the pizza, let alone cooking it. You can actually observe these microwaves yourself. Set your television to an empty channel. A few percent of the snow you see on the screen will be caused by this background of microwaves. The only reasonable interpretation of the background is that it is radiation left over from an early very hot and dense state. As the universe expanded, the radiation would have cooled until it is just the faint remnant we observe today.

Although the singularity theorems of Penrose and myself predicted that the universe had a beginning, they didn’t say how it had begun. The equations of General Relativity would break down at the singularity. Thus Einstein’s theory cannot predict how the universe will begin, but only how it will evolve once it has begun. There are two attitudes one can take to the results of Penrose and myself. One is that God chose how the universe began for reasons we could not understand. This was the view of Pope John Paul. At a conference on cosmology in the Vatican, the Pope told the delegates that it was OK to study the universe after it began, but they should not inquire into the beginning itself, because that was the moment of creation, and the work of God. I was glad he didn’t realize I had presented a paper at the conference suggesting how the universe began. I didn’t fancy the thought of being handed over to the Inquisition, like Galileo.

The other interpretation of our results, which is favored by most scientists, is that it indicates that the General Theory of Relativity breaks down in the very strong gravitational fields in the early universe. It has to be replaced by a more complete theory. One would expect this anyway, because General Relativity does not take account of the small scale structure of matter, which is governed by quantum theory. This does not matter normally, because the scale of the universe is enormous compared to the microscopic scales of quantum theory. But when the universe is the Planck size, a billion trillion trillionth of a centimeter, the two scales are the same, and quantum theory has to be taken into account.

In order to understand the Origin of the universe, we need to combine the General Theory of Relativity with quantum theory. The best way of doing so seems to be to use Feynman’s idea of a sum over histories. Richard Feynman was a colorful character, who played the bongo drums in a strip joint in Pasadena, and was a brilliant physicist at the California Institute of Technology. He proposed that a system got from a state A, to a state B, by every possible path or history. Each path or history has a certain amplitude or intensity, and the probability of the system going from A to B is given by adding up the amplitudes for each path. There will be a history in which the moon is made of blue cheese, but the amplitude is low, which is bad news for mice.

The probability for a state of the universe at the present time is given by adding up the amplitudes for all the histories that end with that state. But how did the histories start? This is the Origin question in another guise. Does it require a Creator to decree how the universe began? Or is the initial state of the universe, determined by a law of science? In fact, this question would arise even if the histories of the universe went back to the infinite past. But it is more immediate if the universe began only 15 billion years ago. The problem of what happens at the beginning of time is a bit like the question of what happened at the edge of the world, when people thought the world was flat. Is the world a flat plate with the sea pouring over the edge? I have tested this experimentally. I have been round the world, and I have not fallen off. As we all know, the problem of what happens at the edge of the world was solved when people realized that the world was not a flat plate, but a curved surface. Time however, seemed to be different. It appeared to be separate from space, and to be like a model railway track. If it had a beginning, there would have to be someone to set the trains going. Einstein’s General Theory of Relativity unified time and space as spacetime, but time was still different from space and was like a corridor, which either had a beginning and end, or went on forever. However, when one combines General Relativity with Quantum Theory, Jim Hartle and I realized that time can behave like another direction in space under extreme conditions. This means one can get rid of the problem of time having a beginning, in a similar way in which we got rid of the edge of the world. Suppose the beginning of the universe was like the South Pole of the earth, with degrees of latitude playing the role of time. The universe would start as a point at the South Pole. As one moves north, the circles of constant latitude, representing the size of the universe, would expand. To ask what happened before the beginning of the universe would become a meaningless question, because there is nothing south of the South Pole.

Time, as measured in degrees of latitude, would have a beginning at the South Pole, but the South Pole is much like any other point, at least so I have been told. I have been to Antarctica, but not to the South Pole. The same laws of Nature hold at the South Pole as in other places. This would remove the age-old objection to the universe having a beginning; that it would be a place where the normal laws broke down. The beginning of the universe would be governed by the laws of science. The picture Jim Hartle and I developed of the spontaneous quantum creation of the universe would be a bit like the formation of bubbles of steam in boiling water.

The idea is that the most probable histories of the universe would be like the surfaces of the bubbles. Many small bubbles would appear, and then disappear again. These would correspond to mini universes that would expand but would collapse again while still of microscopic size. They are possible alternative universes but they are not of much interest since they do not last long enough to develop galaxies and stars, let alone intelligent life. A few of the little bubbles, however, grow to a certain size at which they are safe from recollapse. They will continue to expand at an ever increasing rate, and will form the bubbles we see. They will correspond to universes that would start off expanding at an ever increasing rate. This is called inflation, like the way prices go up every year.

The world record for inflation was in Germany after the First World War. Prices rose by a factor of ten million in a period of 18 months. But that was nothing compared to inflation in the early universe. The universe expanded by a factor of a million trillion trillion in a tiny fraction of a second. Unlike inflation in prices, inflation in the early universe was a very good thing. It produced a very large and uniform universe, just as we observe. However, it would not be completely uniform. In the sum over histories, histories that are very slightly irregular will have almost as high probabilities as the completely uniform and regular history. The theory therefore predicts that the early universe is likely to be slightly non-uniform. These irregularities would produce small variations in the intensity of the microwave background from different directions. The microwave background has been observed by the MAP satellite, and was found to have exactly the kind of variations predicted. So we know we are on the right lines.

The irregularities in the early universe will mean that some regions will have slightly higher density than others. The gravitational attraction of the extra density will slow the expansion of the region, and can eventually cause the region to collapse to form galaxies and stars. So look well at the map of the microwave sky. It is the blueprint for all the structure in the universe. We are the product of quantum fluctuations in the very early universe. God really does play dice.

Follow your curiosity to Nassim Taleb on the Notion of Alternative Histories.

In Pursuit of the Unknown: 17 Equations That Changed the World


Equations are the lifeblood of mathematics, science, and technology. Without them, our world would not exist in its present form. However, equations have a reputation for being scary: Stephen Hawking’s publishers told him that every equation would halve the sales of A Brief History of Time.

Ignoring the advice, Hawking included E = mc² even “when cutting it out would have sold another 10 million copies.” This captures our aversion to equations well. Yet mathematician Ian Stewart argues in his book In Pursuit of the Unknown: 17 Equations That Changed the World, “[e]quations are too important to be hidden away.”

Equations are a vital part of this world. And you don’t need to be a rocket scientist to appreciate them.

There are two kinds of equations in mathematics. Stewart writes:

One kind presents relations between various mathematical quantities: the task is to prove the equation is true. The other kind provides information about an unknown quantity, and the mathematician’s task is to solve it – to make the unknown known. The distinction is not clear-cut, because sometimes the same equation can be used in both ways, but it’s a useful guideline.


An example of the first kind of equation is Pythagoras’s theorem, which is “an equation expressed in the language of geometry.” If you accept Euclid’s basic geometric assumptions, then Pythagoras’s theorem must be true.

In the famous translation by Sir Thomas Heath, proposition 47 (Pythagoras’s theorem) of Book I reads:

In right-angled triangles the square on the side subtending the right angle is equal to the squares on the sides containing the right angle.

Many triangles in real life are not right-angled. But this does not limit the use of the equation a² + b² = c², because any triangle can be cut into two right-angled ones.

[Figure: a triangle cut by an altitude into two right-angled triangles]

So understanding right-angled triangles is the key because “they prove that there is a useful relation between the shape of a triangle and the lengths of its sides.”
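As a minimal sketch of that splitting idea, in Python, with made-up vertex coordinates rather than anything from the book: drop the altitude from the vertex opposite the longest side, and a² + b² = c² holds in each of the two right-angled pieces. The dist helper is itself just Pythagoras applied to coordinate differences, the “vital link between geometry and algebra” Stewart highlights.

```python
import math

# Arbitrary triangle vertices (illustrative values, not from the book).
A = (0.0, 0.0)
B = (7.0, 0.0)   # AB is the longest side, so the altitude from C lands inside it
C = (2.0, 3.0)

def dist(p, q):
    """Euclidean distance: Pythagoras applied to coordinate differences."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Foot of the altitude from C onto line AB (projection of AC onto AB).
t = ((C[0] - A[0]) * (B[0] - A[0]) + (C[1] - A[1]) * (B[1] - A[1])) / dist(A, B) ** 2
F = (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

# Each piece is right-angled at F, so a^2 + b^2 = c^2 holds in both.
print(math.isclose(dist(A, F)**2 + dist(C, F)**2, dist(A, C)**2))  # True
print(math.isclose(dist(B, F)**2 + dist(C, F)**2, dist(B, C)**2))  # True
```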

A good example of the second kind of equation is Newton’s law of gravity. “It tells us how the attractive force between two bodies depends on their masses,” Stewart writes, “and how far apart they are. Solving the resulting equations tells us how the planets orbit the Sun, or how to design a trajectory for a space probe.” This isn’t a mathematical theorem; rather, it holds for physical reasons: it fits the observations.
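Stewart doesn’t spell out the symbols at this point, but the law in question is the familiar F = Gm₁m₂/r². Here is a minimal sketch of what “solving” it can look like, using rounded textbook values for the Earth and Moon (my illustrative figures, not the book’s):

```python
import math

G = 6.674e-11          # gravitational constant, N·m²/kg²
m_earth = 5.97e24      # mass of the Earth, kg (rounded textbook value)
m_moon = 7.35e22       # mass of the Moon, kg
r = 3.84e8             # mean Earth–Moon distance, m

# Newton's law of gravity: the attractive force between the two bodies.
force = G * m_earth * m_moon / r**2          # roughly 2 x 10^20 newtons

# Rearranging the same relation for a circular orbit gives the Moon's period.
period_days = 2 * math.pi * math.sqrt(r**3 / (G * m_earth)) / 86400
print(f"{force:.2e} N, orbital period = {period_days:.1f} days")  # about 27 days
```

The second line of output reproduces the Moon’s roughly 27-day month, which is the sense in which solving the equation “tells us how the planets orbit the Sun.”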

Einstein’s general theory of relativity improves on Newton by fitting some observations better, while not messing up those where we already know Newton’s law does a good job.

Equations, as simple as they appear, have redirected human history time and time again. “An equation derives its power from a simple source,” Stewart writes, “it tells us that two calculations, which appear different, have the same answer.”

The power of equations lies in the philosophically difficult correspondence between mathematics, a collective creation of human minds, and an external physical reality. Equations model deep patterns in the outside world. By learning to value equations, and to read the stories they tell, we can uncover vital features of the world around us. This is the story of the ascent of humanity, told in 17 equations.

Pythagoras’s theorem

a² + b² = c²

The first equation presented in the book.

What does it tell us?
How the three sides of a right-angled triangle are related.
Why is that important?
It provides a vital link between geometry and algebra, allowing us to calculate distances in terms of coordinates. It also inspired trigonometry.
What did it lead to?
Surveying, navigation, and more recently special and general relativity – the best current theories of space, time, and gravity.

History

The Greeks did not express Pythagoras’s theorem as an equation in the modern symbolic sense. That came later with the development of algebra. In ancient times, the theorem was expressed verbally and geometrically. It attained its most polished form, and its first recorded proof, in the writings of Euclid of Alexandria. Around 250 BC Euclid became the first modern mathematician when he wrote his famous Elements, the most influential mathematical textbook ever. Euclid turned geometry into logic by making his basic assumptions explicit and invoking them to give systematic proofs for all of his theorems. He built a conceptual tower whose foundations were points, lines, and circles, and whose pinnacle was the existence of precisely five regular solids.

For the purposes of higher mathematics, the Greeks worked with lines and areas instead of numbers. So Pythagoras and his Greek successors would decode the theorem as an equality of areas: ‘The area of a square constructed using the longest side of a right-angled triangle is the sum of the areas of the squares formed from the other two sides.’

Maps

Surveying began to take off in 1533 when the Dutch mapmaker Gemma Frisius explained how to use trigonometry to produce accurate maps, in Libellus de Locorum Describendorum Ratione (‘Booklet Concerning a Way of Describing Places’). Word of the method spread across Europe, reaching the ears of the Danish nobleman and astronomer Tycho Brahe. In 1579 Tycho used it to make an accurate map of Hven, the island where his observatory was located. By 1615 the Dutch mathematician Willebrord Snellius (Snel van Royen) had developed the method into essentially its modern form: triangulation. The area being surveyed is covered with a network of triangles. By measuring one initial length very carefully, and many angles, the locations of the corners of the triangles, and hence any interesting features within them, can be calculated. Snellius worked out the distance between two Dutch towns, Alkmaar and Bergen op Zoom, using a network of 33 triangles. He chose these towns because they lay on the same line of longitude and were exactly one degree of arc apart. Knowing the distance between them, he could work out the size of the Earth, which he published in his Eratosthenes Batavus (‘The Dutch Eratosthenes’) in 1617. His result was accurate to within 4%. He also modified the equations of trigonometry to reflect the spherical nature of the Earth’s surface, an important step towards effective navigation.

Triangulation is an indirect way of determining distance by employing angles.

When surveying a stretch of land, be it a building site or a country, the main practical consideration is that it is much easier to measure angles than it is to measure distances. Triangulation lets us measure a few distances and lots of angles; then everything else follows from the trigonometric equations. The method begins by setting out one line between two points, called the baseline, and measuring its length directly to very high accuracy. Then we choose a prominent point in the landscape that is visible from both ends of the baseline, and measure the angle from the baseline to that point, at both ends of the baseline. Now we have a triangle, and we know one side of it and two angles, which fix its shape and size. We can then use trigonometry to work out the other two sides.

In effect, we now have two more baselines: the newly calculated sides of the triangle. From those, we can measure angles to other, more distant points. Continue this process to create a network of triangles that covers the area being surveyed. Within each triangle, observe the angles to all noteworthy features – church towers, crossroads, and so on. The same trigonometric trick pinpoints their precise locations. As a final twist, the accuracy of the entire survey can be checked by measuring one of the final sides directly.
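As a minimal sketch of that first step (the baseline length and the two angles below are invented survey readings, and I’m assuming the standard law-of-sines route, which the passage leaves implicit): one measured side plus its two adjacent angles fix the triangle, so the distances to the landmark, and hence its position, follow.

```python
import math

# Invented survey readings: a baseline and the angles to a landmark P,
# measured between the baseline and the sight line at each of its ends.
baseline = 100.0                 # metres, measured directly
angle_a = math.radians(52.0)     # at end A of the baseline
angle_b = math.radians(61.0)     # at end B of the baseline

# The three angles of a triangle sum to 180 degrees, which gives the angle at P.
angle_p = math.pi - angle_a - angle_b

# Law of sines: each side is proportional to the sine of the opposite angle.
dist_a_to_p = baseline * math.sin(angle_b) / math.sin(angle_p)
dist_b_to_p = baseline * math.sin(angle_a) / math.sin(angle_p)

# With A at the origin and B at (baseline, 0), the landmark's coordinates follow.
p = (dist_a_to_p * math.cos(angle_a), dist_a_to_p * math.sin(angle_a))

print(round(dist_a_to_p, 1), round(dist_b_to_p, 1), tuple(round(c, 1) for c in p))
```

Each side computed this way can then serve as a fresh baseline, which is exactly how the network of triangles described above grows across the whole survey.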

Surveys routinely employed triangulation by the late 18th century. And while we don’t explicitly use it today, it is still there in how we deduce locations from satellite data.

In Pursuit of the Unknown: 17 Equations That Changed the World is an elegant argument for why equations matter.

