One big challenge we all face in life is knowing when to explore new opportunities, and when to double down on existing ones. Explore vs exploit algorithms – and poetry – teach us that it’s vital to consider how much time we have, how we can best avoid regrets, and what we can learn from failures.
“Had we but world enough, and time,
This coyness, Lady, were no crime.
We would sit down and think which way
To walk and pass our long love’s day . . .
Let us roll all our strength and all
Our sweetness up into one ball,
And tear our pleasures with rough strife
Through the iron gates of life:
Thus, though we cannot make our sun
Stand still, yet we will make him run.”
—Andrew Marvell, To His Coy Mistress
Of all the questions life demands we answer, “To explore or to exploit?” is one we have to confront almost every day. Do we keep trying new restaurants? Do we keep learning new ideas? Do we keep making new friends? Or do we enjoy what we’ve come to find and love?
There is no doubt that humans are great at exploring, as most generalist species are. Not content to stay in that cave, hunt that animal, or keep doing it the way our grandmother taught us, humans owe at least part of our success to our willingness to explore.
But when is what you’ve already explored enough? When can you finally settle down to enjoy the fruits of your exploration? When can you be content to exploit the knowledge you already have?
Turns out that there are algorithms for that.
In Algorithms to Live By, authors Brian Christian and Tom Griffiths devote an entire chapter to how computer algorithms deal with the explore/exploit conundrum and how you can apply those lessons to the same tension in your life.
How much time do you have?
One of the most important factors in determining whether to continue exploring or to exploit what you’ve got is time. Christian and Griffiths explain that “seizing a day and seizing a lifetime are two entirely different endeavors. . . . When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.”
Time intervals can be a construct of your immediate circumstances, like the boundaries provided by a two-week vacation. For a lot of us, the last night in a lovely foreign place will see us eating at the best restaurant we have found so far. Time intervals can also be considered over the arc of your life in general. Children are consummate explorers, but as we grow up, the choice to exploit becomes more of a daily decision. How would your choices today be impacted if you knew you were going to live another five years? Twenty years? Forty years? Christian and Griffiths advise, “Explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in.”
“I have known days like that, of warm winds drowsing in the heat
of noon and all of summer spinning slowly on its reel,
days briefly lived, that leave long music in the mind
more sweet than truth: I play them and rewind.”
—Russell Hoban, Summer Recorded
Sometimes we are too quick to stop exploring. We have these amazing days and magical experiences, and we want to keep repeating them forever. However, changes in ourselves and the world around us are inevitable, and so committing to a path of exploitation too early leaves us unable to adapt. As much as it can be hard to walk away from that perfect day, Christian and Griffiths explain that “exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.”
“Like as the waves make towards the pebbled shore,
So do our minutes hasten to their end;
Each changing place with that which goes before,
In sequent toil all forwards do contend.”
—William Shakespeare, Sonnet 60
There is no doubt that for many of us time is our most precious resource. We never seem to have enough, and we want to maximize the value we get from how we choose to use it. So when deciding whether to enjoy what you have or search for something better, factoring time into your decision can help point the way.
Minimizing the pain of regret
The threat of regret looms over many explore/exploit considerations. We can regret both not searching for something better and not taking the time to enjoy what we already have. The problem with regret is that we don’t feel it until after a poor decision has been made. Sometimes, second-order thinking can be used as a preventative tool. But often regret only surfaces when you look back over a decision. Christian and Griffiths define regret as “the result of comparing what we actually did with what would have been best in hindsight.”
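To make that definition concrete, here is a minimal Python sketch. The function name `hindsight_regret`, the payoff table, and the restaurant scenario are illustrative assumptions, not anything taken from the book. It simply compares what a sequence of choices actually earned with what the single best option would have earned in hindsight.

```python
def hindsight_regret(payoffs, choices):
    """Regret = best fixed option's total in hindsight - what we actually earned.

    payoffs[t][o] is the (hypothetical) enjoyment option o would have given on night t.
    choices[t] is the option we actually picked on night t.
    """
    actual = sum(payoffs[t][choice] for t, choice in enumerate(choices))
    n_options = len(payoffs[0])
    best_in_hindsight = max(
        sum(night[o] for night in payoffs) for o in range(n_options)
    )
    return best_in_hindsight - actual


# Three nights, two restaurants (made-up scores out of 10).
payoffs = [[3, 5], [4, 6], [2, 7]]
choices = [0, 0, 1]  # stuck with restaurant 0 twice, then tried restaurant 1
print(hindsight_regret(payoffs, choices))  # 18 - 14 = 4
```

Note that the calculation needs the payoffs of the options we did not pick, which is exactly why regret is only visible looking back.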
“Does the road wind uphill all the way?
Yes, to the very end.
Will the day’s journey take the whole long day?
From morn to night, my friend.
Shall I find comfort, travel-sore and weak?
Of labour you shall find the sum.
Will there be beds for me and all who seek?
Yea, beds for all who come.”
—Christina Rossetti, Up-Hill
If we want to minimize regret, especially in exploration, we can try to learn from those who have come before. As we choose to wander forth into new territory, however, it’s natural to wonder if we’ll regret our decision to try something new. According to Christian and Griffiths, the mathematics that underlie explore/exploit algorithms show that “you should assume the best about [new people and new things], in the absence of evidence to the contrary. In the long run, optimism is the best prevention for regret.” Why? Because by being optimistic about the possibilities that are out there, you’ll explore enough that the one thing you won’t regret is missed opportunity.
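One well-known family of algorithms built on this kind of optimism is the Upper Confidence Bound approach. What follows is a rough sketch of a UCB1-style rule in Python, not the authors' own code: the function names, the `pull` callback, and the payoff numbers are all assumptions made for illustration. Each option is scored by its average payoff so far plus a bonus that is larger the less we have tried it, so unfamiliar options get the benefit of the doubt.

```python
import math


def ucb1_choose(counts, totals):
    """Pick the option with the highest optimistic estimate (UCB1-style rule)."""
    # Anything untried is assumed to be worth trying first.
    for option, n in enumerate(counts):
        if n == 0:
            return option
    t = sum(counts)  # total number of tries so far

    def optimistic_value(option):
        mean = totals[option] / counts[option]
        # The bonus shrinks as an option is tried more often.
        bonus = math.sqrt(2 * math.log(t) / counts[option])
        return mean + bonus

    return max(range(len(counts)), key=optimistic_value)


def run_ucb1(pull, n_options, rounds):
    """Repeatedly choose an option, observe its payoff, and update the tallies.

    `pull` is a hypothetical callback returning the payoff of the chosen option.
    Returns how many times each option was chosen.
    """
    counts = [0] * n_options
    totals = [0.0] * n_options
    for _ in range(rounds):
        option = ucb1_choose(counts, totals)
        counts[option] += 1
        totals[option] += pull(option)
    return counts


# Made-up example: option 1 pays off better than option 0 every time.
average_payoff = [0.2, 0.8]
print(run_ucb1(lambda option: average_payoff[option], n_options=2, rounds=500))
```

In this toy run the weaker option still gets revisited now and then, because its bonus slowly grows while it is ignored, but most of the choices go to the better one.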
(This is similar to one of the most effective strategies in game theory: tit for tat. Start out by being nice, then reciprocate whatever behavior you receive. It often works better paired with the occasional bout of forgiveness.)
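As a small illustration of that strategy, here is a hedged Python sketch. The move labels, the `forgiveness` parameter, and the function name are assumptions chosen for this example; forgiveness is modeled simply as a probability of cooperating even after the other side has defected.

```python
import random


def tit_for_tat(opponent_moves, forgiveness=0.0, rng=random):
    """Start nice, then mirror the opponent's last move.

    forgiveness is the (assumed) chance of cooperating after a defection.
    """
    if not opponent_moves:
        return "cooperate"  # open by being nice
    if opponent_moves[-1] == "defect" and rng.random() >= forgiveness:
        return "defect"  # reciprocate the defection
    return "cooperate"  # reciprocate cooperation, or forgive


print(tit_for_tat([]))                        # cooperate
print(tit_for_tat(["cooperate", "defect"]))   # defect
```

With `forgiveness` left at zero this is plain tit for tat; raising it slightly lets two players escape an endless cycle of retaliation.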
“Tell me, tell me, smiling child,
What the past is like to thee?
‘An Autumn evening soft and mild
With a wind that sighs mournfully.’
Tell me, what is the present hour?
‘A green and flowery spray
Where a young bird sits gathering its power
To mount and fly away.’
And what is the future, happy one?
‘A sea beneath a cloudless sun;
A mighty, glorious, dazzling sea
Stretching into infinity.’”
—Emily Bronte, Past, Present, Future
The accumulation of knowledge
Christian and Griffiths write that “it’s rare that we make an isolated decision, where the outcome doesn’t provide us with any information that we’ll use to make other decisions in the future.” Not all of our explorations are going to lead us to something better, but many of them will. Not all of our exploitations are going to be satisfying, but with enough exploration behind us, many of them will. Failures are, after all, just information we can use to make better explore or exploit decisions in the future.
“You know—at least you ought to know,
For I have often told you so—
That children are never allowed
To leave their nurses in a crowd.
Now this was Jim’s especial foible,
He ran away when he was able,
And on this inauspicious day
He slipped his hand and ran away!
He hadn’t gone a yard when—Bang!
With open jaws, a lion sprang,
And hungrily began to eat
The boy: beginning at his feet.”
—Hilaire Belloc, Jim Who Ran Away from His Nurse, and Was Eaten by a Lion
Most importantly, we shouldn’t let our early exploration mishaps prevent us from continuing to push our boundaries as we grow up. Exploration is necessary in order to exploit and enjoy the knowledge hard won along the way.