Ended: Feb. 1, 2020
Simons once quoted Benjamin, the donkey in Animal Farm, to explain his attitude: “‘God gave me a tail to keep off the flies. But I’d rather have had no tail and no flies.’ That’s kind of the way I feel about publicity.”
“The lesson was: Do what you like in life, not what you feel you ‘should’ do,” Simons says. “It’s something I never forgot.”
Markov chains are sequences of events in which the probability of what happens next depends only on the current state, not past events. In a Markov chain, it is impossible to predict future steps with certainty, yet one can observe the chain to make educated guesses about possible outcomes. Baseball can be seen as a Markov game. If a batter has three balls and two strikes, the order in which they came and the number of fouls in between don’t matter. If the next pitch is a strike, the batter is out. A hidden Markov process is one in which the chain of events is governed by unknown, underlying parameters or variables. One sees the results of the chain but not the “states” that help explain the progression of the chain. Those not acquainted with baseball might throw their hands up when receiving updates of the number of runs scored each inning: one run in this inning, six in another, with no obvious pattern or explanation. Some investors liken financial markets, speech recognition patterns, and other complex chains of events to hidden Markov models. The Baum-Welch algorithm provided a way to estimate probabilities and parameters within these complex sequences with little more information than the output of the processes. For the baseball game, the Baum-Welch algorithm might enable even someone with no understanding of the sport to guess the game situations that produced the scores. If there was a sudden jump from two runs to five runs, for example, Baum-Welch might suggest the probability that a three-run home run had just been hit rather than a bases-loaded triple. The algorithm would allow someone to infer a sense of the sport’s rules from the distribution of scores, even as the full rules remained hidden. “The Baum-Welch algorithm gets you closer to the final answer by giving you better probabilities,” Welch explains. Baum usually minimized the importance of his accomplishment.
Today, though, Baum’s algorithm, which allows a computer to teach itself states and probabilities, is seen as one of the twentieth century’s notable advances in machine learning, paving the way for breakthroughs affecting the lives of millions in fields from genomics to weather prediction. Baum-Welch enabled the first effective speech recognition system and even Google’s search engine.
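To make the idea of learning hidden states from outputs alone more concrete, here is a minimal sketch of Baum-Welch re-estimation for a tiny hidden Markov model, written in plain Python. Everything specific in it is an assumption made for illustration: the two hidden states (think of a "quiet" and a "rally" inning), the three observation buckets for runs per inning, the made-up observation sequence, and the starting guesses. It is not drawn from the book or from any production system; it only shows the general shape of the technique.

```python
import math

def forward_backward(obs, start, trans, emit):
    """Scaled forward and backward passes over one observation sequence."""
    n, T = len(start), len(obs)
    alpha, beta, scale = [], [None] * T, []
    prev = [start[i] * emit[i][obs[0]] for i in range(n)]
    for t in range(T):
        if t > 0:
            prev = [emit[j][obs[t]] * sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
                    for j in range(n)]
        c = sum(prev)  # probability of obs[t] given the earlier observations
        scale.append(c)
        alpha.append([a / c for a in prev])
    beta[T - 1] = [1.0] * n
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scale[t + 1] for i in range(n)]
    return alpha, beta, scale, sum(math.log(c) for c in scale)

def baum_welch_step(obs, start, trans, emit):
    """One re-estimation step; also returns the log-likelihood of the inputs."""
    n, m, T = len(start), len(emit[0]), len(obs)
    alpha, beta, scale, loglik = forward_backward(obs, start, trans, emit)
    # gamma[t][i]: probability of being in hidden state i at time t
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_trans = [[0.0] * n for _ in range(n)]
    new_emit = [[0.0] * m for _ in range(n)]
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # expected number of i -> j transitions
            moves = sum(alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                        * beta[t + 1][j] / scale[t + 1] for t in range(T - 1))
            new_trans[i][j] = moves / visits
        total = sum(gamma[t][i] for t in range(T))
        for k in range(m):
            new_emit[i][k] = sum(gamma[t][i] for t in range(T) if obs[t] == k) / total
    return gamma[0][:], new_trans, new_emit, loglik

def fit(obs, start, trans, emit, n_iter=25):
    """Repeat the re-estimation step and record the log-likelihood each time."""
    history = []
    for _ in range(n_iter):
        start, trans, emit, loglik = baum_welch_step(obs, start, trans, emit)
        history.append(loglik)
    return start, trans, emit, history

# Hypothetical data: runs per inning bucketed as 0 = none, 1 = one, 2 = two or more.
obs = [0, 0, 1, 0, 2, 2, 1, 2, 0, 0, 0, 1, 0, 2, 2, 2, 0, 0, 1, 0]
# Hypothetical starting guesses for two hidden states.
start0 = [0.6, 0.4]
trans0 = [[0.7, 0.3], [0.4, 0.6]]
emit0 = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
start, trans, emit, history = fit(obs, start0, trans0, emit0)
```

Each pass can only raise (or leave unchanged) the likelihood of the observed sequence, which is the sense in which the algorithm "gets you closer to the final answer by giving you better probabilities." The sketch handles a single short sequence and does no smoothing, so it should be read as an illustration rather than something to use on real data.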
Throughout the 1980s, applied mathematicians and ex-physicists were recruited to work on Wall Street and in the City of London. They usually were tasked with building models to place values on complicated derivatives and mortgage products, analyze risk, and hedge, or protect, investment positions, activities that became known as forms of financial engineering.
Edward Thorp became the first modern mathematician to use quantitative strategies to invest sizable sums of money. Thorp was an academic who had worked with Claude Shannon, the father of information theory, and embraced the proportional betting system of John Kelly, the Texas scientist who had influenced Elwyn Berlekamp. First, Thorp applied his talents to gambling, gaining prominence for his large winnings as well as his bestselling book, Beat the Dealer. The book outlined Thorp’s belief in systematic, rules-based gambling tactics, as well as his insight that players can take advantage of shifting odds within games of chance. In 1964, Thorp turned his attention to Wall Street, the biggest casino of them all. After reading books on technical analysis (as well as Benjamin Graham and David Dodd’s landmark tome, Security Analysis, which laid the foundations for fundamental investing), Thorp was “surprised and encouraged by how little was known by so many,” he writes in his autobiography, A Man for All Markets.
Thorp’s trading formula was influenced by the doctoral thesis of French mathematician Louis Bachelier, who, in 1900, developed a theory for pricing options on the Paris stock exchange using equations similar to those later employed by Albert Einstein to describe the Brownian motion of pollen particles. Bachelier’s thesis, describing the irregular motion of stock prices, had been overlooked for decades, but Thorp and others understood its relevance to modern investing.
“No one ever made a decision because of a number. They need a story.” - Daniel Kahneman, economist
Early one evening, his eyes blurry from staring at his computer screen for hours on end, Magerman spotted something odd: A line of simulation code used for Brown and Mercer’s trading system showed the Standard & Poor’s 500 at an unusually low level. This test code appeared to use a figure from back in 1991 that was roughly half the current number. Mercer had written it as a static figure, rather than as a variable that updated with each move in the market. When Magerman fixed the bug and updated the number, a second problem—an algebraic error—appeared elsewhere in the code. Magerman spent most of the night on it but he thought he solved that one, too. Now the simulator’s algorithms could finally recommend an ideal portfolio for the Nova system to execute, including how much borrowed money should be employed to expand its stock holdings.
(A classic industry joke: Extroverted mathematicians are the ones who stare at your shoes during a conversation, not their own.)
By 1997, Medallion’s staffers had settled on a three-step process to discover statistically significant moneymaking strategies, or what they called their trading signals. Identify anomalous patterns in historic pricing data; make sure the anomalies were statistically significant, consistent over time, and nonrandom; and see if the identified pricing behavior could be explained in a reasonable way.
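The first two steps of that process can be sketched in code; the third, asking whether the behavior has a reasonable explanation, is a matter of judgment and is not modeled here. The sketch below is only an illustration under stated assumptions. The candidate pattern ("the return on the day after a down day"), the synthetic mean-reverting return series, the shuffle-based significance test, and the four-way split used to check consistency are all hypothetical choices made for this example, not a description of how Medallion actually tested its signals.

```python
import random

def mean(values):
    return sum(values) / len(values)

def signal_returns(returns):
    """Hypothetical candidate pattern: the return on the day after a down day."""
    return [returns[t] for t in range(1, len(returns)) if returns[t - 1] < 0]

def permutation_p_value(returns, n_shuffles=1000, seed=0):
    """Share of shuffled histories whose signal mean matches or beats the real one.

    Shuffling destroys any link between one day and the next, so a small
    value suggests the pattern is unlikely to be a product of chance alone.
    """
    rng = random.Random(seed)
    observed = mean(signal_returns(returns))
    shuffled = list(returns)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        picked = signal_returns(shuffled)
        if picked and mean(picked) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)

def consistent_over_time(returns, n_chunks=4):
    """True if the signal's mean return has the same sign in every sub-period."""
    size = len(returns) // n_chunks
    means = []
    for k in range(n_chunks):
        picked = signal_returns(returns[k * size:(k + 1) * size])
        if not picked:
            return False
        means.append(mean(picked))
    return all(m > 0 for m in means) or all(m < 0 for m in means)

def make_mean_reverting_returns(n=1000, phi=-0.5, seed=1):
    """Synthetic data (an assumption): each return partly reverses the last one."""
    rng = random.Random(seed)
    returns, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        returns.append(prev)
    return returns

returns = make_mean_reverting_returns()
p_value = permutation_p_value(returns)   # step 2: statistically significant, nonrandom?
stable = consistent_over_time(returns)   # step 2: consistent over time?
```

Because the synthetic series is built to mean-revert, the pattern should pass both checks here. On real price data the same checks would need far more care, for example accounting for trading costs and for the number of candidate patterns tried.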