Chess, Proteins, and the Fantastical Promise of Statistical Learning
A (not-so) deep dive into the ideas surrounding this year's Breakthrough Prize in the Life Sciences
Disclaimer: This blog represents my own views and opinions and not those of my employer. Thanks for reading!
A couple of weeks ago, I found myself at Chess Forum, a beautifully quaint chess parlour in Greenwich Village. The place looks like it's been taken straight out of Harry Potter's Diagon Alley, frozen in some bygone era while the rest of downtown Manhattan has passed it by. People inside speak in hushed whispers. The air is thick with a quiet reverence. It's a place where I could easily imagine myself spending an afternoon just mesmerized by my surroundings, but not today. Today, I was sweating bullets.
I was attempting to analyze the board in front of me, trying to think just two or three moves ahead. “If I play X, then she’ll play Y, then I’ll play Z, then she’ll play… wait, what did I play in the first place?” I was getting flustered, unable to handle the sheer combinatorial complexity of the position.
I was losing. Drake’s new album hadn’t dropped yet. Times were tough.
This year's Breakthrough Prize in the Life Sciences, one of the most coveted awards in science alongside the Nobel Prize, was awarded to Dr. John Jumper and Dr. Demis Hassabis of Google DeepMind for their AlphaFold 2 algorithm. When the announcement was made a few months ago, my friends and I weren't hugely surprised. The "AlphaFold revolution" has taken the world of structural biology by storm since the algorithm's publication in the journal Nature in July 2021. Solving the 3D structure of a protein based on its sequence - the protein folding problem - is one of biology's grand challenges, an open question that scientists have worked on for decades. In the 14th biennial Critical Assessment of Structure Prediction (CASP14) competition, AlphaFold crushed all previous benchmarks for protein structure prediction performance, marking the first time that the accuracy of state-of-the-art experimental methods had been achieved in silico.
As stated most candidly in a commentary piece by the University of Oxford’s Protein Informatics Group soon after CASP14,
“[DeepMind’s] results were so astounding, and the problem so central to biology, that it took the entire world by surprise and left an entire discipline … wondering what had just happened.”
AlphaFold has arguably heralded a new era of biology, one unfettered by lack of structural information. In the short while that the algorithm has been available for use, it has already resulted in massive leaps in our understanding of various biochemical pathways, the development of new therapeutic modalities, and faster drug design.
But, let’s pause for a minute. These are only the results. AlphaFold’s story doesn’t start there. Our story begins with the game of chess.
Chess is a funny thing. The number of possible move sequences in a single game has a conservative lower bound, known as the Shannon number, on the order of 10^120. This is a mind-bogglingly large number, around 40 orders of magnitude greater than the number of atoms in the observable universe.
Say you want to make your own chess engine. Given infinite computational power, a reasonable idea would be to tell a computer to play out every single combination of possible moves to the end of the game, and then suggest the move that results in the greatest number of won games. This brute force strategy seems sound in principle, but keeping in mind the Shannon number above, we quickly realize that this would be impossible to implement in reality.
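To make the brute-force idea concrete, here's a toy sketch in Python that applies it to tic-tac-toe, a game small enough that the exhaustive strategy actually works. (This is just an illustration I put together, not anything resembling a real chess engine.)

```python
# Toy brute-force "engine" for tic-tac-toe: play out every continuation and
# suggest the move whose subtree contains the most won games. The board is a
# 9-character string ("X", "O", or " " per square).

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'X' or 'O' for a win, 'draw' for a full board, None if ongoing."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None

def count_wins(board, me, to_move):
    """Exhaustively play out every continuation; count terminal wins for `me`."""
    result = winner(board)
    if result is not None:
        return 1 if result == me else 0
    other = "O" if to_move == "X" else "X"
    return sum(count_wins(board[:i] + to_move + board[i+1:], me, other)
               for i, cell in enumerate(board) if cell == " ")

def best_move(board, me):
    """Suggest the empty square whose subtree holds the most wins for `me`."""
    other = "O" if me == "X" else "X"
    empties = [i for i, cell in enumerate(board) if cell == " "]
    return max(empties, key=lambda i: count_wins(board[:i] + me + board[i+1:], me, other))

print(best_move(" " * 9, "X"))  # the opening square with the most winning continuations
```

Tic-tac-toe has only a few hundred thousand possible games, so this runs in a blink. Chess, at roughly 10^120, does not.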
So, how do humans play chess? The best players have intuition built off of patterns that they have trained to recognize over many thousands of games. They learn to identify imbalances in positions and make thematic moves to exploit them. Even though they may not have searched through the entirety of the game-tree before deciding on a move, they have enough experience to come up with a “locality” of possible moves, on which they can then conduct a more computationally efficient “local optimization” by analyzing this pared down game-tree.
This seems like the best way to play a game when you can't deal with massive combinatorics, when the space of possible candidates is too vast to conduct a truly complete search. You'd like to 1) find a reasonable guess for a local area to search in (in our chess example, a smaller list of candidate moves), and then 2) perform a local search to find the optimum amongst these candidate moves. The strategies the best players use - pattern recognition, positional imbalances, thematic moves - feel so utterly vague and abstract that they seem innately human. How could a computer possibly grasp the intuition behind chess?
Enter DeepMind. Long before they made their foray into the world of protein folding, they developed AlphaZero, a chess engine founded on a neural network working in tandem with a Monte Carlo tree search (MCTS) algorithm. The idea was to develop a statistical learning framework that started from no chess knowledge except for the basic rules of the game (ground zero, hence the name). The neural network learns to suggest candidate moves by developing a rough estimate of how "strong" the position looks for a side, then the MCTS refines these candidate moves by searching for the optimum amongst them. The refined search results get fed back into the network, which in turn learns to hand better suggestions to the MCTS, over many iterations. Over the course of its training, AlphaZero played millions of games against itself, quickly becoming the strongest chess engine in the world at the time of its release. But was it learning human intuition? Garry Kasparov, former world chess champion and commonly regarded as one of the greatest (humans) to ever play the game, mused that AlphaZero "seems to think in terms of strategy rather than tactics … like a human with uncanny vision." Just last week, the DeepMind team published a study in PNAS presenting quantitative evidence that AlphaZero is indeed learning chess strategy over its training cycles, including recreating some of the most brilliant ideas developed by human players over the centuries, then discarding them for entirely novel and superior strategies that we can't quite comprehend yet. [This is one of my favorite papers I've read all year, and honestly probably deserves a cheeky blog post of its own, but anyone interested can give it a read here.]
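To make that feedback loop a little more concrete, here's a rough, hypothetical sketch of the shape of AlphaZero-style self-play training. None of these names come from DeepMind's code - `PolicyValueNet`, `mcts_search`, and friends are placeholders I've invented to show the structure - but the loop mirrors the description above: the network proposes, the search refines, and the refined results become the next round of training data.

```python
# Schematic (not runnable as-is) sketch of AlphaZero-style self-play training.
# All names here are hypothetical placeholders for illustration.

net = PolicyValueNet()  # maps a position -> (move probabilities, value estimate)

for iteration in range(num_iterations):
    games = []
    for _ in range(games_per_iteration):
        position, history = starting_position(), []
        while not game_over(position):
            # MCTS uses the network's suggestions to guide a selective look-ahead,
            # returning sharper move probabilities than the raw network output
            search_probs = mcts_search(position, net, num_simulations=800)
            history.append((position, search_probs))
            position = play_move(position, sample(search_probs))
        games.append((history, final_result(position)))

    # train the network to imitate the search's move probabilities and to
    # predict the eventual game outcome - closing the feedback loop
    train(net, games)
```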
DeepMind didn't stop there. They repurposed their artificial intelligence systems to tackle ever more difficult games. But the question remains: how does an AI group that made its name solving games like chess, Go, Starcraft, and Quake III Arena just stroll into the field of computational biophysics and obliterate the competition on one of the most difficult problems in science, seemingly out of nowhere?
Anyone who has endured my rambling about my research knows that problems like this keep me up at night. Modeling stochastic variability is fascinating because randomness makes up the very essence of the world around us. It’s fun to think about our lives as aggregations of likelihoods rather than sequences of absolutes. But these recent advances take studying probabilities to a whole new level. When we can make groundbreaking developments in one field, like game theory, and then translate them to another, like protein folding, this, to me, points to some deeper understanding, some more fundamental truth that we’ve learned about the world that can cross disciplines. More on this thought later.
Let’s get back to AlphaFold and the Breakthrough Prize.
Without getting too technical, there are a couple of big ideas I want to introduce very briefly that will frame our subsequent discussion on proteins and their folding.
Think of proteins as tiny molecular machines that perform essentially every important biological task on planet Earth. They are made up of chains of substituent chemical building blocks called amino acids. Just as our alphabet is used to build words, amino acid building blocks can be strung together in different orders and lengths to generate an astoundingly diverse array of proteins.
A protein's function is determined by its structure. The process of contortion that a protein undergoes as it transforms from a linear chain of amino acids into a structured mass is known as 'folding'. This process occurs (mostly) spontaneously, driven by the laws of thermodynamics. Generally, proteins fold to minimize their free energy - high-energy conformations are unstable and are not favored. Here, as with chess, we encounter the problem of combinatorics. Enter Levinthal's paradox: it would take longer than the age of the universe for a protein to sample all of its possible conformations before settling into its lowest-energy structure. Obviously, this isn't what's happening in our bodies. Well-defined energetic pathways are necessary for proteins to fold on reasonable timescales; essentially, the laws of physics enable proteins to quickly find a reasonably stable structure, then refine it to find the global energetic minimum.
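The usual back-of-envelope version of Levinthal's argument goes something like the snippet below. The specific numbers (three conformations per residue, a 150-residue chain, an absurdly generous sampling rate) are illustrative assumptions rather than measured quantities, but the conclusion survives any reasonable choice:

```python
# Back-of-envelope Levinthal's paradox. The inputs are illustrative
# assumptions, not measurements; the punchline is insensitive to them.

residues = 150                  # a modest-sized protein
conformations_per_residue = 3   # deliberately conservative
samples_per_second = 1e13       # absurdly fast conformational sampling
age_of_universe_s = 4.3e17      # ~13.8 billion years, in seconds

total_conformations = conformations_per_residue ** residues      # ~4 x 10^71
search_time_s = total_conformations / samples_per_second

print(f"~{search_time_s / age_of_universe_s:.0e} ages of the universe")
# roughly 10^41 ages of the universe to enumerate every conformation
```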
The protein folding problem is grounded in a tantalizing idea: if we knew the rules of biophysics and biochemistry perfectly, we should, theoretically, be able to ascertain the exact 3D structure of a protein given only its sequence.
John Jumper, who currently leads the AlphaFold team and was first author on DeepMind's 2021 Nature paper, studied physics at Cambridge and worked as a Scientific Associate at D. E. Shaw Research before completing his graduate work in theoretical chemistry at the University of Chicago. Demis Hassabis studied computer science at Cambridge, pioneered AI-based videogame design at Lionhead Studios, and completed a PhD in cognitive neuroscience at UCL, followed by research stints at MIT and Harvard.
Once again, we see a spark at the confluence of disciplines and backgrounds. With Jumper and co.'s experience in computation and biology, and Hassabis and co.'s experience building AI for games, the ingredients for AlphaFold were in place.
At a high level, the AlphaFold algorithm utilizes two pieces of information that it builds from the user's input protein sequence: 1) a multiple sequence alignment (MSA) of protein sequences, and 2) a "pair representation" of protein structure. An MSA encodes evolutionary information from related protein sequences. This can be used to glean information on protein residue contacts using a neat idea founded in statistical theory called coevolution: positions that mutate in a correlated way across related sequences tend to be in contact in the folded structure. The pair representation can be thought of as a structure template, or an initial guess, of what the protein structure could be. These two starting points are passed through a special variant of a machine learning model called a transformer, which the DeepMind team calls the Evoformer. This is one of the major innovations at the heart of the AlphaFold algorithm: it allows the MSA and the pair representation to exchange information and refine one another over successive iterations. The refined MSA and pair representation are then used to output the final predicted protein structure.

It's astounding to me how similar the overall algorithmic architectures underlying AlphaZero and AlphaFold are. AlphaZero took "two half-working parts" - the neural network and the MCTS - and had them refine one another as the engine played chess against itself. AlphaFold's two half-working parts are the MSA and the pair representation, each insufficient on its own to teach the algorithm the rules of protein folding. But together, as two different representations of the physical world, they can exchange information to edge the algorithm closer to reality.
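For the curious, here is a purely conceptual sketch of that information flow. Every name in it is a placeholder of my own (this is not DeepMind's code, and the real architecture has many more moving parts), but it captures the two ideas that matter: the MSA and pair representations repeatedly refine one another inside the Evoformer, and the prediction is "recycled" back in so the model can iterate on its own guess.

```python
# Conceptual-only sketch of the AlphaFold 2 information flow; all names are
# hypothetical placeholders, and much of the real system is omitted.

msa  = build_msa_representation(sequence)    # evolutionary signal from related sequences
pair = build_pair_representation(sequence)   # initial guess at residue-residue geometry

for cycle in range(num_recycles):
    for block in evoformer_blocks:
        # attention passes information MSA -> pair and pair -> MSA,
        # so each representation sharpens the other
        msa, pair = block(msa, pair)

    # the structure module turns the refined representations into 3D coordinates
    structure = structure_module(msa, pair)

    # "recycling": the predicted structure seeds the next pass as a better prior
    msa, pair = recycle(msa, pair, structure)
```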
So we've learned this: give a complex computer program a bunch of evolutionarily related protein sequences and a template, and it will ruminate on that for a bit, then somehow spit out a dizzyingly accurate protein structure. Remember - we haven't told the algorithm anything substantial about physics, chemistry, or biology. It is learning something profound about the laws of our world hidden somewhere within the data we supply it, and it's using this learned information to build more beautifully than we ever could. If this isn't mind-blowing, I don't know what is.
Machine learning and artificial intelligence can sometimes feel like magic to me. I think it’s a bit unsatisfying to stop there, however. My friend (nay, my fellow lad) and colleague James Roney probed the question of AlphaFold interpretability during his undergraduate thesis work at Harvard under the supervision of one Sergey Ovchinnikov, a luminary in the structure prediction field. In a series of elegant experiments, they showed that AlphaFold has likely learned an accurate biophysical potential function for judging the compatibility of sequence and structure. In other words, through its training, AlphaFold has actually learned the physical rules that allow it to assess the relative quality of structures, given a protein sequence. James showed that the MSA input required by AlphaFold is needed just so that the algorithm can come up with a good approximation for the structure - a locality to search in. The learned potential function does the rest, performing a local search to come up with the final prediction. In essence, AlphaFold has learned the intuition behind protein folding and biophysics.
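Spelled out as (entirely hypothetical) pseudocode, the picture that emerges from this interpretation looks something like the following - an initial guess supplies the neighborhood, and a learned potential steers a simple local search within it. This is my own cartoon of the idea, not the actual code from James's experiments:

```python
# Cartoon of the "learned potential + local search" picture; every function
# here is a hypothetical placeholder used only to illustrate the idea.

def fold(sequence, steps=1000):
    structure = initial_guess(sequence)              # the "locality" the MSA provides
    score = learned_potential(sequence, structure)   # sequence-structure compatibility
    for _ in range(steps):
        candidate = perturb(structure)               # a small local move
        candidate_score = learned_potential(sequence, candidate)
        if candidate_score > score:                  # keep only improvements
            structure, score = candidate, candidate_score
    return structure
```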
Earlier this week, I asked m’boy James whether he believed the protein folding problem was solved.
His answer:
“Eh, ‘solved’ is a big word.”
James explained to me that AlphaFold was a big step forward in a lot of practical ways, and more-or-less a solution to the problem of predicting static structures for proteins that have reasonably deep MSAs or solved homologs. He qualified this by explaining that we have a long way to go before understanding protein dynamics - the transition states that proteins take on as they shuffle from the unfolded state to the folded state. In other words, even though AlphaFold might be quite good at solving a protein's most stable (minimum free energy) structure when you give it a lot of evolutionary information, it can't tell us much about how the protein actually folds. To give an analogy, we might know a lot about the final scene of a movie, but we can't actually watch the whole thing.
Capturing this 'protein movie' - not just the final scene - is quite important for a lot of real-world reasons: we need it to understand protein folding pathways and folding kinetics, and to study the heterogeneity of proteins in solution, amongst other biologically relevant problems.
Arguably the seminal paper on in silico protein dynamics came out of D. E. Shaw Research in the journal Science in 2010, a study on the conformational changes of villin, amongst other proteins (see our friend John Jumper listed amongst the co-authors!). This paper was a landmark achievement - for the first time, the twisting and turning of proteins was captured in millisecond-scale molecular dynamics simulations using a special-purpose supercomputer called Anton. Many interesting biological phenomena occur on these long timescales, but until this work, running such extended simulations had been computationally intractable. Lots of interesting work is being done on this problem, not just through pure computation, but also by incorporating rich experimental data into a computational framework (see, for example, Dr. Ellen Zhong's work - as a graduate student at MIT, at DESRES, and now as a professor at Princeton - using deep learning to create "protein movies" from heterogeneous cryo-EM samples). This mechanistic insight into protein folding is, in my opinion, the true crux of the protein folding problem, and we're still far from solving it.
What's fascinating to me is that this question is now being tackled by companies with some of the most sophisticated computational infrastructures on the planet. Along with Google, Meta AI and Microsoft Research also run massive, well-funded biomolecular structure elucidation groups. Each group has taken its expertise in one domain (for example, general-purpose AI) and tried to convert it into an asymmetric advantage in another (molecular dynamics or protein folding). Just days ago, Meta AI published a paper introducing ESMFold, an algorithm that is not as accurate as AlphaFold but is about 60 times faster at predicting protein structures, christened 'protein autocomplete' by its authors.
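One practical upshot of this speed: you can fold a protein on a single GPU in a few lines of code. Going off the published examples from Meta's fair-esm package (treat the exact function names as my best recollection and check the repository before relying on them), running ESMFold looks roughly like this:

```python
# Rough usage sketch based on the fair-esm package's published examples;
# verify the exact API against the repository before using.
import torch
import esm

model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGL"   # any amino-acid sequence
with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)   # predicted structure as PDB-format text

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```

The fact that structure prediction now fits in a dozen lines says a lot about how quickly these tools are becoming accessible.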
This cross-pollination is a really intriguing strategy, and it will be exciting to see if it pays dividends in the future. If I had to guess, bridging disciplines and building scalable, cross-functional technology the way these companies are doing will produce some of the splashiest advances in science and mathematics over the next few years.
Having been working as an Associate for three months now, I feel like I’m definitely in the “valley of despair” portion of the Dunning-Kruger curve. There’s so much to learn. But realizing how exciting things are getting, experiencing the sheer magic of these algorithms in the field, is so thrilling. I feel very lucky to be surrounded by brilliant people who are fueled by getting to the bottom of tough problems. Their kindness, patience, and drive inspire me to get better at my craft each day. I’m not one with the Matrix quite yet, but I’m trying.
Until then, I’ll get back to losing games at Chess Forum.
---
If you made it this far, thank you so much for reading! Feel free to leave comments below or send me emails with questions/clarifications/corrections. I plan to touch on a wide array of topics (not just science/math-related) in the future. If you would like to join in on my journey toward becoming a literal NPC bot AI, subscribe to get an email notification each time I write. I hope to be posting regularly.
P.S. I would highly recommend watching the short film that DeepMind released on their AlphaGo algorithm here. It brought a tear to my eye. Not many things do that, but AlphaGo did.