What a difference a year makes. Twelve months ago, the artificial intelligence (AI) company DeepMind stunned many scientists by publishing the predicted structures for some 350,000 proteins, a body of work recognized as Science’s Breakthrough of the Year for 2021. Yesterday, DeepMind and its partners went much, much further. The company has released the probable structures of nearly all known proteins, more than 200 million from organisms ranging from bacteria to humans, an impressive achievement for AI and a potential treasure trove for drug development and evolutionary research.
“We are now releasing structures for the entire protein universe,” Demis Hassabis, founder and CEO of DeepMind, said at a news conference in London.
The structural bounty comes from AlphaFold, one of the emerging AI programs that have cracked the protein folding problem, the long-standing challenge of accurately deriving the 3D shapes of proteins from their amino acid sequences. AlphaFold’s newly predicted structures were added to an existing database yesterday through a partnership with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI). The database “provided structural biologists with this powerful new tool where you can look up the 3D structure of a protein almost as easily as you can Google a keyword,” Hassabis said.
Eric Topol, director of the Scripps Research Translational Institute, echoed the amazement of many outside scientists. “AlphaFold is an exceptional and landmark advancement in life science that demonstrates the power of artificial intelligence,” he tweeted. “With this new addition of structures that illuminate almost the entire protein universe, we can expect more biological mysteries to be unraveled every day.”
Big day for #AI in life science. Release of over 200 million predicted 3D protein structures from open source #AlphaFold, almost the entire protein universe
See: https://t.co/gjASHqACqa @DeepMind
my comment below pic.twitter.com/yPgtPHMZac
— Eric Topol (@EricTopol) July 28, 2022
DeepMind’s release of the structures is “excellent,” Ewan Birney, EMBL’s deputy director general, said at the news conference. “It will make many researchers around the world think about what experiments they can do now.”
The proteins whose structures AlphaFold predicted come from a wide variety of organisms, from bacteria to plants and vertebrates, including mice, zebrafish and humans. Kathryn Tunyasuvunakool, a researcher at DeepMind, said it took AlphaFold about 10 to 20 seconds to make each protein prediction. The company had to work closely with EMBL-EBI, she said, to figure out how to represent the huge number of structures in the database.
DeepMind says more than 500,000 researchers have used the database since it launched last year. Hassabis predicted a “new era in digital biology” in which drug developers can go from AI-predicted structures of the proteins involved in a disease to using AI to design small molecules that act on those proteins, and from there to treatments for the disease.
Others are using the structure predictions to develop vaccine candidates, to investigate basic biological questions such as how the nuclear pore complex controls which molecules enter a cell’s nucleus, or to study how proteins evolved when life first arose.
Hassabis, however, cautioned that the release of structures is only a starting point. “Obviously, there’s still a lot of biology and a lot of chemistry that needs to be done.”