A very slow discovery challenge

8/4/2023

How is AI impacting science?

Michael Nielsen
Astera Institute
May 14, 2023

The biggest success of AI in science so far is the AlphaFold 2 system. This is a deep learning system which has made large strides on a fundamental scientific problem: how to predict the 3-dimensional structure of a protein from the sequence of amino acids making up that protein. This breakthrough has helped set off an ongoing deep learning revolution in molecular biology. While obviously of interest to molecular biologists, I believe this is of much broader interest for science as a whole, as a concrete prototype for how artificial intelligence may impact science. In this short survey talk I briefly discuss questions including: how can such systems be validated? Can they be used to identify general principles that human scientists can learn from? And what should we expect a good theory or explanation to provide anyway? The focus of the talk is extant results and the near term, as a way of grounding an understanding of the longer-term future, and of the [...]

The text for a talk given at the Metascience 2023 Conference in Washington, D.C., May 2023.

In 2020 the deep learning system AlphaFold 2 [1] surprised biologists when it was shown to routinely make correct near atomic-precision predictions for protein structure. That is, using just the linear sequence of amino acids making up a protein, AlphaFold was able to predict the positions of the atoms in the protein. These results were not cherrypicked, but rather emerged from an adversarial, blind competition with over a hundred other modeling groups. In a few cases, AlphaFold 2 has even exceeded experimental accuracy, causing existing experimental results to be re-evaluated and improved.

While AlphaFold is impressive, it's also far from complete: it's really a bridge to a new era, opening up many scientific and metascientific questions. These include: what we expect a good theory or explanation to provide; what it means to validate that understanding; and what we humans can learn from these systems. Most of all: whether and how AI systems may impact the progress of science as a whole, as a systemic intervention. In my talk, I'll treat AlphaFold as a concrete prototype for how AI may be used across science. For these reasons, it's valuable for metascientists to engage with AlphaFold and successor systems, even if you have no prior interest in proteins or even in biology.

I am not a molecular biologist, so my apologies for any errors on that front. That said, it's been enjoyable and often inspiring to learn about proteins. Let me take a few minutes to remind you of some background for those of you (like me) who aren't biologists.

Molecular biology feels a bit like wandering into an immense workshop full of wonderful and varied machines. Large databases like UniProt contain the amino acid sequences for hundreds of millions of proteins. You have, for example, kinesin proteins, which transport large molecules around the interior of the cell. You have haemoglobin, which carries oxygen in your blood, and helps power your metabolism. You have green fluorescent protein, which emits green light when exposed to ultraviolet light, and which can be used to tag and track other biomolecules. All these and many more molecular machines, created and sorted by the demanding sieve of evolution by natural selection. Every individual machine could be the subject of a lifetime's study. There are, for instance, thousands of papers about the kinesin superfamily, and yet we're really just beginning to understand it. But while this wealth of biological machines is astonishing, we don't a priori know what those machines do, or how they do it. We have no instruction manual, and we're trying to figure it out.

In fact, for the vast majority of the hundreds of millions of proteins known [3], all we can easily directly determine is the basic blueprint: using genome sequencing we can find the linear sequence of amino acids that form the protein, at a cost of no more than cents. But proteins are tiny 3-dimensional structures, typically nanometers across, making it extremely difficult to directly image them. For a single protein it will routinely take months of work to experimentally determine the corresponding 3-dimensional structure, usually using x-ray crystallography, cryo-electron microscopy, or NMR.

It matters because understanding the shape is crucial for answering questions like: What antigens can an antibody protein bind to, as part of an immune response? How do proteins form larger complexes? Understanding the shape of a protein doesn't tell you everything about its function. But it is fundamental to understanding what a protein can do, and how it does it. Ideally, we'd be able to determine the shape from the amino acid sequence alone.
