365NEWSX

DeepMind puts the entire human proteome online, as folded by AlphaFold - Yahoo Finance Australia

Jul 22, 2021 4 mins, 39 secs

DeepMind and several research partners have released a database containing the 3D structures of nearly every protein in the human body, as computationally determined by the breakthrough protein folding system demonstrated last year, AlphaFold.

The AlphaFold Protein Structure Database is a collaboration between DeepMind, the European Bioinformatics Institute and others, and consists of hundreds of thousands of protein sequences with their structures predicted by AlphaFold. The plan is to add millions more to create a "protein almanac of the world."

"We believe that this work represents the most significant contribution AI has made to advancing the state of scientific knowledge to date, and is a great example of the kind of benefits AI can bring to society," said DeepMind founder and CEO Demis Hassabis.

If you're not familiar with proteomics in general — and it's quite natural if that's the case — the best way to think about this is perhaps in terms of another major effort: that of sequencing the human genome.

And one of the next big projects everyone turned their eyes toward in those years was understanding the human proteome — which is to say all the proteins used by the human body and encoded into the genome.

The problem with the proteome is that it's much, much more complex.

Proteins, like DNA, are sequences of known molecules; in DNA these are the handful of familiar bases (adenine, guanine, etc.), but in proteins they are the 20 amino acids (each of which is coded by multiple bases in genes).

It's like going from binary code to a complex language that manifests objects in the real world.
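The step from bases to amino acids can be made concrete with a minimal sketch. It assumes the standard genetic code, in which each group of three bases (a codon) maps to one amino acid. The table below is deliberately partial (the full code has 64 codons), and the function name `translate` and the example sequence are illustrative choices, not something taken from DeepMind's work.

```python
# Partial codon table from the standard genetic code (the full table has 64 entries).
# Amino acids are given as one-letter codes; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start codon
    "GTG": "V",  # valine
    "CAC": "H",  # histidine
    "CTG": "L",  # leucine
    "ACT": "T",  # threonine
    "CCT": "P",  # proline
    "GAG": "E",  # glutamic acid
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into amino acids, three bases at a time."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon!r} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon: the protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Eight codons: seven amino acids followed by a stop codon.
    print(translate("ATGGTGCACCTGACTCCTGAGTAA"))
```

The output is only the one-dimensional chain of amino acids. It says nothing about the shape that chain folds into, which is the part AlphaFold predicts.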

Practically speaking, this means that the proteome is not just 20,000 sequences of hundreds of amino acids each; each one of those sequences also has a physical structure and function.

Determining that structure is generally done experimentally using something like X-ray crystallography, a long, complex process that may take months or longer to figure out a single protein, and that is if you happen to have the best labs and techniques at your disposal.

The structure can also be predicted computationally, though the process had never been good enough to actually rely on until AlphaFold came along.

Then AI-based approaches came on the scene, making a splash in 2019 when DeepMind's AlphaFold leapfrogged every other system in the world — then made another jump in 2020, achieving accuracy levels high enough and reliable enough that it prompted some experts to declare the problem of turning an arbitrary sequence into a 3D structure solved.

It's the practical results that concern us today, as the company employed its time since the publication of AlphaFold 2 (the version shown in 2020) not just tweaking the model, but running it... on every single protein sequence they could get their hands on.

The result is that 98.5% of the human proteome is now "folded," as they say, meaning there is a predicted structure that the AI model is confident enough (and importantly, we are confident enough in its confidence) represents the real thing.

Oh, and they also folded the proteome for 20 other organisms, like yeast and E. coli, amounting to about 350,000 protein structures total.

All that will be made available as a freely browsable database that any researcher can simply plug a sequence or protein name into and immediately be provided the 3D structure.

"The database as you'll see it tomorrow, it's a search bar, it's almost like Google search for protein structures," said Hassabis in an interview with TechCrunch.

"So you can immediately go and see related genes. And it's linked to all these other databases, you can see related genes, related in other organisms, other proteins that have related functions, and so on."

"And I think this is accelerating science by steps of years, a bit like being able to sequence genomes did decades ago."

Ordinarily examining the proteins suspected of being at the root of a given problem would be expensive and time-consuming, and for diseases that affect relatively few people, money and time are in short supply when they can be applied to more common problems like cancers or dementia-related diseases.

But by simply calling up the structures of 10 healthy proteins and 10 mutated versions of the same, researchers may find insights in seconds that might otherwise have taken years of painstaking experimental work.

"When we first sent our seven sequences to the DeepMind team, for two of those we already had experimental structures."

The process AlphaFold uses to predict structures is, in some cases, better than experimental options.

that it's taking this, this 1D amino acid chain and creating these beautiful 3D structures, a lot of them aesthetically incredibly beautiful, as well as scientifically and functionally valuable.

The impact of AlphaFold and the proteome database won't be felt for some time at large, but it will almost certainly — as early partners have testified — lead to some serious short-term and long-term breakthroughs.

AlphaFold solves a very specific, though very important problem: given a sequence of amino acids, predict the 3D shape that sequence takes in reality.

In fact a great deal of the human proteins for which AlphaFold gave only a middling level of confidence to its predictions may be fundamentally "disordered" proteins that are too variable to pin down the way a more static one can be (in which case the prediction would be validated as a highly accurate predictor for that type of protein).
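To illustrate how per-residue confidence might be summarized, here is a minimal sketch. It assumes the 0 to 100 confidence score that AlphaFold attaches to each residue (known as pLDDT, a name the article does not use) and the commonly cited cutoffs at 90, 70 and 50. The band labels, the treatment of scores that fall exactly on a cutoff, and the sample scores are all illustrative assumptions, not real predictions.

```python
from collections import Counter

# Commonly cited confidence cutoffs; labels are paraphrased and the handling
# of scores exactly on a boundary is an assumption of this sketch.
BANDS = [(90.0, "very high"), (70.0, "confident"), (50.0, "low")]
LABELS = ["very high", "confident", "low", "very low"]


def confidence_band(score: float) -> str:
    """Map one per-residue confidence score (0 to 100) to a named band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score {score} is outside the 0-100 range")
    for threshold, label in BANDS:
        if score > threshold:
            return label
    return "very low"


def summarize(scores: list[float]) -> dict[str, float]:
    """Return the fraction of residues that fall in each confidence band."""
    if not scores:
        raise ValueError("need at least one score")
    counts = Counter(confidence_band(s) for s in scores)
    return {label: counts[label] / len(scores) for label in LABELS}


if __name__ == "__main__":
    # Hypothetical scores for a five-residue stretch, for illustration only.
    for label, fraction in summarize([95.0, 82.5, 61.0, 40.0, 35.0]).items():
        print(f"{label}: {fraction:.0%}")
```

A protein in which most residues land in the lowest bands is one candidate for the "disordered" case described above, though a low score alone does not prove disorder.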

Summarized by 365NEWSX ROBOTS
