Historic Moment! Human genetic code cracked by Google AI; new DeepMind work published in Nature

Newsflash · Updated 2 days ago · AiFun

Early this morning, AlphaGenome, the AI genome model developed by a team led by Demis Hassabis, winner of the 2024 Nobel Prize in Chemistry and CEO of Google DeepMind, was featured on the cover of the latest issue of the top journal Nature. It is another heavyweight life science result from DeepMind to appear in Nature, following AlphaFold.


AlphaGenome is designed to address a long-unsolved puzzle in biology: roughly 98% of the human genome consists of non-coding regions that do not directly produce proteins but regulate when genes are switched on, how they are spliced, and how strongly they are expressed. Variation in these regions is often closely associated with disease risk, yet it is difficult to analyze by conventional means.

To this end, the DeepMind research team built a new AI architecture. It takes DNA sequences up to one million bases long as input and predicts, at single-base resolution, nearly 6,000 regulatory features, including RNA expression, splicing structure, chromatin accessibility, transcription factor binding sites, and even three-dimensional structure.

The paper, titled "Advancing regulatory variant effect prediction with AlphaGenome," presents a next-generation AI model that achieves unified modeling of human gene regulation “from sequence to function”.


The genetic code of life, accumulated over the past 4 billion years, is now being re-decoded by AI tools through a “unified modeling” approach.

DeepMind first blogged about the project in June 2025, when it opened a preview API for AlphaGenome to the research community. The focus was on building a more interpretable and generalizable model of DNA sequences that could serve research as a “general-purpose variant-reading engine”.

In the version officially published in Nature, the DeepMind team not only completed a performance evaluation across all modalities but also demonstrated AlphaGenome's reasoning power on multiple disease mutation mechanisms, including an accurate prediction of how TAL1 oncogenic mutations activate the gene. Together these results validate AlphaGenome's predictive ability on key pathways such as splicing, expression, and chromatin state.

The researchers believe the model will provide a powerful and versatile tool for pinpointing the causes of rare diseases, discovering novel therapeutic targets, designing synthetic biology constructs, and other directions.

Link to paper: https://www.nature.com/articles/s41586-025-10014-0

I. Million-base DNA input and base-level prediction: breaking through the “long sequence” versus “high resolution” problem

One of AlphaGenome's core innovations is that, for the first time, the input DNA sequence length is extended to one million bases (1 Mb) while base-level prediction resolution is maintained at the output layer.

This breaks the trade-off between “long sequences” and “high resolution” in previous models. For example, earlier models such as SpliceAI offer high resolution but are limited to short sequences of 10,000 bases or less, which makes it hard to capture long-range regulation.

Enformer and similar models can handle long sequences of 200,000 to 500,000 bases, but they sacrifice precision by predicting in bins of, for example, 128 bp, so they cannot accurately depict the fine-grained structure of splice sites, enhancers, and promoters.
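The cost of binning can be seen in a small toy example. The sketch below is not code from AlphaGenome, Enformer, or SpliceAI; the window length, the signal position, and the helper `bin_track` are all made up for illustration. It places a single-base signal in a track and compares what a base-level output and a 128 bp binned output can say about its location.

```python
# Toy illustration (not AlphaGenome or Enformer code): how fixed-width
# binning hides the exact position of a single-base signal.

def bin_track(track, bin_size):
    """Average a per-base track into consecutive bins of `bin_size` bases."""
    bins = []
    for start in range(0, len(track), bin_size):
        chunk = track[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins

SEQ_LEN = 1024      # hypothetical short window, chosen for readability
SIGNAL_POS = 300    # hypothetical position of a single-base signal
BIN_SIZE = 128

per_base = [0.0] * SEQ_LEN
per_base[SIGNAL_POS] = 1.0

binned = bin_track(per_base, BIN_SIZE)

# Base-level output pinpoints the position exactly.
best_base = max(range(SEQ_LEN), key=lambda i: per_base[i])
# Binned output only says "somewhere inside this 128 bp window".
best_bin = max(range(len(binned)), key=lambda i: binned[i])

print(best_base)                                            # 300
print(best_bin * BIN_SIZE, best_bin * BIN_SIZE + BIN_SIZE - 1)  # 256 383
print(len(binned))                                          # 8
```

With bins, the signal can only be narrowed down to positions 256 to 383, which is too coarse for features such as splice sites that are defined at individual bases.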

For training, AlphaGenome adopts a two-stage “pre-training + distillation” scheme. It preserves fine detail while expanding context by combining a U-Net structure with a Transformer and by parallelizing long sequences across multiple TPUs.


▲AlphaGenome Model Architecture, Training Scheme and Comprehensive Evaluation Performance

Across 24 genomic track prediction tasks, AlphaGenome outperforms the current best model on 22; across 26 variant effect prediction tasks, it matches or exceeds the current SOTA model on 25.


▲ The research team's item-by-item evaluation of predictive performance on genomic tracks

II. A unified prediction mechanism and a “one-click overview” of variant effects

Unlike traditional models that require training separate networks for different tasks (e.g., splicing, expression, chromatin structure), AlphaGenome is the first unified model to output predictions for 11 modalities simultaneously in a single inference pass.

From a DNA sequence alone, it can directly infer regulatory features such as RNA expression levels, splice sites and their usage, chromatin accessibility, transcription factor binding sites, histone modification patterns, and three-dimensional contact maps, and it covers thousands of different cell or tissue types in human and mouse.

This “multimodal association” structure brings a new perspective to mutation analysis.

Researchers can feed any DNA variant into the model to quickly predict its effects at multiple regulatory levels. By comparing the predictions for the reference and mutated versions of the sequence, they can infer whether the variant leads to up-regulated expression, altered splicing, or a change in chromatin state.
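The reference-versus-variant comparison described above can be sketched in a few lines. This is a minimal illustration only: `toy_predict` is a hypothetical stand-in that scores a sequence with simple statistics, not the AlphaGenome model or its API, and the sequence, position, and modality names are assumptions made up for the example.

```python
# Toy sketch of reference-vs-alternate variant scoring.
# `toy_predict` is a hypothetical stand-in for a real sequence-to-function
# model; it is NOT the AlphaGenome API.

def apply_snv(sequence, position, alt_base):
    """Return a copy of `sequence` with the base at 0-based `position` replaced."""
    if alt_base not in "ACGT":
        raise ValueError(f"unexpected base: {alt_base!r}")
    return sequence[:position] + alt_base + sequence[position + 1:]

def toy_predict(sequence):
    """Hypothetical model: one score per modality from simple sequence statistics."""
    n = len(sequence)
    return {
        "expression": (sequence.count("G") + sequence.count("C")) / n,
        "splicing": sequence.count("GT") / n,        # crude proxy, illustration only
        "accessibility": sequence.count("TATA") / n, # crude proxy, illustration only
    }

def variant_effect(sequence, position, alt_base, predict=toy_predict):
    """Score a variant as (alt prediction - ref prediction) for each modality."""
    ref_pred = predict(sequence)
    alt_pred = predict(apply_snv(sequence, position, alt_base))
    return {name: alt_pred[name] - ref_pred[name] for name in ref_pred}

reference = "ACGTTATAAGGTCA"
effects = variant_effect(reference, position=5, alt_base="C")  # A -> C inside TATA

for modality, delta in effects.items():
    direction = "up" if delta > 0 else "down" if delta < 0 else "unchanged"
    print(f"{modality}: {delta:+.3f} ({direction})")
```

The design point is that one variant yields one difference per modality, so a single comparison shows at a glance which regulatory layers move and in which direction.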

In particular, the paper shows how the model successfully predicted the mechanism by which oncogenic mutations activate the TAL1 gene, validating its practical value for interpreting variants in non-coding regions.


▲Example of AlphaGenome's multimodal prediction for TAL1 oncogenic mutations in T-ALL

III. Upgraded splicing prediction capability is expected to boost rare disease research

Abnormal RNA splicing is at the root of many rare diseases (e.g., spinal muscular atrophy, cystic fibrosis), but traditional AI models tend to identify only the splice sites themselves and struggle to comprehensively analyze splice site usage and splice junction patterns.

AlphaGenome is the first model to introduce direct prediction of splice junctions (splice junction modeling). Combined with splice site prediction and usage analysis, this builds a more complete map of splicing regulation.
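The difference between knowing splice sites and knowing splice junctions can be made concrete with a toy example. The positions and read counts below are invented, and the per-donor share computed here is a simplified assumption for illustration, not the definition of usage in the paper.

```python
# Toy illustration of splice sites versus splice junctions. The positions
# and read counts are made up, and the share definition is a simplification.
from collections import defaultdict

# Hypothetical junction read counts: (donor position, acceptor position) -> reads
junction_counts = {
    (100, 500): 90,   # common junction
    (100, 800): 10,   # rarer junction from the same donor that skips an exon
    (600, 800): 85,
}

# Site-level view: which positions act as donors or acceptors at all.
donors = sorted({donor for donor, _ in junction_counts})
acceptors = sorted({acceptor for _, acceptor in junction_counts})

# Junction-level view: for each donor, the share of reads going to each acceptor.
donor_totals = defaultdict(int)
for (donor, _), reads in junction_counts.items():
    donor_totals[donor] += reads

junction_share = {
    (donor, acceptor): reads / donor_totals[donor]
    for (donor, acceptor), reads in junction_counts.items()
}

print(donors)                      # [100, 600]
print(acceptors)                   # [500, 800]
print(junction_share[(100, 800)])  # 0.1
```

A site-only view reports that position 100 is a donor and position 800 is an acceptor, but only the junction-level view reveals that the two are paired in a minority of reads, which is the kind of pattern a splicing variant can shift.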

On datasets such as GTEx, the model successfully predicts the effects of multiple known disease-causing mutations on splicing, and on datasets such as ClinVar and MPRA it also achieves the best current evaluation scores. Overall, AlphaGenome was the best performer on six of the seven splicing effect tasks.


▲AlphaGenome reaches SOTA level in splicing variant effect prediction task

This capability is an important contribution both to understanding how non-coding variants trigger pathological splicing and to developing novel diagnostic methods.

Conclusion: After AlphaFold, DeepMind again uses AI to decode the “Book of Life”

The emergence of AlphaGenome not only establishes a new technological baseline for DNA sequence modeling, but also opens a new window for life science researchers to observe the whole picture of genetic regulation.

Its broad coverage of modalities, support for long sequence inputs, and single-base prediction resolution make it widely promising for decoding the gene regulatory code, understanding how mutations exert their effects, and guiding synthetic DNA design. This provides a general-purpose tool base for next-generation research on disease mechanisms, rare disease diagnosis, and synthetic biology.

With the model open to academia, AlphaGenome could become a strong successor to AlphaFold, an “AlphaFold for genetics”.
