Computational protein design is a relatively new field in which scientists search the enormous sequence space for sequences that fold into a desired structure and perform desired functions. With the computational approach, proteins can be designed, for example, as regulators of biological processes, novel enzymes, or biotherapeutics. These approaches not only provide valuable information for understanding sequence-structure-function relations in proteins but also hold promise for applications in protein engineering and biomedical research. In this review, we briefly introduce the rationale for computational protein design, then summarize recent progress in the field, including de novo protein design, enzyme design, and the design of protein-protein interactions. Challenges and future prospects of the field are also discussed.
On 9 October 2024, in a high-profile vote of confidence for the promise of using artificial intelligence (AI) in scientific discovery, the Royal Swedish Academy of Sciences awarded Demis Hassabis (co-founder and chief executive officer) and John M. Jumper (director) of Google DeepMind (London, UK) the 2024 Nobel Prize in Chemistry for their pioneering work in developing the AI-powered protein structure prediction model AlphaFold2 (AF2) [1]. Also sharing the prize was David Baker (half to Hassabis and Jumper; half to Baker), professor of biochemistry at the University of Washington (Seattle, WA, USA), for his work on computational protein design that started with the mid-1990s development of Rosetta, a since-evolving suite of software tools that model protein structures using physical principles [2], and now also AI [3].
The protein inverse folding problem, designing amino acid sequences that fold into desired protein structures, is a critical challenge in biological sciences. Despite numerous data-driven and knowledge-driven methods, there remains a need for a user-friendly toolkit that effectively integrates these approaches for in silico protein design. In this paper, we present DIProT, an interactive protein design toolkit. DIProT leverages a non-autoregressive deep generative model to solve the inverse folding problem, combined with a protein structure prediction model. This integration allows users to incorporate prior knowledge into the design process, evaluate designs in silico, and form a virtual design loop with human feedback. Our inverse folding model demonstrates competitive performance in terms of effectiveness and efficiency on the TS50 and CATH4.2 datasets, with promising sequence recovery and inference time. Case studies further illustrate how DIProT can facilitate user-guided protein design.
Funding: Supported by the National Basic Research Program of China (Grant No. 2015CB910300), the National High Technology Research and Development Program of China (Grant No. 2012AA020308), and the National Natural Science Foundation of China (Grant No. 11021463).
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62250007, 62225307, 61721003) and a grant from the Guoqiang Institute, Tsinghua University (2021GQG1023).