Abstract
Proteins are the material basis of all living organisms; they are the main carriers of life activities and participate in the regulation of physiological functions. Designing proteins with specific functions is of great significance in protein engineering, biomedicine, and materials science. Protein sequence design aims to identify amino acid sequences that fold into a desired structure and carry the corresponding function. It is the core problem of rational protein engineering and has great potential for research and application. With the exponential growth of protein sequence data and the rapid development of deep learning, generative models are increasingly used in protein sequence design. This review briefly introduces the significance of protein sequence design and the main methods developed for it, discusses the principles of the four main generative models used for protein sequence design, and presents recent research on and applications of generative models in protein sequence representation, generation, and optimization. Finally, future directions for protein sequence design are discussed.
Authors
伍青林
任玉彬
翟小威
陈东
刘凯
WU Qing-Lin; REN Yu-Bin; ZHAI Xiao-Wei; CHEN Dong; LIU Kai (College of Energy Engineering, Zhejiang University, Hangzhou 310012, China; Department of Chemistry, Tsinghua University, Beijing 100084, China)
Source
《应用化学》
CAS
CSCD
Peking University Core Journals
2022, Issue 1, pp. 3-17 (15 pages)
Chinese Journal of Applied Chemistry
Funding
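Supported by the National Natural Science Foundation of China (No. 21878258) and the Natural Science Foundation of Zhejiang Province (No. Y20B060027).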
Keywords
Protein sequence design
Generative model
Variational autoencoder
Generative adversarial network
Representation learning
Reinforcement learning