Abstract
Rapid and accurate identification of pathogenic bacteria plays a key role in preventing the spread of infectious diseases, combating antimicrobial resistance, and improving patient prognosis. Raman spectroscopy combined with machine learning algorithms enables simple, fast, label-free detection of pathogenic bacteria. However, pathogenic bacteria are diverse in species and phenotype, deep learning depends on large numbers of training samples, and collecting Raman spectra of pathogenic bacteria in bulk is laborious and susceptible to interference such as fluorescence. To address these problems, a pathogenic bacteria Raman spectroscopy detection model that combines the WGAN-GP data augmentation method with ResNet is proposed. Raman spectra of five common ophthalmic pathogenic bacteria were used. The collected raw data were normalized and used as input to ResNet and a conventional one-dimensional convolutional neural network (1D-CNN), while data preprocessed by Savitzky-Golay (SG) filtering, airPLS baseline correction, and PCA dimensionality reduction were used as input to the K-nearest neighbor (KNN) algorithm. Comparative analysis showed that the ResNet model performed best, with a classification accuracy of up to 96%. A Wasserstein generative adversarial network with gradient penalty (WGAN-GP) was then built to generate a large amount of high-resolution spectral data similar to the real data. WGAN-GP was also compared with two other data augmentation methods, the offset method and the deep convolutional generative adversarial network (DCGAN), which demonstrated its reliability. To verify that the generated data enrich data diversity and thereby improve classification accuracy, the augmented dataset was used to retrain ResNet, and the classification accuracy of WGAN-GP combined with ResNet rose to 99.3%. The results show that: the ResNet-based classification model requires no complex data preprocessing and improves both development efficiency and classification accuracy; the improved WGAN-GP model is suitable for Raman spectral data augmentation, resolves the mismatch between the validity of the generated spectra and the accuracy of their categories that affects traditional data augmentation methods, and improves the speed and stability of training compared with GAN; and surface-enhanced Raman spectroscopy (SERS) combined with the WGANGP-ResNet model for classifying Raman spectra of pathogenic bacteria reduces the need for large amounts of training data, facilitates rapid learning and analysis of Raman spectra with a low signal-to-noise ratio, and shortens the spectral acquisition time to 1/10. The approach has important research significance and application value for rapid, culture-free clinical identification of pathogenic bacteria.
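The normalization step and the offset-style augmentation used as a comparison baseline can be illustrated with a minimal sketch. The abstract does not specify either procedure, so the min-max scaling, the shift along the wavenumber axis with edge padding, and the names `min_max_normalize`, `offset_augment`, and `max_shift` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def min_max_normalize(spectrum):
    """Scale intensities to [0, 1] (assumed normalization; the abstract does not name one)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        # A flat spectrum carries no contrast; map it to zeros to avoid dividing by zero.
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]


def offset_augment(spectrum, max_shift=3, rng=None):
    """Return a copy shifted by a random number of points along the wavenumber axis.

    Hypothetical reading of the "offset method": positions that fall outside
    the original range are padded with the nearest edge value.
    """
    rng = rng or random.Random()
    shift = rng.randint(-max_shift, max_shift)
    n = len(spectrum)
    return [spectrum[min(max(i - shift, 0), n - 1)] for i in range(n)]


# Toy spectrum with a single peak (made-up values, not measured data).
raw = [10.0, 12.0, 30.0, 80.0, 30.0, 12.0, 10.0]
normalized = min_max_normalize(raw)
augmented = [offset_augment(normalized, max_shift=2, rng=random.Random(seed))
             for seed in range(5)]
```

An augmentation of this kind only translates existing spectra, so every generated sample keeps the shape of its source spectrum. That limited diversity is the kind of restriction a generative model such as WGAN-GP is meant to overcome.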
Authors
孟星志
刘亚秋
刘丽娜
MENG Xing-zhi; LIU Ya-qiu; LIU Li-na (College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2024, Issue 2, pp. 542-547 (6 pages)
Spectroscopy and Spectral Analysis
Funding
Supported by the National Natural Science Foundation of China, General Program (61975028).
Keywords
WGAN-GP
Raman spectroscopy
Pathogen identification
One-dimensional residual network
Convolutional neural network