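To address the problem that existing event extraction methods struggle to make full use of contextual information in the entity extraction subtask, which lowers event extraction accuracy, a document-level event extraction model based on span and graph convolutional network (DEESG) is proposed. First, an intermediate linear layer applies a linear transformation to the encoded vectors, and the best span is computed with the help of annotation information; entity extraction becomes more precise because the start and end positions of each span are judged more accurately. Next, a method for constructing a heterogeneous graph is proposed: a pooling strategy represents entities and sentences as graph nodes, the graph is built according to the proposed edge-construction rules so that global information can interact, and a multi-layer graph convolutional network (GCN) convolves over the heterogeneous graph to obtain context-aware entity and sentence representations, which addresses the insufficient use of context. Then, a multi-head attention mechanism detects event types. Finally, argument roles are assigned to the entities in each combination, completing the event extraction task. Experiments were conducted on the Chinese financial announcements (ChFinAnn) dataset. The results show that, compared with the graph-based interaction model with a tracker (GIT), DEESG improves the F1 score by 1.3 percentage points. The study confirms that DEESG can be applied effectively to document-level event extraction.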
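The graph-convolution step described in this abstract can be illustrated with a minimal sketch. It is not the DEESG implementation: the abstract does not state the edge-construction rules, so the edges below (adjacent sentences are linked, and an entity is linked to the sentence it appears in) are hypothetical, as are the toy features, the identity weight matrix, and all function names. The layer uses the commonly used symmetric normalisation, ReLU(D^{-1/2}(A + I)D^{-1/2} H W).

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj_norm, features, weight):
    """One graph-convolution layer: ReLU(adj_norm @ features @ weight)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy document: nodes 0 and 1 are sentences, nodes 2 and 3 are entities.
# Hypothetical edges: sentence-sentence (0-1), entity-in-sentence (2-0, 3-1).
edges = [(0, 1), (0, 2), (1, 3)]
n_nodes = 4
adj = [[0.0] * n_nodes for _ in range(n_nodes)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0

# Toy 2-dimensional node features (in practice, pooled encoder outputs).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, chosen only for readability

adj_norm = normalize_adjacency(adj)
h1 = gcn_layer(adj_norm, features, weight)  # 1-hop neighbourhood mixed in
h2 = gcn_layer(adj_norm, h1, weight)        # 2-hop neighbourhood mixed in
```

Stacking layers is what lets an entity node pick up information from sentences it does not appear in: after two layers, entity node 2 has received signal from sentence 1 through sentence 0.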
Cancer remains one of the leading causes of death worldwide, and treating advanced or metastatic cases is still a major challenge. Accurate cancer staging is critical in clinical practice for choosing treatment strategies and assessing patient prognosis. Traditional staging methods rely primarily on imaging and clinical examination data. However, with rapid advances in genomics and molecular biology, leveraging multi-omics data for early cancer diagnosis and staging has become increasingly important. To improve the accuracy of cancer classification and staging, this study proposes a novel multi-omics data analysis framework, MOGCWMLP. The framework uses graph convolutional networks (GCN) to learn features from each omics data type and a weighted multilayer perceptron (MLP) to make the classification decision. Specifically, MOGCWMLP integrates three types of omics data (mRNA (RNA-seq), miRNA, and lncRNA), learning features from each and fusing them through a weighting mechanism, thereby maximizing the complementary information among the omics modalities. Experimental results show that MOGCWMLP achieves significantly higher classification accuracy on the lung squamous cell carcinoma (LUSC) dataset than existing single-omics and multi-omics models. Notably, integrating multi-omics data leads to substantial improvements in classification performance. Furthermore, a learnable weighted fusion mechanism dynamically adjusts the contribution of each view, further improving the model's classification performance. This study provides an effective tool for precise cancer diagnosis and personalized treatment, and offers new insights into the integration of multi-omics data.
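The idea of a learnable weighted fusion can be sketched as follows. This is an illustrative assumption, not the MOGCWMLP architecture: the abstract does not say how the weights are parameterised, so the sketch assumes one scalar per view, normalised with a softmax and applied to per-view class logits. The logits, the learning rate, and all function names are made up for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(alpha, view_logits):
    """Weighted sum of per-view logits; the view weights are softmax(alpha)."""
    w = softmax(alpha)
    n_classes = len(view_logits[0])
    fused = [sum(w[v] * view_logits[v][c] for v in range(len(w)))
             for c in range(n_classes)]
    return w, fused

def loss_and_grad(alpha, view_logits, label):
    """Cross-entropy of the fused prediction and its gradient w.r.t. alpha."""
    w, fused = fuse(alpha, view_logits)
    p = softmax(fused)
    loss = -math.log(p[label])
    # dL/d(fused logits) = p - onehot(label)
    dz = [p[c] - (1.0 if c == label else 0.0) for c in range(len(p))]
    # dL/dw_v = dz . logits_v, then chain through the softmax over alpha.
    g = [sum(dz[c] * z[c] for c in range(len(dz))) for z in view_logits]
    g_bar = sum(w[v] * g[v] for v in range(len(w)))
    grad = [w[v] * (g[v] - g_bar) for v in range(len(w))]
    return loss, grad

# Hypothetical per-view logits for one sample with two classes.
view_logits = [
    [2.0, 0.0],  # mRNA view: favours class 0
    [0.0, 1.0],  # miRNA view: favours class 1
    [0.5, 0.5],  # lncRNA view: uninformative
]
label = 0
alpha = [0.0, 0.0, 0.0]  # equal view weights to start

loss_before, grad = loss_and_grad(alpha, view_logits, label)
alpha = [a - 0.5 * g for a, g in zip(alpha, grad)]  # one gradient step
loss_after, _ = loss_and_grad(alpha, view_logits, label)
weights_after, _ = fuse(alpha, view_logits)
```

After one step the weight of the view that agrees with the label grows and the weight of the disagreeing view shrinks, which is the "dynamic adjustment of each view's contribution" in miniature.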