Funding: Supported by the National Natural Science Foundation of China (61732018, 61872335, 61802367, 61876215), the Strategic Priority Research Program of Chinese Academy of Sciences (XDC05000000), Beijing Academy of Artificial Intelligence (BAAI), the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing (2019A07), the Open Project of Zhejiang Laboratory, and a grant from the Institute for Guo Qiang, Tsinghua University. Recommended by Associate Editor Long Chen.
Abstract: Graph convolutional networks (GCNs) have received significant attention from various research fields due to their excellent performance in learning graph representations. Although GCNs perform well compared with other methods, they still face challenges. Training a GCN model on large-scale graphs in the conventional way incurs high computation and storage costs. Motivated by the urgent need for efficiency and scalability in GCN training, sampling methods have been proposed and have achieved significant gains. In this paper, we categorize sampling methods by their sampling mechanisms and provide a comprehensive survey of sampling methods for efficient GCN training. To highlight the characteristics and differences of these methods, we present a detailed comparison within each category and an overall comparative analysis across all categories. Finally, we discuss some challenges and future research directions for sampling methods.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62202383), the Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515012602), and the National Key Research and Development Program of China (No. 2022YFD1801200).
Abstract: Identifying cancer driver genes is of paramount significance in elucidating the intricate mechanisms underlying cancer development, progression, and therapeutic interventions. Abundant omics data and interactome networks provided by numerous extensive databases enable the application of graph deep learning techniques that incorporate network structures into the deep learning framework. However, most existing models focus primarily on an individual network, inevitably neglecting the incompleteness and noise of interactions. Moreover, class imbalance among samples in driver gene identification hampers model performance. To address these issues, we propose MMGN, a novel deep learning framework that integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN not only enhances gene feature learning based on mutual information and a consensus regularizer, but also achieves balanced classes of positive and negative samples for model training. The reliability of MMGN has been verified by the Area Under the Receiver Operating Characteristic curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). We believe MMGN has the potential to provide new prospects in precision oncology and may find broader applications in predicting biomarkers for other intricate diseases.