Co-expression network analysis of virulence genes exoS and exoU of pseudomonas aeruginosa in lower respiratory tract based on histological data expression profiles
1
Authors: Erli Jiao, Bo Chen. Discussion of Clinical Cases, 2019, Issue 4, pp. 10-16 (7 pages)
Objective: To use the gene chip of Pseudomonas aeruginosa as a research sample and to explore it at an omics level, aiming to elucidate the co-expression network characteristics of the virulence genes exoS and exoU of Pseudomonas aeruginosa in the lower respiratory tract from the perspective of molecular biology and to identify its key regulatory genes. Methods: From March 2016 to May 2018, 312 patients infected with Pseudomonas aeruginosa in the lower respiratory tract who were admitted to the Department of Respiratory Medicine of Baogang Hospital and given follow-up treatment in the hospital were selected as subjects by cluster sampling. Alveolar lavage fluid and sputum collected from those patients were used as biological specimens. The genes of Pseudomonas aeruginosa were detected with oligonucleotide probes, and the chip data were pre-processed. A total of 8 common antibiotics against Gram-negative bacteria (ceftazidime, gentamicin, piperacillin, amikacin, ciprofloxacin, levofloxacin, doripenem and ticarcillin) were selected to determine the drug resistance of the biological specimens. The MCODE algorithm was used to construct a co-expression network model of the drug-resistance genes focused on exoS/exoU. Results: The expression level of exoS/exoU in the drug-resistance group was significantly higher than that in the non-resistance group (p<0.05). The top 5 differentially expressed genes in the alveolar lavage fluid specimens from the drug-resistance group were, from high to low, RAC1, ITGB1, ITGB5, CRK and IGF1R. In the sputum specimens, the top 5 differentially expressed genes were RAC1, CRK, IGF1R, ITGB1 and ITGB5. In the alveolar lavage fluid specimens, only RAC1 had a positive correlation with the expression of exoS and exoU (p<0.05). In the sputum specimens, RAC1, ITGB1, ITGB5, CRK and IGF1R were positively correlated with the expression of exoS and exoU (p<0.05). The genes included in the co-expression network were exoS, exoU, RAC1, ITGB1, ITGB5, CRK, CAMK2D, RHOA, FLNA, IGF1R, TGFBR2 and FOS. Among them, RAC1 had the highest score for regulatory ability (72.00) and the largest number of regulatory genes (6), followed by ITGB1, ITGB5 and CRK. Conclusions: The high expression of exoS and exoU in the sputum specimens suggests that Pseudomonas aeruginosa has a higher probability of becoming resistant to antibiotics; RAC1, ITGB1, ITGB5 and CRK may be the key genes that regulate the expression of exoS and exoU.
Keywords: Omics data expression profile Lower respiratory tract Pseudomonas aeruginosa exoS exoU Co-expression network
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
2
Authors: Yadi Gu. Journal of Cancer Therapy, 2024, Issue 8, pp. 287-302 (16 pages)
Lung cancer remains a significant global health challenge and identifying lung cancer at an early stage is essential for enhancing patient outcomes. The study focuses on developing and optimizing gene expression-based models for classifying cancer types using machine learning techniques. By applying Log2 normalization to gene expression data and conducting Wilcoxon rank sum tests, the researchers employed various classifiers and Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models using the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, due to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of Specificity, AUC (Area Under the Curve), and Precision. These models were evaluated based on their sensitivity, AUC, and specificity, aiming to achieve high sensitivity and AUC while maintaining reasonable specificity.
Keywords: Lung Cancer Detection Stage Prediction Gene expression data XGBoost Machine Learning
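The abstract above mentions Log2 normalization followed by Wilcoxon rank sum tests. The following is only a minimal sketch of those two steps in plain Python, not the authors' pipeline: the `log2(x + 1)` offset, the normal approximation without tie correction, and all function names are assumptions made for illustration.

```python
import math

def log2_normalize(values):
    """Apply log2(x + 1); the +1 offset is an assumption that keeps zeros finite."""
    return [math.log2(v + 1.0) for v in values]

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(group_a, group_b):
    """Wilcoxon rank-sum z-score (normal approximation, no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])  # rank sum of the first group
    mean_w = n_a * (n_a + n_b + 1) / 2.0
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    return (w - mean_w) / sd_w
```

A gene whose z-score is far from zero separates the two groups well and would be a candidate for the feature-selection step; for real analyses a tested statistics library is the safer choice.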
CHDTEPDB: Transcriptome Expression Profile Database and Interactive Analysis Platform for Congenital Heart Disease (cited by: 1)
3
Authors: Ziguang Song, Jiangbo Yu, Mengmeng Wang, Weitao Shen, Chengcheng Wang, Tianyi Lu, Gaojun Shan, Guo Dong, Yiru Wang, Jiyi Zhao. Congenital Heart Disease (SCIE), 2023, Issue 6, pp. 693-701 (9 pages)
CHDTEPDB (URL: http://chdtepdb.com/) is a manually integrated database for congenital heart disease (CHD) that stores the expression profiling data of CHD derived from published papers, aiming to provide rich resources for investigating a deeper correlation between human CHD and aberrant transcriptome expression. The development of human diseases involves important regulatory roles of RNAs, and expression profiling data can reflect the underlying etiology of inherited diseases. Hence, collecting and compiling expression profiling data is of critical significance for a comprehensive understanding of the mechanisms and functions that underpin genetic diseases. CHDTEPDB stores the expression profiles of over 200 sets of 7 types of CHD and provides users with more convenient basic analytical functions. Because clinical indicators such as disease type differ among datasets and detection errors are unavoidable, users are able to customize their selection of corresponding data for personalized analysis. Moreover, we provide a submission page for researchers to submit their own data so that additional expression profiles, as well as other histological data, can be added to the database. CHDTEPDB is a user-friendly interface that allows users to quickly browse, retrieve, download, and analyze their target samples. CHDTEPDB will significantly improve the current knowledge of expression profiling data in CHD and has the potential to be exploited as an important tool for future research on the disease.
Keywords: Congenital heart disease (CHD) RNA expression data database visualization
A Novel Soft Clustering Approach for Gene Expression Data
4
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M.K. Jayanthi Kannan. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 12, pp. 871-886 (16 pages)
Gene expression data represents a condition matrix where each row represents a gene and each column shows a condition. Microarrays are used in the lab to detect the expression of thousands of genes at a time. Genes encode proteins, which in turn dictate cell function. The production of messenger RNA and its processing are the two main stages of gene expression. The complexity of biological networks, combined with the volume of data containing imprecision and outliers, increases the challenge of dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques involve hierarchical, partitioning, grid-based, density-based, model-based and soft clustering approaches for dealing with gene expression data. Understanding gene regulation and other useful information from this data is possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing gene expression data. The population elements are grouped based on the fuzziness principle and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches in dealing with non-linear relationships and provides better segregation of biological functions.
Keywords: REINFORCEMENT MEMBERSHIP CENTROID threshold STATISTICS BIOINFORMATICS gene expression data
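To illustrate what "a degree of membership is assigned to all the elements" means in soft clustering, here is a minimal sketch. It uses the standard fuzzy c-means membership formula rather than the improved FLAME method the abstract proposes, whose details are not given here; the function name, the centroid inputs and the default fuzzifier of 2 are assumptions.

```python
import math

def fuzzy_memberships(point, centroids, fuzzifier=2.0):
    """Fuzzy c-means style memberships of one point in each cluster.

    Swapped-in technique: this is plain fuzzy c-means, not FLAME.
    Memberships are non-negative and sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point sitting exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (fuzzifier - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
```

For example, a gene profile at distance 1 from one centroid and distance 2 from another receives memberships of about 0.8 and 0.2, so it is mostly, but not exclusively, assigned to the nearer cluster.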
A Survey on Acute Leukemia Expression Data Classification Using Ensembles
5
Authors: Abdel Nasser H. Zaied, Ehab Rushdy, Mona Gamal. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 11, pp. 1349-1364 (16 pages)
Acute leukemia is an aggressive disease that has high mortality rates worldwide. The error rate can be as high as 40% when classifying acute leukemia into its subtypes. So, there is an urgent need to support hematologists during the classification process. More than two decades ago, researchers used microarray gene expression data to classify cancer and adopted acute leukemia as a test case. The high classification accuracy they achieved confirmed that it is possible to classify cancer subtypes using microarray gene expression data. Ensemble machine learning is an effective method that combines individual classifiers to classify new samples. Ensemble classifiers are recognized as powerful algorithms with numerous advantages over traditional classifiers. Over the past few decades, researchers have focused a great deal of attention on ensemble classifiers in a wide variety of fields, including but not limited to disease diagnosis, finance, bioinformatics, healthcare, manufacturing, and geography. This paper reviews the recent ensemble classifier approaches utilized for acute leukemia gene expression data classification. Moreover, a framework for classifying acute leukemia gene expression data is proposed. The pairwise correlation gene selection method and the Rotation Forest of Bayesian Networks are both used in this framework. Experimental outcomes show that the classification accuracy achieved by the acute leukemia ensemble classifiers constructed according to the suggested framework is good compared to the classification accuracy achieved in other studies.
Keywords: LEUKEMIA CLASSIFICATION ENSEMBLE rotation forest pairwise correlation bayesian networks gene expression data MICROARRAY gene selection
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
6
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest to diagnose diseases. Microarray gene expression data can be defined by a massive searching space that poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data to a uniform scale. Besides, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: Bioinformatics data science microarray gene expression data classification deep learning metaheuristics
Data Mining Based on Principal Component Analysis Application to the Nitric Oxide Response in Escherichia coli
7
Authors: AiLing Teh, Donovan Layton, Daniel R. Hyduke, Laura R. Jarboe, Derrick K. Rollins Sd. Journal of Statistical Science and Application, 2014, Issue 1, pp. 1-18 (18 pages)
This work evaluates a recently developed multivariate statistical method based on the creation of pseudo or latent variables using principal component analysis (PCA). The application is the data mining of gene expression data to find a small subset of the most important genes in a set of thousands or tens of thousands of genes from a relatively small number of experimental runs. The method was previously developed and evaluated on artificially generated data and real data sets. Its evaluations consisted of its ability to rank the genes against known truth in simulated data studies and to identify known important genes in real data studies. The purpose of the work described here is to identify a ranked set of genes in an experimental study and then, for a few of the most highly ranked unverified genes, experimentally verify their importance. This method was evaluated using the transcriptional response of Escherichia coli to treatment with four distinct inhibitory compounds: nitric oxide, S-nitrosoglutathione, serine hydroxamate and potassium cyanide. Our analysis identified genes previously recognized in the response to these compounds and also identified new genes. Three of these new genes, ycbR, yJhA and yahN, were found to significantly (p-values < 0.002) affect the sensitivity of E. coli to nitric oxide-mediated growth inhibition. Given that the three genes were not highly ranked in the selected ranked set (RS), these results support strong sensitivity in the ability of the method to successfully identify genes related to challenge by NO and GSNO. This ability to identify genes related to the response to an inhibitory compound is important for engineering tolerance to inhibitory metabolic products, such as biofuels, and utilization of cheap sugar streams, such as biomass-derived sugars or hydrolysate.
Keywords: data mining principal component analysis (PCA) gene expression data analysis
Gene Expression Data Classification Using Consensus Independent Component Analysis (cited by: 7)
8
Authors: Chun-Hou Zheng, De-Shuang Huang, Xiang-Zhen Kong, Xing-Ming Zhao. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2008, Issue 2, pp. 74-82 (9 pages)
We propose a new method for tumor classification from gene expression data, which mainly contains three steps. Firstly, the original DNA microarray gene expression data are modeled by independent component analysis (ICA). Secondly, the most discriminant eigenassays extracted by ICA are selected by the sequential floating forward selection technique. Finally, a support vector machine is used to classify the modeling data. To show the validity of the proposed method, we applied it to classify three DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.
Keywords: independent component analysis feature selection support vector machine gene expression data
Outlier Analysis for Gene Expression Data (cited by: 3)
9
Authors: Chao Yan, Guo-Liang Chen, Yi-Fei Shen. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, Issue 1, pp. 13-21 (9 pages)
The rapid development of technologies that generate arrays of gene data enables a global view of the transcription levels of hundreds of thousands of genes simultaneously. The outlier detection problem for gene data is important but is complicated by high dimensionality. The sparsity of data in high-dimensional space makes each point a relatively good outlier in the view of traditional distance-based definitions. Thus, finding outliers in high-dimensional data is more complex. In this paper, some basic outlier analysis algorithms are discussed and a new genetic algorithm is presented. This algorithm finds the best dimension projections based on a revised cell-based algorithm and gives explanations for the solutions. It can solve the outlier detection problem for gene expression data and for other high-dimensional data as well.
Keywords: gene expression data outlier analysis cell-based algorithm genetic algorithm
Mining and Integrating Reliable Decision Rules for Imbalanced Cancer Gene Expression Data Sets (cited by: 4)
10
Authors: Hualong Yu (1), Jun Ni (2), Yuanyuan Dan (3), Sen Xu (4). Affiliations: 1. School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China; 2. Department of Radiology, Carver College of Medicine, The University of Iowa, Iowa City, IA 52242, USA; 3. School of Biology and Chemical Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China; 4. School of Information Engineering, Yancheng Institute of Technology, Yancheng 224051, China. Tsinghua Science and Technology (SCIE, EI, CAS), 2012, Issue 6, pp. 666-673 (8 pages)
There have been many skewed cancer gene expression datasets in the post-genomic era. Extraction of differentially expressed genes or construction of decision rules using these skewed datasets by traditional algorithms will seriously underestimate the performance of the minority class, leading to inaccurate diagnosis in clinical trials. This paper presents a skewed gene selection algorithm that introduces a weighted metric into the gene selection procedure. The extracted genes are paired as decision rules to distinguish both classes, and these decision rules are then integrated into an ensemble learning framework by majority voting to recognize test examples, thus avoiding tedious data normalization and classifier construction. The mining and integrating of a few reliable decision rules gave higher or at least comparable classification performance than many traditional class imbalance learning algorithms on four benchmark imbalanced cancer gene expression datasets.
Keywords: cancer gene expression data class imbalance paired differential expression genes decision rule ensemble learning majority voting
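The idea of pairing genes as decision rules and combining them by majority voting can be sketched as follows. This is a hypothetical illustration only: the rule form (compare the expression of two genes within one sample) and the tie-breaking toward class 0 are assumptions, and the weighted gene-selection metric from the paper is not reproduced.

```python
def pair_rule_vote(sample, gene_pairs):
    """Classify one sample by majority vote over paired-gene rules.

    sample: list of expression values indexed by gene.
    gene_pairs: list of (i, j) index pairs; each rule votes class 1
    when sample[i] > sample[j], otherwise class 0 (assumed rule form).
    """
    votes = [1 if sample[i] > sample[j] else 0 for i, j in gene_pairs]
    ones = sum(votes)
    # A strict majority is needed for class 1; ties fall back to class 0.
    return 1 if ones * 2 > len(votes) else 0
```

Because each rule compares two values inside the same sample, the vote does not change under any monotonic rescaling of that sample, which is consistent with the abstract's remark about avoiding data normalization.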
Constrained query of order-preserving submatrix in gene expression data (cited by: 2)
11
Authors: Tao JIANG, Zhanhuai LI, Xuequn SHANG, Bolin CHEN, Weibang LI, Zhilei YIN. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, Issue 6, pp. 1052-1066 (15 pages)
Order-preserving submatrix (OPSM) has become important in modelling biologically meaningful subspace clusters, capturing the general tendency of gene expressions across a subset of conditions. With the advance of microarray and analysis techniques, large volumes of gene expression datasets and OPSM mining results are produced. OPSM query can efficiently retrieve relevant OPSMs from the huge amount of OPSM datasets. However, improving OPSM query relevancy remains a difficult task in real-life exploratory data analysis. First, it is hard to capture subjective interestingness aspects, e.g., the analyst's expectation given her/his domain knowledge. Second, when these expectations can be declaratively specified, it is still challenging to use them during the computational process of OPSM queries. To the best of our knowledge, existing methods mainly focus on batch OPSM mining, while few works involve OPSM query. To solve the above problems, the paper proposes two constrained OPSM query methods, which exploit user-defined constraints to search relevant results from two kinds of indices introduced. In this paper, extensive experiments are conducted on real datasets, and the experimental results demonstrate that the multi-dimension index (cIndex) and enumerating sequence index (esIndex) based queries have better performance than brute force search.
Keywords: gene expression data OPSM constrained query brute-force search feature sequence cIndex
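As a point of reference for the brute-force baseline mentioned in the abstract, the sketch below scans every gene (row) and keeps those whose values strictly increase along a queried column order. The function names and the `min_rows` constraint are hypothetical; the cIndex and esIndex structures from the paper are not shown.

```python
def supports_order(row, column_order):
    """True if the row's values strictly increase along column_order."""
    values = [row[c] for c in column_order]
    return all(a < b for a, b in zip(values, values[1:]))

def brute_force_opsm_query(matrix, column_order, min_rows=1):
    """Return indices of rows supporting the column order.

    min_rows is an assumed user constraint: if fewer rows match,
    the query reports no OPSM by returning an empty list.
    """
    rows = [r for r, row in enumerate(matrix) if supports_order(row, column_order)]
    return rows if len(rows) >= min_rows else []
```

Each query costs time proportional to the number of rows times the length of the column order, which is the cost an index-based query tries to avoid.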
Cancer classification based on microarray gene expression data using a principal component accumulation method (cited by: 2)
12
Authors: LIU JingJing, CAI WenSheng, SHAO XueGuang. Science China Chemistry (SCIE, EI, CAS), 2011, Issue 5, pp. 802-811 (10 pages)
The classification of cancer is a major research topic in bioinformatics. The high dimensionality and small sample size of gene expression data, however, make the classification quite challenging. Although principal component analysis (PCA) is of particular interest for high-dimensional data, it may overemphasize some aspects and ignore other important information contained in the richly complex data, because it displays only the difference in the first two- or three-dimensional PC subspaces. Based on PCA, a principal component accumulation (PCAcc) method was proposed. It employs the information contained in multiple PC subspaces and improves the class separability of cancers. The effectiveness of the present method was evaluated on four commonly used gene expression datasets, and the results show that the method performs well for cancer classification.
Keywords: cancer classification principal component analysis principal component accumulation gene expression data
Applying Intelligent Computing Techniques to Modeling Biological Networks from Expression Data (cited by: 1)
13
Authors: Wei-Po Lee, Kung-Cheng Yang. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2008, Issue 2, pp. 111-120 (10 pages)
Constructing biological networks is one of the most important issues in systems biology. However, constructing a network from data manually takes a considerable amount of time; therefore an automated procedure is advocated. To automate the procedure of network construction, in this work we use two intelligent computing techniques, genetic programming and neural computation, to infer two kinds of network models that use continuous variables. To verify the presented approaches, experiments have been conducted and the preliminary results show that both approaches can be used to infer networks successfully.
Keywords: reverse engineering system modeling genetic programming recurrent neural network expression data
Identification ACTA2 and KDR as key proteins for prognosis of PD-1/PD-L1 blockade therapy in melanoma (cited by: 2)
14
Authors: Yuchen Wang, Zhaojun Li, Zhihui Zhang, Xiaoguang Chen. Animal Models and Experimental Medicine (CSCD), 2021, Issue 2, pp. 138-150 (13 pages)
Programmed cell death protein 1 (PD-1)/programmed cell death ligand 1 (PD-L1) blockade is an important therapeutic strategy for melanoma, despite its low clinical response. It is important to identify genes and pathways that may reflect the clinical outcomes of this therapy in patients. We analyzed clinical dataset GSE96619, which contains clinical information from five melanoma patients before and after anti-PD-1 therapy (five pairs of data). We identified 704 DEGs using these five pairs of data, and then the number of DEGs was narrowed down to 286 in patients who responded to treatment. Next, we performed KEGG pathway enrichment and constructed a DEG-associated protein-protein interaction network. Smooth muscle actin 2 (ACTA2) and tyrosine kinase growth factor receptor (KDR) were identified as the hub genes, which were significantly downregulated in the tumor tissue of the two patients who responded to treatment. To confirm our analysis, we demonstrated an expression tendency similar to the clinical data for the two hub genes in a B16F10 subcutaneous xenograft model. This study demonstrates that ACTA2 and KDR are valuable responsive markers for PD-1/PD-L1 blockade therapy.
Keywords: expression profiling data hub genes MELANOMA PD-1/PD-L1 blockade therapy
An Algorithm of Programming Data Flow Analysis Based on Data Flow Expression
15
Authors: Zhao Dongfan, Li Wei, Meng Qingkai (Department of Computer Engineering, Changchun Institute of Post and Telecommunication, Changchun 130012, P. R. China). The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 1998, Issue 1, pp. 41-42 (2 pages)
This paper states the basic principle of program data flow analysis in a formal way and gives the concept of data flow expression. On the basis of this concept, an algorithm for finding data flow exceptions is presented. This algorithm is highly general, which makes it easy to develop a tool for program testing, so it is practical in application.
Keywords: software test program analysis data flow analysis data flow expression
PCA-FA:Applying Supervised Learning to Analyze Gene Expression Data
16
Authors: 翁时锋, 张长水, 张学工. Tsinghua Science and Technology (SCIE, EI, CAS), 2004, Issue 4, pp. 428-434 (7 pages)
In previous gene expression data analyses, supervised learning has mainly focused on the classification of attribute data, such as the different experimental conditions, different known classes of the same tumor and sex. However, supervised learning classification is not suitable for interval-scaled attributes, such as age and survival outcome of cancer patients. For this problem, this paper proposes a new method that combines two well-known methods: principal component analysis (PCA) and Fisher analysis (FA). The method, PCA-FA, realizes supervised learning with two types of attributes (nominal attributes and interval-scaled attributes). The fuzzy FA was introduced to model the interval-scaled attributes. In this paper, an approximate linear relationship between gene expression data of lung adenocarcinoma patients and survival outcome is successfully revealed by PCA-FA.
Keywords: supervised learning gene expression data principal component analysis Fisher analysis
Correlating Expression Data with Gene Function Using Gene Ontology
17
Authors: 刘琪, 邓勇, 王川, 石铁流, 李亦学. Chinese Journal of Chemistry (SCIE, CAS, CSCD), 2006, Issue 9, pp. 1247-1254 (8 pages)
Clustering is perhaps one of the most widely used tools for microarray data analysis. Proposed roles for genes of unknown function are inferred from clusters of genes similarly expressed across many biological conditions. However, whether function annotation by similarity metrics is reliable, and to what extent the similarity in gene expression patterns is useful for annotation of gene functions, has not been evaluated. This paper comprehensively examines the correlation between the similarity of expression data and the similarity of gene functions using Gene Ontology. It has been found that although the similarity in expression patterns and the similarity in gene functions are significantly dependent on each other, this association is rather weak. In addition, among the three categories of Gene Ontology, the similarity of expression data is more useful for cellular component annotation than for biological process and molecular function. The results presented are of interest for research on gene function prediction.
Keywords: microarray data gene ontology similarity of expression data function annotation
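The abstract does not say which similarity metric was used for expression profiles. Purely as an illustration, and assuming Pearson correlation as the metric, the similarity of two genes' expression profiles across conditions could be computed as in this sketch; the function name is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumed similarity metric; raises ValueError for a constant profile,
    where the correlation is undefined.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)
```

Pairs of genes with correlation near 1 would count as similarly expressed; the paper's question is how well such pairs also share Gene Ontology annotations.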
Identification of key genes and biological pathways in lung adenocarcinoma by integrated bioinformatics analysis
18
Authors: Lin Zhang, Yuan Liu, Jian-Guo Zhuang, Jie Guo, Yan-Tao Li, Yan Dong, Gang Song. World Journal of Clinical Cases (SCIE), 2023, Issue 23, pp. 5504-5518 (15 pages)
BACKGROUND: The objectives of this study were to identify hub genes and biological pathways involved in lung adenocarcinoma (LUAD) via bioinformatics analysis, and investigate potential therapeutic targets. AIM: To determine reliable prognostic biomarkers for early diagnosis and treatment of LUAD. METHODS: To identify potential therapeutic targets for LUAD, two microarray datasets derived from the Gene Expression Omnibus (GEO) database were analyzed, GSE116959 and GSE118370. Differentially expressed genes (DEGs) in LUAD and normal tissues were identified using the GEO2R tool. The Hiplot database was then used to generate a volcano plot of the DEGs. Weighted gene co-expression network analysis was conducted to cluster the genes in GSE116959 and GSE118370 into different modules, and identify immune genes shared between them. A protein-protein interaction network was established using the Search Tool for the Retrieval of Interacting Genes database, then the CytoNCA and CytoHubba components of Cytoscape software were used to visualize the genes. Hub genes with high scores and co-expression were identified, and the Database for Annotation, Visualization and Integrated Discovery was used to perform enrichment analysis of these genes. The diagnostic and prognostic values of the hub genes were calculated using receiver operating characteristic curves and Kaplan-Meier survival analysis, and gene-set enrichment analysis was conducted. The University of Alabama at Birmingham Cancer data analysis portal was used to analyze relationships between the hub genes and normal specimens, as well as their expression during tumor progression. Lastly, validation of protein expression was conducted on the identified hub genes via the Human Protein Atlas database. RESULTS: Three hub genes with high connectivity were identified: cellular retinoic acid binding protein 2 (CRABP2), matrix metallopeptidase 12 (MMP12), and DNA topoisomerase II alpha (TOP2A). High expression of these genes was associated with a poor LUAD prognosis, and the genes exhibited high diagnostic value. CONCLUSION: Expression levels of CRABP2, MMP12, and TOP2A in LUAD were higher than those in normal lung tissue. This observation has diagnostic value, and is linked to poor LUAD prognosis. These genes may be biomarkers and therapeutic targets in LUAD, but further research is warranted to investigate their usefulness in these respects.
Keywords: Cellular retinoic acid binding protein 2 expression profiling data Hub genes Lung adenocarcinoma Matrix metallopeptidase 12 Topoisomerase II alpha
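A Facial Expression Recognition Method Using Data Fields (cited by: 10)
19
Authors: 王树良, 邹珊珊, 操保华, 谢媛. 武汉大学学报(信息科学版) (EI, CSCD, PKU Core), 2010, Issue 6, pp. 738-742 (5 pages)
A facial expression recognition method using data fields is proposed. Following the data-field approach, the method builds a hierarchical model, extracts concepts from the data, and represents each concept with a feature set. In experiments on the JAFFE facial expression image database, the overall recognition rate reached 94.3%, indicating that the method handles the uncertainty in facial expression recognition fairly effectively.
Keywords: data field pattern recognition data clustering facial expression recognition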
A survey of malware behavior description and analysis (cited by: 5)
20
Authors: Bo YU, Ying FANG, Qiang YANG, Yong TANG, Liu LIU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2018, Issue 5, pp. 583-603 (21 pages)
Behavior-based malware analysis is an important technique for automatically analyzing and detecting malware, and it has received considerable attention from both academic and industrial communities. By considering how malware behaves, we can tackle the malware obfuscation problem, which cannot be processed by traditional static analysis approaches, and we can also derive the as-built behavior specifications and cover the entire behavior space of the malware samples. Although there have been several works focusing on malware behavior analysis, such research is far from mature, and no overviews have been put forward to date to investigate current developments and challenges. In this paper, we conduct a survey on malware behavior description and analysis considering three aspects: malware behavior description, behavior analysis methods, and visualization techniques. First, existing behavior data types and emerging techniques for malware behavior description are explored, especially the goals, principles, characteristics, and classifications of behavior analysis techniques proposed in the existing approaches. Second, the inadequacies and challenges in malware behavior analysis are summarized from different perspectives. Finally, several possible directions are discussed for future research.
Keywords: Malware behavior Static analysis Dynamic Analysis Behavior data expression Behavior analysis machine learning Semantics-based analysis Behavior visualization Malware evolution