Objectives: To identify core symptoms and symptom clusters in patients with neuromyelitis optica spectrum disorder (NMOSD) by network analysis. Methods: From October 10 to 30, 2023, 140 patients with NMOSD were selected to participate in an online questionnaire survey. The survey tools included a general information questionnaire and a self-made NMOSD symptom scale, which covered the prevalence, severity, and distress of 29 symptoms. Cluster analysis was used to identify symptom clusters, and network analysis was used to analyze the symptom network and node characteristics; centrality indices, including strength centrality (r_s), closeness centrality (r_c), and betweenness centrality (r_b), were used to identify core symptoms and symptom clusters. Results: The most common symptom was pain (65.7%), followed by paraesthesia (65.0%), fatigue (65.0%), and easy awakening (63.6%). In terms of symptom burden, pain was the most burdensome symptom, followed by paraesthesia, easy awakening, fatigue, and difficulty falling asleep. Six clusters were identified: somatosensory, motor, visual, and memory symptom clusters, bladder and rectum symptom clusters, sleep symptom clusters, and neuropsychological symptom clusters. Fatigue (r_s = 12.39, r_b = 68.00, r_c = 0.02) was the most central and prominent bridge symptom, and the motor symptom cluster (r_s = 2.68, r_c = 0.10) was the most central of the six clusters. Conclusions: Our study demonstrated the necessity of symptom management targeting fatigue, pain, and the motor symptom cluster in patients with NMOSD.
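To make the strength centrality (r_s) used above concrete, the following is a minimal sketch, assuming an undirected weighted symptom network in which a node's strength is the sum of the absolute weights of its edges. The symptom names and edge weights are hypothetical and are not taken from the study.

```python
def strength_centrality(edges):
    """Return node strength: the sum of absolute edge weights touching each node.

    edges: dict mapping an unordered pair (u, v) to a weight (hypothetical values).
    """
    strength = {}
    for (u, v), weight in edges.items():
        strength[u] = strength.get(u, 0.0) + abs(weight)
        strength[v] = strength.get(v, 0.0) + abs(weight)
    return strength


# Hypothetical partial-correlation style weights between four symptoms.
example_edges = {
    ("fatigue", "pain"): 0.5,
    ("fatigue", "insomnia"): 0.25,
    ("fatigue", "numbness"): 0.25,
    ("pain", "numbness"): 0.125,
}
example_strength = strength_centrality(example_edges)
# The node with the largest strength is a candidate core symptom.
core_symptom = max(example_strength, key=example_strength.get)
```

In this toy network the candidate core symptom is simply the node with the largest total edge weight; closeness and betweenness centrality would need shortest-path computations that are not shown here.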
Failure mode and effect analysis (FMEA) is a preventative risk evaluation method used to evaluate and eliminate failure modes within a system. However, the traditional FMEA method exhibits many deficiencies that pose challenges in practical applications. To improve the conventional FMEA, many modified FMEA models have been suggested. However, the majority of them inadequately address consensus issues and focus on achieving a complete ranking of failure modes. In this research, we propose a new FMEA approach that integrates a two-stage consensus reaching model and a density peak clustering algorithm for the assessment and clustering of failure modes. Firstly, we employ interval 2-tuple linguistic variables (I2TLVs) to express the uncertain risk evaluations provided by FMEA experts. Then, a two-stage consensus reaching model is adopted to enable FMEA experts to reach a consensus. Next, failure modes are categorized into several risk clusters using a density peak clustering algorithm. Finally, the proposed FMEA is illustrated by a case study of load-bearing guidance devices of subway systems. The results show that the proposed FMEA model can more easily describe the uncertain risk information of failure modes by using the I2TLVs; that the introduction of an endogenous feedback mechanism and an exogenous feedback mechanism can accelerate the process of consensus reaching; and that the density peak clustering of failure modes successfully improves the practical applicability of FMEA.
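The clustering step can be illustrated with a minimal sketch of generic density peak clustering on plain numeric points. It assumes a cutoff distance `d_c` and a fixed number of clusters, both hypothetical parameters, and it does not reproduce the I2TLV representation or the consensus model described in the abstract.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Generic density peak clustering sketch; returns one cluster label per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest point that ranks denser
    parent = [-1] * n   # index of that denser neighbour
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour: use the largest distance.
            delta[i] = max(max(row) for row in dist)
        else:
            j = min(order[:rank], key=lambda k: dist[i][k])
            delta[i] = dist[i][j]
            parent[i] = j
    # Centers combine high density with a large distance to any denser point.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the label of its nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```

For risk assessment, each point would stand for one failure mode described by its aggregated risk scores, so the output is a grouping into risk clusters rather than a complete ranking.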
Objective: To improve the efficiency of clustering patents related to COVID-19 by means of a topic extraction algorithm and the BERT model, and to help researchers understand the patent applications for the novel coronavirus. Methods: The weights of the topic vector and the BERT model vector were adjusted by a cross-entropy loss algorithm to obtain a joint vector. Then, the k-means++ algorithm was used for patent clustering after dimension reduction. Results and Conclusion: The model was applied to patents for coronavirus drugs, and five clustering topics were generated. Comparison shows that the clustering results of this model are more concentrated and that the differentiation between clusters is significant. The five clusters generated are analyzed visually to reveal the development status of patents for coronavirus drugs.
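The seeding rule that distinguishes k-means++ from plain k-means can be sketched as follows. This is a minimal illustration on hypothetical low-dimensional points; it covers only the choice of initial centers and does not include the topic/BERT joint vectors or the dimension reduction described above.

```python
import math
import random


def kmeans_pp_seeds(points, k, seed=0):
    """Choose k initial centers with the k-means++ rule (D^2 weighting)."""
    rng = random.Random(seed)
    # The first center is drawn uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        # Far-away points are proportionally more likely to become the next center;
        # points that are already centers have weight zero and cannot be redrawn.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers
```

The returned seeds would then be passed to the usual k-means assignment and update iterations.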
Inter-simple sequence repeat (ISSR) molecular markers were applied to analyze the genetic diversity and clustering of 48 introduced and bred cultivars of Olea europaea L. A total of 106 DNA bands were amplified by 11 screened primers, including 99 polymorphic bands; the percentage of polymorphic loci was 93.40%, indicating rich genetic diversity in Olea europaea L. germplasm resources. Based on Nei's genetic distances between the cultivars, a dendrogram of the 48 cultivars was constructed using the unweighted pair-group method with arithmetic mean (UPGMA), which showed that the 48 cultivars clustered into four main categories; 84.6% of the native cultivars clustered into two categories; and most of the introduced cultivars clustered according to their sources and main uses rather than their geographic origins. This study provides a reference for the utilization and further genetic improvement of Olea europaea L. germplasm resources.
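The dendrogram construction can be sketched with a small UPGMA implementation. This is a minimal illustration, assuming a complete table of pairwise distances; the cultivar labels and distance values are hypothetical, and the merge height reported here is the inter-cluster distance itself rather than half of it.

```python
def upgma(names, pair_dist):
    """Agglomerate clusters by average linkage; return the merges as (a, b, distance)."""
    d = {frozenset(pair): value for pair, value in pair_dist.items()}
    size = {name: 1 for name in names}
    clusters = list(names)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        merges.append((a, b, d[frozenset((a, b))]))
        merged = (a, b)
        rest = [c for c in clusters if c != a and c != b]
        # Size-weighted average distance from the merged cluster to every other one.
        for c in rest:
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        clusters = rest + [merged]
    return merges


# Hypothetical genetic distances between three cultivars.
example_merges = upgma(
    ["A", "B", "C"],
    {("A", "B"): 0.25, ("A", "C"): 0.5, ("B", "C"): 1.0},
)
```

Reading the merges in order gives the dendrogram from the leaves upward; cutting it at a chosen distance yields the main categories.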
In order to mine production and security information from security supervising data and to ensure security and safety in production and decision-making, a clustering analysis algorithm for security supervising data based on a semantic description in coal mines is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Secondly, similarity measurement methods for semantic data and for numerical data are given separately, and a weight-based hybrid similarity measurement method for security supervising data based on a semantic description in coal mines is presented. Thirdly, taking the hybrid similarity measurement method as the distance criterion and drawing on a grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a security supervising data set from coal mines validate the efficiency of the algorithm.
[Objective] This study aimed to develop ACGM markers for the clustering analysis of large-grained Brassica napus materials. [Method] A total of 44 pairs of ACGM primers were designed according to 18 genes related to Arabidopsis grain development and their homologous rape EST sequences. After electrophoresis, 18 pairs of ACGM primers were selected for the clustering analysis of 16 large-grained samples and four fine-grained samples of rapeseed. [Result] The PCR results showed that 2-6 specific bands were amplified by each pair of primers, and all the bands were polymorphic and repeatable, suggesting that the optimized ACGM markers are useful for clustering analysis of B. napus. Clustering analysis revealed that the 20 rapeseed samples were divided into three clusters, A, B, and C, at a similarity coefficient of 0.6. Clusters A and B were further divided into five subclusters, A1, A2, A3, B1, and B2, at a similarity coefficient of 0.67. [Conclusion] This study provides theoretical and practical value for rape breeding.
[Objective] This study aimed to investigate the trace elements in Rehmannia glutinosa Libosch. by using principal component analysis and clustering analysis. [Method] Principal component analysis and clustering analysis of R. glutinosa medicinal materials from different sources were conducted with the contents of six trace elements as indices. [Result] Principal component analysis could comprehensively evaluate the quality of R. glutinosa samples, and its results were objective and consistent with those of the clustering analysis. [Conclusion] Principal component analysis and clustering analysis can be used for the quality evaluation of Chinese medicinal materials with multiple indices.
The characterization and clustering of rock discontinuity sets are a crucial and challenging task in rock mechanics and geotechnical engineering. Over the past few decades, the clustering of discontinuity sets has undergone rapid and remarkable development. However, there is no relevant literature summarizing these achievements, and this paper attempts to elaborate on the current status and prospects in this field. Specifically, this review aims to discuss the development process of clustering methods for discontinuity sets and the state-of-the-art relevant algorithms. First, we introduce the importance of discontinuity clustering analysis, followed by the comprehensive characterization approaches for discontinuity data. A bibliometric analysis is subsequently conducted to clarify the current status and development characteristics of the clustering of discontinuity sets. The methods for the clustering analysis of rock discontinuities are reviewed in terms of single- and multi-parameter clustering methods. Single-parameter methods can be classified into empirical judgment methods, dynamic clustering methods, relative static clustering methods, and static clustering methods, reflecting the continuous optimization and improvement of clustering algorithms. Moreover, this paper compares the current mainstream single-parameter clustering methods with multi-parameter clustering methods. It is emphasized that the current single-parameter clustering methods have reached their performance limits, with little room for improvement, and that there is a need to extend the study of multi-parameter clustering methods. Finally, several suggestions are offered for future research on the clustering of discontinuity sets.
Pseudoalteromonas is a group of marine bacteria that is widespread in diverse marine sediments and produces a wide range of bioactive compounds. However, only a limited number of Pseudoalteromonas phages have been isolated and studied. In this study, a novel lytic Pseudoalteromonas phage, denoted vB_PalP_Y7, was isolated from sewage samples collected at the Seafood Market in Qingdao, China. vB_PalP_Y7 remained stable across a wide range of temperatures (-20 to 50 °C) and a wide pH range (3 to 12). The vB_PalP_Y7 phage harbors a linear double-stranded DNA molecule of 57,699 base pairs (bp) with a G+C content of 45.90%, and it is predicted to contain 58 open reading frames (ORFs). Phylogenetic analysis and protein network relationship analysis revealed low similarity between vB_PalP_Y7 and the viruses in the ICTV and IMG/VR4 databases, suggesting that vB_PalP_Y7 may represent a potential new genus, Miuvirus. This study contributes valuable insights into the relationship between Pseudoalteromonas phages and their host organisms.
Efficient iterative unsupervised machine learning involving probabilistic clustering analysis with the expectation-maximization (EM) clustering algorithm is applied to categorize reservoir facies by exploiting latent and observable well-log variables from a clastic reservoir in the Majnoon oilfield, southern Iraq. The observable well-log variables consist of conventional open-hole well-log data and the computer-processed interpretation of gamma rays, bulk density, neutron porosity, compressional sonic, deep resistivity, shale volume, total porosity, and water saturation, from three wells located in the Nahr Umr reservoir. The latent variables include shale volume and water saturation. The EM algorithm efficiently characterizes electrofacies through iterative machine learning to identify the local maximum likelihood estimates (MLE) of the observable and latent variables in the studied dataset. The optimized EM model developed successfully predicts the core-derived facies classification in two of the studied wells. The EM model clusters the data into three distinctive reservoir electrofacies (F1, F2, and F3). F1 represents a gas-bearing electrofacies with low shale volume (Vsh) and water saturation (Sw) and high porosity and permeability values, identifying it as an attractive reservoir target. The results of the EM model are validated using nuclear magnetic resonance (NMR) data from the third studied well, for which no cores were recovered. The NMR results confirm the effectiveness and accuracy of the EM model in predicting electrofacies. The utilization of the EM algorithm for electrofacies classification/cluster analysis is innovative. Specifically, the clusters it establishes are less rigidly constrained than those derived from the more commonly used K-means clustering method. The EM methodology developed generates dependable electrofacies estimates in the studied reservoir intervals where core samples are not available. Therefore, once calibrated with core data in some wells, the model is suitable for application to other wells that lack core data.
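The alternation between soft assignment and parameter update that EM performs can be sketched on a one-dimensional Gaussian mixture. This is a minimal illustration, assuming a single hypothetical log variable and user-supplied starting values; the study itself works with several well-log variables, and its model settings are not given in the abstract.

```python
import math


def em_gmm_1d(xs, mu, sigma, pi, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM; returns means, std devs, weights, responsibilities."""
    mu, sigma, pi = list(mu), list(sigma), list(pi)
    k = len(mu)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability (responsibility) of each component for each sample.
        resp = []
        for x in xs:
            dens = [
                pi[j]
                * math.exp(-0.5 * ((x - mu[j]) / sigma[j]) ** 2)
                / (sigma[j] * math.sqrt(2.0 * math.pi))
                for j in range(k)
            ]
            total = sum(dens)
            resp.append([v / total for v in dens])
        # M-step: re-estimate each component from its responsibility-weighted samples.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigma[j] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            pi[j] = nj / len(xs)
    return mu, sigma, pi, resp
```

Because each sample keeps a probability of belonging to every cluster, the resulting groups are softer than the hard partitions produced by K-means, which is the flexibility the abstract refers to.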
A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, the location must have been monitored previously in some way for this type of information to have been recorded. Another significant limitation is the need for information regarding the location and timing of incidents. Despite the current ease of obtaining location information (GPS, drone images, etc.), the timing of the event remains challenging to ascertain for a considerable portion of landslide data. Rainfall can be monitored in multiple ways, for instance by examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h) or by calculating effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rain monitoring approach are defined manually and subjectively, relying on the operators' experience. This makes the process labor-intensive and time-consuming, hindering the establishment of a truly standardized and rapidly scalable methodology. In this work, we propose a Landslide Early Warning System (LEWS) based on the concept of rainfall half-life and the determination of thresholds using cluster analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one utilized by Cemaden, Brazil's National Center for Monitoring and Early Warning of Natural Disasters.
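One way to read the rainfall half-life concept is as an exponentially decaying rainfall memory, sketched below. This is an assumption made for illustration: the abstract does not give the formula, and the half-life, time step, and threshold values used here are hypothetical.

```python
def effective_rainfall(rain, half_life_h, step_h=1.0):
    """Accumulate rainfall so that each contribution halves every `half_life_h` hours.

    rain: rainfall depth per time step (for example mm per hour).
    """
    decay = 0.5 ** (step_h / half_life_h)  # fraction of past rainfall kept per step
    total = 0.0
    series = []
    for depth in rain:
        total = total * decay + depth
        series.append(total)
    return series


def alerts(series, threshold):
    """Flag the time steps at which effective rainfall reaches the threshold."""
    return [value >= threshold for value in series]
```

With a single half-life parameter in place of several fixed accumulation windows, the threshold itself could then be estimated from data, for instance by clustering, rather than being set by hand.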
This paper investigates the design essence of Chinese classical private gardens, integrating their design elements and fundamental principles. It systematically analyzes the unique characteristics of, and differences among, classical private gardens in the Northern, Jiangnan, and Lingnan regions. The study examines nine classical private gardens from Northern China, Jiangnan, and Lingnan using principal component cluster analysis. Based on literature analysis and field research, 273 variables were selected for principal component analysis, from which the four components with the highest contribution rates were chosen for further study. Subsequently, clustering analysis was employed to compare the differences among the three types of gardens. The results reveal that the first principal component effectively highlights the differences between Jiangnan and Lingnan private gardens. The second principal component is the key to defining the types of Northern private gardens and distinguishing them from the other two types, and the third principal component indicates that Lingnan private gardens can also be categorized into two distinct types.
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such a structure. In these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of the clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated on several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Remarkable progress has been made in infection prevention and control (IPC) in many countries, but some gaps emerged in the context of the coronavirus disease 2019 (COVID-19) pandemic. Core capabilities such as standard clinical precautions and tracing the source of infection were the focus of IPC in medical institutions during the pandemic. Therefore, the core competences of IPC professionals during the pandemic, and how these contributed to successful prevention and control of the epidemic, should be studied. Using a systematic review and cluster analysis, we investigated the fundamental improvements in the competences of IPC professionals that may be emphasized in light of the COVID-19 pandemic. We searched the PubMed, Embase, Cochrane Library, Web of Science, CNKI, WanFang Data, and CBM databases for original articles exploring the core competencies of IPC professionals during the COVID-19 pandemic (from January 1, 2020 to February 7, 2023). Weiciyun software was used for data extraction, and the Donohue formula was followed to distinguish high-frequency technical terms. Cluster analysis was performed using the within-group linkage method, with squared Euclidean distance as the metric, to determine the priority competencies for development. We identified 46 studies with 29 high-frequency technical terms. The most common term was "infection prevention and control training" (184 times, 17.3%), followed by "hand hygiene" (172 times, 16.2%). "Infection prevention and control in clinical practice" was the most-reported core competency (367 times, 34.5%), followed by "microbiology and surveillance" (292 times, 27.5%). Cluster analysis showed two key areas of competence: Category 1 (program management and leadership, patient safety and occupational health, education, and microbiology and surveillance) and Category 2 (IPC in clinical practice). During the COVID-19 pandemic, IPC program management and leadership, microbiology and surveillance, education, patient safety, and occupational health were the most important areas for development and should be given due consideration by IPC professionals.
The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was compiled from unconfined compression tests, Proctor tests, and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations, and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, which leads to wrong classification when these techniques are applied. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. The algorithm clusters the data on the basis of a graph. Its core idea is that seismic data are regarded as points in space, and the points are connected by edges to construct a graph. When the graph is divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the edges inside each subgraph should be as high as possible. The spectral clustering algorithm, however, has high computational complexity and entails large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. By selecting a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented, which reduces the cost of the spectral clustering operation. Verification on a physical model and on field data shows that the proposed approach can obtain more accurate seismic facies classification results without requiring the data to satisfy any distributional hypothesis. The computing efficiency of the new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
Five factors expressing greenbelt quality and one factor expressing quantity were adopted for the evaluation of residential greenbelts, and the AHP (Analytical Hierarchy Process) method was used to determine the values of the factors. Thirty residential areas were selected as samples. Two principal components were extracted and their expressions were constructed by factor analysis, from which a quality evaluation of the residential greenbelts was obtained. The accuracy of the evaluation function and of the resulting quality classification of the residential greenbelts in Xinxiang City was validated by clustering analysis. The results showed that the greenbelt quality of fourteen residential areas was higher than the average level, of which eleven were newly built residential areas. The 30 residential areas were classified into three types by clustering analysis according to their greenbelt features and their formation. Finally, proposals based on these evaluation results were put forward to guide the construction and renewal of residential greenbelts.
An evaluation index is a prerequisite for the scientific evaluation of a public meteorological service. This paper aims to explore a technical method for determining and screening evaluation indicators. Based on public satisfaction survey data obtained in Wafangdian, China in 2010, this study investigates the suitability of the fuzzy clustering analysis method for establishing an evaluation index. Through quantitative analysis of multilayer fuzzy clustering of various evaluation indicators, correlation analysis indicates that if the clustering results are identical for two evaluation indicators in the same sub-evaluation layer, then one indicator can be removed or the two indicators merged. For evaluation indicators in different sub-evaluation layers, although clustering reveals attribute correlations, these indicators may not be substituted for one another. Analysis of the applicability of the fuzzy clustering method shows that it plays a certain role in the establishment and correction of an evaluation index.
Effective storage, processing, and analysis of power device condition monitoring data face enormous challenges. A framework based on the Aliyun DTplus platform is proposed that supports both MapReduce and Graph at the same time for the analysis of massive monitoring data. First, power device condition monitoring data storage based on MaxCompute tables and parallel permutation entropy feature extraction based on MaxCompute MapReduce are designed and implemented on the DTplus platform. Then, a Graph-based k-means algorithm is implemented and used for the clustering analysis of massive condition monitoring data. Finally, performance tests are carried out to compare the execution time of the serial program with that of the parallel program. Performance is analyzed in terms of CPU core consumption, memory utilization, and parallel granularity. Experimental results show that the designed framework and parallel algorithms can efficiently process massive power device condition monitoring data.
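The permutation entropy feature mentioned above can be sketched serially as follows. This is a minimal single-machine illustration with a hypothetical embedding order and delay; the MaxCompute MapReduce parallelization described in the abstract is not reproduced here.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy of a sequence, between 0 and 1."""
    counts = Counter()
    n_windows = len(series) - (order - 1) * delay
    for start in range(n_windows):
        window = [series[start + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by log(order!) so that a uniform pattern distribution gives 1.
    return entropy / math.log(math.factorial(order))
```

A perfectly monotonic signal yields a single ordinal pattern and therefore zero entropy, while irregular signals spread over many patterns and score closer to 1, which is what makes the value usable as a clustering feature.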
A novel multivariate similarity clustering analysis (MSCA) approach was used to estimate a biogeographical division scheme for the global terrestrial fauna and was compared against other widely used clustering algorithms. The faunal dataset included almost all terrestrial and freshwater fauna, a total of 4631 families, 141,814 genera, and 1,334,834 species. Our findings demonstrated that suitable results were obtained only with the MSCA method, which produced distinct hierarchies and reasonable structuring and, furthermore, conformed to biogeographical criteria. A total of seven kingdoms and 20 sub-kingdoms were identified. We found that the clustering results for the higher and lower animals did not differ significantly, leading us to consider the result convincing as the first zoogeographical division scheme for all global terrestrial animals.
Funding (NMOSD symptom cluster study): Supported by the Specific Research Fund for Top-notch Talents of Guangdong Provincial Hospital of Chinese Medicine (No. 2022KT1188).
Funding (FMEA consensus and clustering study): Supported by the Fundamental Research Funds for the Central Universities (22120240094) and the Humanities and Social Science Fund of the Ministry of Education of China (22YJA630082).
Abstract: Objective: To improve the efficiency of clustering COVID-19-related patents through a topic extraction algorithm and the BERT model, and to help researchers understand patent applications related to the novel coronavirus. Methods: The weights of the topic vector and the BERT model vector were adjusted by a cross-entropy loss algorithm to obtain a joint vector. Then, the k-means++ algorithm was used for patent clustering after dimension reduction. Results and Conclusion: The model was applied to patents for coronavirus drugs, and five clustering topics were generated. Comparison shows that the clustering results of this model are more concentrated and that the differentiation between clusters is significant. The five clusters generated are visually analyzed to reveal the development status of patents for coronavirus drugs.
Funding: Supported by the Key Project of New Product Development in Yunnan Province (2009BB006).
Abstract: Inter-simple sequence repeat (ISSR) molecular markers were applied to analyze the genetic diversity and clustering of 48 introduced and bred cultivars of Olea europaea L. In total, 106 DNA bands were amplified by 11 screened primers, including 99 polymorphic bands; the percentage of polymorphic loci was 93.40%, indicating rich genetic diversity in Olea europaea L. germplasm resources. Based on Nei's genetic distances between cultivars, a dendrogram of the 48 cultivars of Olea europaea L. was constructed using the unweighted pair-group method with arithmetic mean (UPGMA), which showed that the 48 cultivars were clustered into four main categories; 84.6% of native cultivars were clustered into two categories; most of the introduced cultivars were clustered according to their sources and main usages rather than their geographic origins. This study provides a reference for the utilization and further genetic improvement of Olea europaea L. germplasm resources.
Funding: The National Natural Science Foundation of China (No. 50674086), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508), and the Postdoctoral Scientific Program of Jiangsu Province (No. 0701045B).
Abstract: In order to mine production and security information from security supervising data and to ensure security and safety in production and decision-making, a clustering analysis algorithm for security supervising data based on a semantic description in coal mines is studied. First, the semantic and numerical hybrid description method for security supervising data in coal mines is described. Secondly, the similarity measurement methods for semantic and numerical data are given separately, and a weight-based hybrid similarity measurement method for security supervising data based on a semantic description in coal mines is presented. Thirdly, taking the hybrid similarity measurement method as the distance criterion and borrowing from grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a security supervising data set from coal mines validate the efficiency of the algorithm.
Funding: Supported by the National Natural Science Foundation of China (30860147), the Open Funds of the National Key Laboratory of Crop Genetic Improvement (ZK200902), and the Natural Science Foundation of Yunnan Province (2011FB117).
Abstract: [Objective] This study aimed to develop ACGM markers for the clustering analysis of large-grained Brassica napus materials. [Method] A total of 44 pairs of ACGM primers were designed according to 18 genes related to Arabidopsis grain development and their homologous rape EST sequences. After electrophoresis, 18 pairs of ACGM primers were selected for the clustering analysis of 16 large-grained samples and four fine-grained samples of rapeseed. [Result] PCR results showed that 2-6 specific bands were amplified by each pair of primers, and all the bands were polymorphic and repeatable, suggesting that the optimized ACGM markers were useful for clustering analysis of B. napus. Clustering analysis revealed that the 20 rapeseed samples were divided into three clusters, A, B, and C, at a similarity coefficient of 0.6. Clusters A and B were further divided into five sub-clusters, A1, A2, A3, B1, and B2, at a similarity coefficient of 0.67. [Conclusion] This study provides theoretical and practical value for rape breeding.
Funding: Supported by the Fund of Sichuan Provincial Administration of Traditional Chinese Medicine (2008-12).
Abstract: [Objective] This study aimed to investigate the trace elements in Rehmannia glutinosa Libosch. by using principal component analysis and clustering analysis. [Method] Principal component analysis and clustering analysis of R. glutinosa medicinal materials from different sources were conducted with the contents of six trace elements as indices. [Result] Principal component analysis could comprehensively evaluate the quality of R. glutinosa samples, with objective results that were consistent with the results of clustering analysis. [Conclusion] Principal component analysis and clustering analysis can be used for the quality evaluation of Chinese medicinal materials with multiple indices.
Funding: Funding support from the National Natural Science Foundation of China (Grant No. 42007269), the Young Talent Fund of Xi'an Association for Science and Technology (Grant No. 959202313094), and the Fundamental Research Funds for the Central Universities, CHD (Grant No. 300102263401).
Abstract: The characterization and clustering of rock discontinuity sets are a crucial and challenging task in rock mechanics and geotechnical engineering. Over the past few decades, the clustering of discontinuity sets has undergone rapid and remarkable development. However, no literature has yet summarized these achievements, and this paper attempts to elaborate on the current status and prospects of the field. Specifically, this review discusses the development of clustering methods for discontinuity sets and the state-of-the-art algorithms. First, we introduce the importance of discontinuity clustering analysis, followed by the comprehensive characterization approaches for discontinuity data. A bibliometric analysis is subsequently conducted to clarify the current status and development characteristics of the clustering of discontinuity sets. The methods for the clustering analysis of rock discontinuities are reviewed in terms of single- and multi-parameter clustering methods. Single-parameter methods can be classified into empirical judgment methods, dynamic clustering methods, relative static clustering methods, and static clustering methods, reflecting the continuous optimization and improvement of clustering algorithms. Moreover, this paper compares the current mainstream single-parameter clustering methods with multi-parameter clustering methods. It is emphasized that the current single-parameter clustering methods have reached their performance limits, with little room for improvement, and that the study of multi-parameter clustering methods needs to be extended. Finally, several suggestions are offered for future research on the clustering of discontinuity sets.
Funding: The National Natural Science Foundation of China (Nos. 42188102, 42120104006, 41976117, 42176111), the Fundamental Research Funds for the Central Universities (Nos. 202172002, 201812002), and funding from Andrew McMinn.
Abstract: Pseudoalteromonas is a group of marine bacteria widespread in diverse marine sediments, producing a wide range of bioactive compounds. However, only a limited number of Pseudoalteromonas phages have been isolated and studied. In this study, a novel lytic Pseudoalteromonas phage, denoted vB_PalP_Y7, was isolated from sewage samples collected at the Seafood Market in Qingdao, China. vB_PalP_Y7 remained stable across a wide range of temperatures (-20 to 50 °C) and a wide pH range (3 to 12). The vB_PalP_Y7 phage harbors a linear double-stranded DNA molecule of 57,699 base pairs (bp) with a G+C content of 45.90%. Furthermore, it is predicted to contain 58 open reading frames (ORFs). Phylogenetic analysis and protein network relationship analysis revealed low similarity between vB_PalP_Y7 and viruses in the ICTV and IMG/VR4 databases, suggesting that vB_PalP_Y7 may represent a potential new genus, Miuvirus. This study contributes valuable insights into the relationship between Pseudoalteromonas phages and their host organisms.
Abstract: Efficient iterative unsupervised machine learning involving probabilistic clustering analysis with the expectation-maximization (EM) clustering algorithm is applied to categorize reservoir facies by exploiting latent and observable well-log variables from a clastic reservoir in the Majnoon oilfield, southern Iraq. The observable well-log variables consist of conventional open-hole, well-log data and the computer-processed interpretation of gamma rays, bulk density, neutron porosity, compressional sonic, deep resistivity, shale volume, total porosity, and water saturation, from three wells located in the Nahr Umr reservoir. The latent variables include shale volume and water saturation. The EM algorithm efficiently characterizes electrofacies through iterative machine learning to identify the local maximum likelihood estimates (MLE) of the observable and latent variables in the studied dataset. The optimized EM model developed successfully predicts the core-derived facies classification in two of the studied wells. The EM model clusters the data into three distinctive reservoir electrofacies (F1, F2, and F3). F1 represents a gas-bearing electrofacies with low shale volume (Vsh) and water saturation (Sw) and high porosity and permeability values, identifying it as an attractive reservoir target. The results of the EM model are validated using nuclear magnetic resonance (NMR) data from the third studied well, for which no cores were recovered. The NMR results confirm the effectiveness and accuracy of the EM model in predicting electrofacies. The utilization of the EM algorithm for electrofacies classification/cluster analysis is innovative. Specifically, the clusters it establishes are less rigidly constrained than those derived from the more commonly used K-means clustering method. The EM methodology developed generates dependable electrofacies estimates in the studied reservoir intervals where core samples are not available. Therefore, once calibrated with core data in some wells, the model is suitable for application to other wells that lack core data.
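To make the expectation-maximization idea in this abstract concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. It is a generic textbook formulation, not the authors' optimized model: the function name `em_gmm_1d`, the initial variance, the variance floor, and the eight shale-volume-like values are all assumptions chosen for illustration.

```python
import math


def em_gmm_1d(data, init_means, init_var=0.01, n_iter=20):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor guards against a collapsed component
    return weights, means, variances


# Hypothetical shale-volume fractions from two visibly different facies.
vsh = [0.10, 0.12, 0.15, 0.11, 0.80, 0.85, 0.78, 0.82]
weights, means, variances = em_gmm_1d(vsh, init_means=[min(vsh), max(vsh)])
```

Unlike K-means, each sample keeps a soft responsibility for every component, which is one way to read the abstract's remark that EM clusters are less rigidly constrained.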
Abstract: A significant portion of Landslide Early Warning Systems (LEWS) relies on the definition of operational thresholds and the monitoring of cumulative rainfall for alert issuance. These thresholds can be obtained in various ways, but most often they are based on previous landslide data. This approach introduces several limitations. For instance, the location must have been monitored previously in some way for this type of information to be recorded. Another significant limitation is the need for information regarding the location and timing of incidents. Despite the current ease of obtaining location information (GPS, drone images, etc.), the timing of the event remains challenging to ascertain for a considerable portion of landslide data. Concerning rainfall monitoring, there are multiple ways to consider it, for instance, examining accumulations over various intervals (1 h, 6 h, 24 h, 72 h), as well as calculating effective rainfall, which represents the precipitation that actually infiltrates the soil. However, in the vast majority of cases, both the thresholds and the rain monitoring approach are defined manually and subjectively, relying on the operators' experience. This makes the process labor-intensive and time-consuming, hindering the establishment of a truly standardized and rapidly scalable methodology. In this work, we propose a LEWS based on the concept of rainfall half-life and the determination of thresholds using cluster analysis and data inversion. The system is designed to be applied in extensive monitoring networks, such as the one utilized by Cemaden, Brazil's National Center for Monitoring and Early Warning of Natural Disasters.
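The abstract does not give a formula for rainfall half-life. One common reading is an exponentially decayed rainfall sum in which each hour's rain loses half its contribution every half-life. The sketch below assumes that reading; `effective_rainfall`, the hourly record, and the 2 h half-life are hypothetical and not taken from the paper.

```python
def effective_rainfall(hourly_rain_mm, half_life_h):
    """Decay-weighted rainfall total; the last list entry is the most recent hour."""
    n = len(hourly_rain_mm)
    return sum(
        rain * 0.5 ** ((n - 1 - i) / half_life_h)  # age in hours / half-life
        for i, rain in enumerate(hourly_rain_mm)
    )


# Hypothetical record: 8 mm fell two hours ago, 4 mm in the latest hour.
rain = [8.0, 0.0, 4.0]
value = effective_rainfall(rain, half_life_h=2.0)
```

With a 2 h half-life, the 8 mm that fell two hours ago counts as 4 mm, so older rain fades smoothly instead of dropping out at a fixed accumulation window.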
Abstract: This paper investigates the design essence of Chinese classical private gardens, integrating their design elements and fundamental principles. It systematically analyzes the unique characteristics of, and differences among, classical private gardens in the Northern, Jiangnan, and Lingnan regions. The study examines nine classical private gardens from Northern China, Jiangnan, and Lingnan using principal component cluster analysis. Based on literature analysis and field research, 273 variables were selected for principal component analysis, from which four components with higher contribution rates were chosen for further study. Subsequently, we employed clustering analysis to compare the differences among the three types of gardens. The results reveal that the first principal component effectively highlights the differences between Jiangnan and Lingnan private gardens. The second principal component is the key to defining the types of Northern private gardens and distinguishing them from the other two types, and the third principal component indicates that Lingnan private gardens can also be categorized into two distinct types.
Abstract: Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Funding: The National Natural Science Foundation of China, Grant/Award Number: 52178080; Major Research Project of the Hospital Management Research Institute of the National Health Commission, Grant/Award Number: GY2023011; National Institute of Hospital Administration Management of China, Grant/Award Number: GY2023049.
Abstract: Remarkable progress has been made in infection prevention and control (IPC) in many countries, but some gaps emerged in the context of the coronavirus disease 2019 (COVID-19) pandemic. Core capabilities such as standard clinical precautions and tracing the source of infection were the focus of IPC in medical institutions during the pandemic. Therefore, the core competences of IPC professionals during the pandemic, and how these contributed to successful prevention and control of the epidemic, should be studied. This study used a systematic review and cluster analysis to investigate fundamental improvements in the competences of IPC professionals that may be emphasized in light of the COVID-19 pandemic. We searched the PubMed, Embase, Cochrane Library, Web of Science, CNKI, WanFang Data, and CBM databases for original articles exploring core competencies of IPC professionals during the COVID-19 pandemic (from January 1, 2020 to February 7, 2023). Weiciyun software was used for data extraction, and the Donohue formula was followed to distinguish high-frequency technical terms. Cluster analysis was performed using the within-group linkage method with squared Euclidean distance as the metric to determine the priority competencies for development. We identified 46 studies with 29 high-frequency technical terms. The most common term was "infection prevention and control training" (184 times, 17.3%), followed by "hand hygiene" (172 times, 16.2%). "Infection prevention and control in clinical practice" was the most-reported core competency (367 times, 34.5%), followed by "microbiology and surveillance" (292 times, 27.5%). Cluster analysis showed two key areas of competence: Category 1 (program management and leadership, patient safety and occupational health, education, and microbiology and surveillance) and Category 2 (IPC in clinical practice). During the COVID-19 pandemic, IPC program management and leadership, microbiology and surveillance, education, patient safety, and occupational health were the most important focus of development and should be given due consideration by IPC professionals.
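The Donohue formula mentioned in this abstract is commonly stated in bibliometrics as T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the number of terms that occur exactly once; terms whose frequency exceeds T are treated as high-frequency. The sketch below assumes that form and the "greater than T" convention. The term counts are invented and do not reproduce the study's data.

```python
import math


def donohue_threshold(term_counts):
    """High/low-frequency boundary from the number of terms seen exactly once."""
    singletons = sum(1 for count in term_counts.values() if count == 1)
    return (-1 + math.sqrt(1 + 8 * singletons)) / 2


# Hypothetical term frequencies; three terms occur only once.
counts = {
    "hand hygiene": 5,
    "training": 4,
    "surveillance": 2,
    "ventilation": 1,
    "triage": 1,
    "audit": 1,
}
threshold = donohue_threshold(counts)
high_frequency = sorted(term for term, c in counts.items() if c > threshold)
```

Here three singleton terms give a threshold of 2, so only the terms counted more than twice are retained for clustering.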
Abstract: The goal of this study was to optimize the constitutive parameters of foundation soils using a k-means algorithm with clustering analysis. A database was collected from unconfined compression tests, Proctor tests, and grain distribution tests of soils taken from three different types of foundation pits: raft foundations, partial raft foundations, and strip foundations. The k-means algorithm with clustering analysis was applied to determine the most appropriate foundation type given the unconfined compression strengths and other parameters of the different soils.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. U1562218, 41604107, and 41804126).
Abstract: Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, which leads to wrong classification when the technology is applied. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. The algorithm clusters the data based on the idea of a graph. Its core idea is that seismic data are regarded as points in space, and the points are connected by edges to construct a graph. When the graph is divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the edges inside a subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and entails large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. Through the selection of a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented to reduce the cost of the spectral clustering operation. Verification on a physical model and field data shows that the proposed approach can obtain more accurate seismic facies classification results without requiring the data to satisfy any distributional assumption. The computing efficiency of the new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
Funding: Supported by the Science and Technology Project of Henan Provincial Science and Technology Department (No. 0424490012) and the Major Program of Henan Institute of Science and Technology (No. 040132).
Abstract: Five factors expressing greenbelt quality and one factor expressing quantity were adopted for the evaluation of residential greenbelts, and the AHP (Analytical Hierarchy Process) method was used to determine the values of the factors. Thirty residential areas were selected as samples. Two principal components were extracted and their expressions were constructed by factor analysis, from which the quality evaluation of residential greenbelts was obtained. The accuracy of the evaluation function and of the resulting quality classification of residential greenbelts in Xinxiang City was validated by clustering analysis. The results showed that the greenbelt quality of fourteen residential areas was higher than the average level, of which eleven were newly built residential areas. The 30 residential areas were classified into three types according to their greenbelt features and formation by clustering analysis. Finally, proposals based on these evaluation results were put forward to provide guidance for the construction and renewal of residential greenbelts.
Funding: National Science Foundation of China (91637105, 41775048, and 41475041), National Key R&D Program of China (2018YFC1507800), and Research on Tourism Traffic Meteorological Service Products in Heilongjiang Province (HQZD2017004).
Abstract: An evaluation index is a prerequisite for the scientific evaluation of a public meteorological service. This paper aims to explore a technical method for determining and screening evaluation indicators. Based on public satisfaction survey data obtained in Wafangdian, China in 2010, this study investigates the suitability of the fuzzy clustering analysis method for establishing an evaluation index. Through quantitative analysis of multilayer fuzzy clustering of various evaluation indicators, correlation analysis indicates that if the clustering results are identical for two evaluation indicators in the same sub-evaluation layer, then one indicator can be removed or the two indicators merged. For evaluation indicators in different sub-evaluation layers, although clustering reveals attribute correlations, these indicators may not be substituted for one another. Analysis of the applicability of the fuzzy clustering method shows that it plays a certain role in the establishment and correction of an evaluation index.
Funding: This work has been supported by the Central University Research Fund (No. 2016MS116, No. 2016MS117, No. 2018MS074) and the National Natural Science Foundation (51677072).
Abstract: The effective storage, processing, and analysis of power device condition monitoring data face enormous challenges. A framework based on the Aliyun DTplus platform is proposed that supports both MapReduce and Graph for massive monitoring data analysis at the same time. First, power device condition monitoring data storage based on MaxCompute tables and parallel permutation entropy feature extraction based on MaxCompute MapReduce are designed and implemented on the DTplus platform. Then, a Graph-based k-means algorithm is implemented and used for clustering analysis of massive condition monitoring data. Finally, performance tests are performed to compare the execution time of the serial and parallel programs. Performance is analyzed in terms of CPU core consumption, memory utilization, and parallel granularity. Experimental results show that the designed framework and parallel algorithms can efficiently process massive power device condition monitoring data.
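The permutation entropy feature named in this abstract can be illustrated with a short serial sketch of the standard ordinal-pattern definition. The paper's MapReduce parallelization on MaxCompute is not reproduced here; `permutation_entropy`, the embedding order of 3, and the sample series are assumptions for illustration.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3):
    """Shannon entropy (in nats) of the ordinal patterns of length `order`."""
    patterns = Counter(
        # Ordinal pattern: window indices sorted by their sample value.
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(len(series) - order + 1)
    )
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


# Hypothetical monitoring signals: a steady ramp and an oscillating reading.
ramp_entropy = permutation_entropy([1, 2, 3, 4, 5])
oscillating_entropy = permutation_entropy([1, 3, 2, 4, 3, 5])
```

A monotonic signal produces a single ordinal pattern and therefore zero entropy, while the oscillating signal alternates between two patterns; because each window is scored independently, the counting step splits naturally across parallel workers.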
Abstract: A novel multivariate similarity clustering analysis (MSCA) approach was used to estimate a biogeographical division scheme for the global terrestrial fauna and was compared against other widely used clustering algorithms. The faunal dataset included almost all terrestrial and freshwater fauna, a total of 4631 families, 141,814 genera, and 1,334,834 species. Our findings demonstrated that suitable results were obtained only with the MSCA method, which yielded distinct hierarchies and reasonable structuring and conformed to biogeographical criteria. A total of seven kingdoms and 20 sub-kingdoms were identified. We found that the clustering results for the higher and lower animals did not differ significantly, leading us to consider the result convincing as the first zoogeographical division scheme covering all terrestrial animals worldwide.