Zero-shot learning enables the recognition of samples from new classes by transferring models learned from semantic features and existing sample features to classes that have never been seen before. Consistency between different types of features and domain shift are two critical issues in zero-shot learning. To address both, this paper proposes a new model structure. The traditional approach maps semantic features and visual features into the same feature space; building on this, the proposed model uses a dual discriminator approach. This approach further enhances the consistency between semantic and visual features. It also aligns unseen-class semantic features with training set samples, providing some information about the unseen classes. In addition, the model introduces a new feature fusion method. This method is equivalent to adding perturbation to the seen-class features, which reduces the degree to which the classification results are biased towards the seen classes. The feature fusion method also provides some information about the unseen classes, improving classification accuracy in generalized zero-shot learning and reducing domain bias. The proposed method is validated and compared with other methods on four datasets, and the experimental results show that it achieves promising results.
Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the "selection of the optimal processing route" strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address this issue, this work proposes a method that uses Aspen HYSYS and full two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. A deep neural network prediction model is then developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytically cracked raw oil in a refinery. The validation and test sets exhibit R² values of 0.99769 and 0.99807, respectively, and the average relative error of molecular composition prediction for raw materials of the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive exPlanations) interpretation method is used to reveal the relationships among different variables through global and local weight comparisons and correlation analyses.
As an essential function of encrypted Internet traffic analysis, encrypted traffic service classification can support both coarse-grained network service traffic management and security supervision. However, the traditional plaintext-based Deep Packet Inspection (DPI) method cannot be applied to such a classification. Moreover, existing machine learning-based methods encounter two problems during feature selection: costly processing of complex features and Transport Layer Security (TLS) version discrepancy. In this paper, we consider the differences between encrypted network protocol stacks and propose a composite deep learning-based method for multiprotocol environments. The method uses a sliding multiple Protocol Data Unit (multiPDU) length sequence as features, fully exploiting the Markov property of the multiPDU length sequence while remaining suitable for a TLS 1.3 environment. Controlled experiments show that both the Length-Sensitive (LS) composite deep learning model using a capsule neural network and the LS long short-term memory model achieve satisfactory F1-scores and performance. Owing to faster feature extraction, our method is suitable for real network environments and superior to state-of-the-art methods.
Mechanical metamaterials such as auxetic materials have attracted great interest due to their unusual properties, which are dictated by their architectures. However, these architected materials usually have low stiffness because of the bending or rotation deformation mechanisms in their microstructures. In this work, a convolutional neural network (CNN) based self-learning multi-objective optimization is performed to design digital composite materials. The CNN models are trained on randomly generated two-phase digital composite materials, along with their corresponding Poisson's ratios and stiffness values. The CNN models are then used to design composite material structures with the minimum Poisson's ratio under a given volume fraction constraint. Furthermore, we design composite materials with optimized stiffness that exhibit a desired Poisson's ratio (negative, zero, or positive). The optimized designs are obtained efficiently, and their validity is confirmed by finite element analysis. This self-learning multi-objective optimization model offers a promising approach for achieving comprehensive multi-objective optimization.
With the fast development of business logic and information technology, today's best solutions are tomorrow's legacy systems. In China, the education domain follows the same path. Currently, there exist a number of e-learning legacy assets with accumulated practical business experience, such as program resources and usage behaviour data resources. To use these legacy assets adequately and efficiently, we should not only utilize the explicit assets but also discover the hidden assets. The usage behaviour data resource is the set of practical operation sequences requested by all users. The hidden patterns in this data resource reflect users' practical experience, which can benefit service composition in service-oriented architecture (SOA) migration: the discovered patterns become the candidate (coarse-grained) composite services in SOA systems. Although data mining techniques have been used for software engineering tasks, little is known about how they can be used for service composition when migrating an e-learning legacy system (MELS) to SOA. In this paper, we propose a service composition approach for MELS based on sequence mining techniques. Composite services found by this approach complement the business logic analysis results of MELS. The core of the approach is an appropriate sequence mining algorithm for mining related data collected from an e-learning legacy system. Based on the features of the execution trace data on usage behaviour from this e-learning legacy system and the needs of further pattern analysis, we propose a sequential mining algorithm to mine this kind of data from the legacy system. For validation, the approach has been applied to the corresponding real data collected from the e-learning legacy system; in addition, questionnaires were used to collect satisfaction data. The questionnaire result is 90% consistent with the result obtained through our approach.
With the complexity of the composition process and the rapid growth of candidate services, realizing optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot be easily adapted to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), a universal model for agent planning problems, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services of abstract services are determined at runtime with optimal policies and adaptive capability, based on the environment and the status of component services. A simulation experiment demonstrates the effectiveness and efficiency of the approach in terms of learning quality and performance.
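The temporal-difference learning component named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which TD variant is used or how it is coupled to MCTS, so this shows only a generic tabular TD(0) value update; the state names, the reward, and the `alpha` and `gamma` values are hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update to values[state] and return the TD error.

    values is a dict mapping states to value estimates; unseen states count as 0.0.
    alpha is the learning rate and gamma the discount factor (illustrative defaults).
    """
    current = values.get(state, 0.0)
    # TD error: observed reward plus discounted next-state estimate, minus the current estimate.
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    values[state] = current + alpha * td_error
    return td_error


# Hypothetical usage: a composition step from one abstract service to the next.
values = {}
td0_update(values, "geocode", reward=1.0, next_state="buffer")
```

In a service composition setting, a state would roughly correspond to the partial composition built so far and the reward to an observed quality-of-service signal; both mappings are assumptions here, not details taken from the paper.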
Here, a new method integrating machine learning and Chou's pseudo amino acid composition is proposed for in silico epitope mapping of severe acute respiratory syndrome-like coronavirus antigens. For this, a training dataset including 266 linear B-cell epitopes, 1,267 T-cell epitopes, and 1,280 non-epitopes was prepared. The epitope sequences were converted to numerical vectors using Chou's pseudo amino acid composition method. The vectors were then passed to support vector machine, random forest, artificial neural network, and K-nearest neighbor algorithms for classification. The algorithm with the highest performance was selected for the epitope mapping procedure. Based on the results, the random forest algorithm was the most accurate classifier, with an accuracy of 0.934, followed by K-nearest neighbor, artificial neural network, and support vector machine, in that order. Furthermore, the efficacy of the epitopes predicted by the trained random forest algorithm was assessed through their antigenicity potential and their affinity to the human B cell receptor and MHC-I/II alleles, using the VaxiJen score and molecular docking, respectively. The predicted epitopes, especially the B-cell epitopes, had high antigenicity potential and good affinity to the protein targets. According to the results, the suggested method can be considered for developing specific epitope predictor software as well as an accelerated pipeline for designing a serotype-independent vaccine against the virus.
The goal of zero-shot recognition is to classify classes that have never been seen before, which requires building a bridge between seen and unseen classes through a semantic embedding space. Therefore, semantic embedding space learning plays an important role in zero-shot recognition. In existing works, the semantic embedding space is mainly defined by user-defined attribute vectors. However, the discriminative information contained in user-defined attribute vectors is limited. In this paper, we propose to automatically learn an extra latent attribute space to produce a more generalized and discriminative semantic embedding space. To prevent the bias problem, both the user-defined attribute vectors and the latent attribute space are optimized by adversarial learning with auto-encoders. We also propose to reconstruct semantic patterns produced by explanatory graphs, which makes the semantic embedding space more sensitive to useful semantic information and less sensitive to useless information. The proposed method is evaluated on the AwA2 and CUB datasets. The results show that it achieves superior performance.
An observer-based adaptive iterative learning control (AILC) scheme is developed for a class of nonlinear systems with unknown time-varying parameters and unknown time-varying delays. The linear matrix inequality (LMI) method is employed to design the nonlinear observer. The designed controller contains a proportional-integral-derivative (PID) feedback term in the time domain. The learning law for the unknown constant parameter is of differential-difference type, and the learning law for the unknown time-varying parameter is of difference type. It is assumed that the unknown delay-dependent uncertainty is nonlinearly parameterized. By constructing a Lyapunov-Krasovskii-like composite energy function (CEF), we prove the boundedness of all closed-loop signals and the convergence of the tracking error. A simulation example is provided to illustrate the effectiveness of the proposed control algorithm.
This paper explores, for the first time, the adaptive iterative learning control method in the control of fractional order systems. An adaptive iterative learning control (AILC) scheme is presented for a class of commensurate high-order uncertain nonlinear fractional order systems in the presence of disturbance. To facilitate the controller design, a sliding mode surface of tracking errors is designed using sufficient conditions for linear fractional order systems. To relax the assumption of an identical initial condition in iterative learning control (ILC), a new boundary layer function is proposed by employing the Mittag-Leffler function. The uncertainty in the system is compensated for by a radial basis function neural network. Fractional order differential type updating laws and a difference type learning law are designed to estimate the unknown constant parameters and the time-varying parameter, respectively. The hyperbolic tangent function and a convergent series sequence are used to design a robust control term for the neural network approximation error and bounded disturbance, while guaranteeing learning convergence along the iteration axis. The system output is proved to converge to a small neighborhood of the desired trajectory by constructing a Lyapunov-like composite energy function (CEF) containing a new integral type Lyapunov function, while keeping all closed-loop signals bounded. Finally, a simulation example is presented to verify the effectiveness of the proposed approach.
Data-mining techniques using machine learning are powerful and efficient for materials design, with great potential for discovering new materials with good characteristics. Here, this technique is applied to composition design for La(Fe,Si/Al)13-based materials, which are regarded as one of the most promising magnetic refrigerants in practice. Three prediction models are built using a machine learning algorithm called gradient boosting regression tree (GBRT) to find the correlation between the Curie temperature (TC), the maximum value of magnetic entropy change ((ΔSM)max), and the chemical composition, all of which yield high accuracy in the prediction of TC and (ΔSM)max. The coefficients of determination (R^2) for the three models are 0.96, 0.87, and 0.91. These results suggest that all of the models generalize well to untrained data, and that they can not only provide suggestions for real experiments but also help us gain physical insight for finding suitable compositions for further magnetic refrigeration applications.
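For readers unfamiliar with the metric quoted above, the coefficient of determination can be computed as follows. This is a minimal sketch of the standard definition, R^2 = 1 - SS_res / SS_tot; the function name and the sample temperatures are made up for illustration and are not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted Curie temperatures (K), illustration only.
measured = [195.0, 210.0, 240.0, 275.0]
predicted = [198.0, 207.0, 243.0, 270.0]
score = r_squared(measured, predicted)
```

A score of 1.0 means the predictions match the measurements exactly, while 0.0 means the model does no better than always predicting the mean of the measurements.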
Background: Breed identification is useful in a variety of biological contexts. Breed identification usually involves two stages, i.e., detection of breed-informative SNPs and breed assignment. Several methods have been proposed for both stages. However, the optimal combination of these methods remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1000 Bull Genomes Project, we compared combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment, with respect to different reference population sizes and different numbers of most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. Results: We found that all combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination performed best and robustly across all scenarios. We proposed to integrate the three breed-informative SNP detection methods (named DFI) and to integrate three of the machine learning methods, KNN, SVM, and RF (named KSR). We found that the combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from using SNP chip data were only slightly lower than those from using sequence data in most cases. Conclusions: The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases. However, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve images that are similar at the fine-grained level for a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in the low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, these approaches take a long time to train and have poor accuracy. Additionally, the representations they learn tend to tighten up in the embedding space and lose generalizability to unseen classes. To mitigate these issues, this paper proposes a noise-assisted representation learning method for fine-grained image retrieval. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. A loss function is then defined to penalize cases where the distance between instances of the same class becomes too small relative to the noise pair of that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) than other existing methods.
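The recall@1 figures quoted above measure how often the nearest retrieved image shares the query's class. The sketch below shows one common way to compute it, assuming squared Euclidean distance and a leave-one-out protocol in which every embedding is used as a query against all others; the paper's exact evaluation protocol is not given in the abstract, and the toy embeddings are invented for illustration.

```python
def recall_at_1(embeddings, labels):
    """Fraction of queries whose nearest neighbour (excluding the query itself) has the same label.

    embeddings is a list of equal-length numeric sequences and labels a parallel list.
    At least two embeddings are required. Distance is squared Euclidean (an assumption).
    """
    hits = 0
    for i, query in enumerate(embeddings):
        best_index, best_dist = None, float("inf")
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # never retrieve the query itself
            dist = sum((a - b) ** 2 for a, b in zip(query, candidate))
            if dist < best_dist:
                best_index, best_dist = j, dist
        if labels[best_index] == labels[i]:
            hits += 1
    return hits / len(embeddings)


# Toy 2-D embeddings forming two well-separated classes.
toy_embeddings = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = ["car_a", "car_a", "car_b", "car_b"]
toy_recall = recall_at_1(toy_embeddings, toy_labels)
```

The brute-force double loop is quadratic in the number of images, which is fine for a sketch; evaluations on full datasets typically use vectorized or approximate nearest-neighbour search instead.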
With the availability of high-performance computing technology and the development of advanced numerical simulation methods, Computational Fluid Dynamics (CFD) is becoming more and more practical and efficient in engineering. As a representative high-precision algorithm, the high-order Discontinuous Galerkin Method (DGM) has attracted widespread attention from the CFD research community and has developed rapidly. However, when DGM is extended to high-speed aerodynamic flow field calculations, non-physical numerical Gibbs oscillations near shock waves often significantly degrade numerical accuracy and can even cause the calculation to fail. Data-driven approaches based on machine learning can learn the characteristics of Gibbs noise, which motivates us to use them in high-speed DG applications. To achieve this goal, labeled data need to be generated to train the machine learning models. This paper proposes a new method for denoising modeling of the Gibbs phenomenon using a machine learning technique, the zero-shot learning strategy, which avoids the need to acquire large amounts of CFD data. The model adopts a graph convolutional network combined with a graph attention mechanism to learn the denoising paradigm from synthetic Gibbs noise data and generalize to DGM numerical simulation data. Numerical simulation results show that the proposed Gibbs denoising model can suppress the numerical oscillation near shock waves in the high-order DGM. Our work automates the extension of DGM to high-speed aerodynamic flow field calculations with higher generalization and lower cost.
In this study, we present a machine learning-based method to predict trace element concentrations from major and minor element concentration data using a legacy lithogeochemical database of magmatic rocks from the Karoo large igneous province (Gondwana Supercontinent). We demonstrate that a variety of trace elements, including most of the lanthanides and chalcophile, lithophile, and siderophile elements, can be predicted with excellent accuracy. This finding reveals that there are reliable, high-dimensional elemental associations that can be used to predict trace elements in a range of plutonic and volcanic rocks. Since the major and minor elements are used as predictors, prediction performance can be used as a direct proxy for geochemical anomalies. As such, our proposed method is suitable for prospective exploration by identifying anomalous trace element concentrations. Compared to multivariate compositional data analysis methods, the new method does not rely on assumptions of stoichiometric combinations of elements in the data to discover geochemical anomalies. Because we do not use multivariate compositional data analysis techniques (e.g., principal component analysis and combined use of major, minor, and trace element data), we also show that log-ratio transforms do not increase the performance of the proposed approach and are unnecessary for algorithms that are not spatially aware in the feature space. Therefore, we demonstrate that high-dimensional elemental associations can be modelled in an automated manner through a data-driven approach and without assumptions of stoichiometry within the data. The approach proposed in this study can be used as a replacement for the multivariate compositional data analysis technique that is used for prospectivity mapping, or as a pre-processor to reduce the detection of false geochemical anomalies, particularly where the data are of variable quality.
Gasification of organic waste is one of the most effective valorization pathways for renewable energy and resource recovery. However, the process is affected by multiple factors such as temperature, feedstock, and steam content, which makes product prediction difficult. With the popularization of artificial intelligence techniques such as machine learning (ML), traditional artificial neural networks have received more attention from data science researchers, providing the scientific and engineering communities with flexible and rapid prediction frameworks for organic waste gasification. In this work, critical parameters including temperature, steam ratio, and feedstock during the gasification of organic waste are reviewed for three scenarios (steam gasification, air gasification, and oxygen-enriched gasification), and the product distribution and the mechanisms involved are elaborated. Moreover, we present the details of ML methods such as regression analysis, artificial neural networks, decision trees, and related methods, which are expected to revolutionize data analysis and modeling of organic waste gasification. Typical outputs, including syngas yield, composition, and higher heating values (HHVs), are discussed to give a better understanding of the gasification process and of ML applications. This review focuses on the combination of gasification and ML, and it is of immediate significance for the resource and energy utilization of organic waste.
This paper presents a review of the ensemble learning models proposed for web services classification, selection, and composition. Web services are an evolving research area, and ensemble learning has become a hot spot for assessing these aspects of web services. The research aims to review the state-of-the-art approaches in this area. The literature on the topic is examined using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) as the research method. The study reveals an increasing trend of using ensemble learning in the selected papers over the last ten years. Naïve Bayes (NB), Support Vector Machine (SVM), and other classifiers were identified as widely explored in the selected studies. The core analysis of web services classification suggests that the performance aspects of web services can be investigated in future work. This paper also identifies the performance metrics widely used in the literature, including accuracy, precision, recall, and F-measure.
A crucial task in hyperspectral image (HSI) classification is exploring effective methodologies that fully use the 3-D and spectral data delivered by the data cube. For image classification, 3-D data is considered in the phases of pre-classification, sample selection, classifiers, post-classification, and accuracy estimation. Lastly, a viewpoint on upcoming research directions for advancing 3-D and spectral approaches is offered. In recent years, sparse representation has been acknowledged as a dominant classification tool that effectively addresses diverse difficulties and has been extensively exploited in several image processing tasks. Encouraged by those successful applications, sparse representation (SR) has likewise been introduced to classify HSIs and has shown good performance. This research paper offers an overview of the literature on HSI classification technology and its applications. The assessment is centered on a methodical review of SR- and support vector machine (SVM)-based HSI classification works and compares numerous approaches to this problem. We form an outline that splits the corresponding mechanisms into spectral feature networks and spectral-spatial feature networks to methodically analyze the contemporary accomplishments in HSI classification. Furthermore, considering that the training samples available in the remote sensing field are generally quite limited while training neural networks (NNs) requires an enormous number of samples, we include certain approaches to improve classification performance, which can provide strategies for upcoming studies on this issue. Lastly, numerous representative neural learning-based classification approaches are tested on real HSIs in our experiments.
Urban sewer pipes are a vital infrastructure in modern cities, and their defects must be detected in time to prevent potential malfunctioning. In recent years, to reduce the manual effort required of human experts, models based on deep learning have been introduced to automatically identify potential defects. However, these models are insufficient in terms of dataset complexity, model versatility, and performance. Our work addresses these issues with a multi-stage defect detection architecture using a composite backbone Swin Transformer. The model based on this architecture is trained using a more comprehensive dataset containing more classes of defects. Ablation studies on the composite backbone Swin Transformer, the multi-stage detector, test-time data augmentation, and model fusion reveal that each of these modules contributes to the improvement of detection accuracy from a different aspect. The model incorporating all these modules achieves a mean Average Precision (mAP) of 78.6% at an Intersection over Union (IoU) threshold of 0.5. This represents an improvement of 14.1% over the ResNet50 Faster Region-based Convolutional Neural Network (R-CNN) model and a 6.7% improvement over You Only Look Once version 6 (YOLOv6)-large, the highest-scoring of the YOLO methods. In addition, although direct comparison with other sewer pipe defect detection models is infeasible because their private datasets are unavailable, our results are obtained from a more comprehensive dataset and show superior generalization capabilities.
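The IoU threshold of 0.5 mentioned above decides whether a predicted box counts as a correct detection. As a minimal sketch, IoU for two axis-aligned boxes can be computed as below; the `(x1, y1, x2, y2)` corner format and the example coordinates are assumptions for illustration and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted_box, ground_truth_box, threshold=0.5):
    """A prediction counts as a match when its IoU with the ground truth reaches the threshold."""
    return iou(predicted_box, ground_truth_box) >= threshold


# Hypothetical predicted and annotated defect boxes.
matched = is_match((10.0, 10.0, 50.0, 50.0), (12.0, 12.0, 52.0, 52.0))
```

Mean Average Precision then averages, over defect classes, the area under each class's precision-recall curve built from such matches; that aggregation step is not shown here.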
Funding: the National Natural Science Foundation of China (22108307), the Natural Science Foundation of Shandong Province (ZR2020KB006), and the Outstanding Youth Fund of Shandong Provincial Natural Science Foundation (ZR2020YQ17).
Abstract: Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the “selection of the optimal processing route” strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address this issue, this work proposes an innovative method that uses Aspen HYSYS and comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. Subsequently, a deep neural network prediction model is developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytically cracked raw oil in a refinery. The validation and test sets exhibit R² values of 0.99769 and 0.99807, respectively, and the average relative error of molecular composition prediction for raw materials of the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive exPlanations) interpretation method is used to disclose the relationship among different variables by performing global and local weight comparisons and correlation analyses.
Funding: supported by the General Program of the National Natural Science Foundation of China under Grant No. 62172093; the National Key R&D Program of China under Grant No. 2018YFB1800602; the 2019 Industrial Internet Innovation and Development Project, Ministry of Industry and Information Technology (MIIT), under Grant No. 6709010003; and the Ministry of Education-China Mobile Research Fund under Grant No. MCM20180506.
Abstract: As an essential function of encrypted Internet traffic analysis, encrypted traffic service classification can support both coarse-grained network service traffic management and security supervision. However, the traditional plaintext-based Deep Packet Inspection (DPI) method cannot be applied to such a classification. Moreover, existing machine learning-based methods encounter two problems during feature selection: costly processing of complex features and Transport Layer Security (TLS) version discrepancy. In this paper, we consider differences between encryption network protocol stacks and propose a composite deep learning-based method for multiprotocol environments. The method uses a sliding multiple Protocol Data Unit (multiPDU) length sequence as features, fully utilizing the Markov property in a multiPDU length sequence and remaining suitable for a TLS 1.3 environment. Control experiments show that both the Length-Sensitive (LS) composite deep learning model using a capsule neural network and the LS long short-term memory model achieve satisfactory effectiveness in F1-score and performance. Owing to faster feature extraction, our method is suitable for actual network environments and superior to state-of-the-art methods.
Abstract: Mechanical metamaterials such as auxetic materials have attracted great interest due to their unusual properties that are dictated by their architectures. However, these architected materials usually have low stiffness because of the bending or rotation deformation mechanisms in the microstructures. In this work, a convolutional neural network (CNN) based self-learning multi-objective optimization is performed to design digital composite materials. The CNN models are trained on randomly generated two-phase digital composite materials, along with their corresponding Poisson's ratios and stiffness values. The CNN models are then used for designing composite material structures with the minimum Poisson's ratio at a given volume fraction constraint. Furthermore, we have designed composite materials with optimized stiffness while exhibiting a desired Poisson's ratio (negative, zero, or positive). The optimized designs have been successfully and efficiently obtained, and their validity has been confirmed through finite element analysis results. This self-learning multi-objective optimization model offers a promising approach for achieving comprehensive multi-objective optimization.
Funding: supported by E-learning Platform, National Torch Project (No. z20040010)
Abstract: With the fast development of business logic and information technology, today's best solutions are tomorrow's legacy systems. In China, the situation in the education domain follows the same path. Currently, there exist a number of e-learning legacy assets with accumulated practical business experience, such as program resources, usage behaviour data resources, and so on. In order to use these legacy assets adequately and efficiently, we should not only utilize the explicit assets but also discover the hidden assets. The usage behaviour data resource is the set of practical operation sequences requested by all users. The hidden patterns in this data resource reflect users' practical experience, which can benefit the service composition in service-oriented architecture (SOA) migration. Namely, these discovered patterns will be the candidate composite services (coarse-grained) in SOA systems. Although data mining techniques have been used for software engineering tasks, little is known about how they can be used for service composition when migrating an e-learning legacy system (MELS) to SOA. In this paper, we propose a service composition approach based on sequence mining techniques for MELS. Composite services found by this approach complement the business logic analysis results of MELS. The core of this approach is to develop an appropriate sequence mining algorithm for mining related data collected from an e-learning legacy system. According to the features of execution trace data on usage behaviour from this e-learning legacy system and the needs of further pattern analysis, we propose a sequential mining algorithm to mine this kind of data of the legacy system. For validation, this approach has been applied to the corresponding real data, which was collected from the e-learning legacy system; meanwhile, some investigation questionnaires were set up to collect satisfaction data. The investigation result is 90% the same as the result obtained through our approach.
Funding: Supported by the National Natural Science Foundation of China (Nos. 41971356, 41671400, 41701446), the National Key Research and Development Program of China (Nos. 2017YFB0503600, 2018YFB0505500), and the Hubei Province Natural Science Foundation of China (No. 2017CFB277).
Abstract: With the complexity of the composition process and the rapid growth of candidate services, realizing optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot be easily adapted to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), as a universal model for the planning problem of agents, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services of abstract services are determined with optimal policies and adaptive capability at runtime, based on the environment and the status of component services. A simulation experiment is performed to demonstrate the effectiveness and efficiency of the approach in terms of learning quality and performance.
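As an illustration of the temporal-difference learning mentioned in this abstract, the following is a minimal sketch of a generic tabular TD(0) value update. It is not the authors' MCTS-TD algorithm; the function name `td0_update`, the state labels, and the default step size and discount factor are assumptions chosen for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    values     -- dict mapping state -> estimated value (missing states count as 0.0)
    alpha      -- step size (assumed default)
    gamma      -- discount factor (assumed default)
    """
    current = values.get(state, 0.0)
    # TD error: bootstrapped one-step target minus the current estimate.
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    values[state] = current + alpha * td_error
    return td_error


if __name__ == "__main__":
    # Hypothetical transition: choosing a concrete service moves s0 -> s1 with reward 1.0.
    v = {}
    print(td0_update(v, "s0", 1.0, "s1", alpha=0.5), v)
```

Each call moves the estimate for `state` a fraction `alpha` of the way towards the one-step target `reward + gamma * V(next_state)`.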
Abstract: Here, a new integrated machine learning and Chou’s pseudo amino acid composition method has been proposed for in silico epitope mapping of severe acute respiratory syndrome-like coronavirus antigens. For this, a training dataset including 266 linear B-cell epitopes, 1,267 T-cell epitopes and 1,280 non-epitopes was prepared. The epitope sequences were then converted to numerical vectors using Chou’s pseudo amino acid composition method. The vectors were then introduced to the support vector machine, random forest, artificial neural network, and K-nearest neighbor algorithms for the classification process. The algorithm with the highest performance was selected for the epitope mapping procedure. Based on the obtained results, the random forest algorithm was the most accurate classifier with an accuracy of 0.934, followed by K-nearest neighbor, artificial neural network, and support vector machine, respectively. Furthermore, the efficacies of the epitopes predicted by the trained random forest algorithm were assessed through their antigenicity potential as well as affinity to human B cell receptor and MHC-I/II alleles using the VaxiJen score and molecular docking, respectively. The predicted epitopes, especially the B-cell epitopes, had high antigenicity potentials and good affinities to the protein targets. According to the results, the suggested method can be considered for developing specific epitope predictor software as well as an accelerator pipeline for designing a serotype-independent vaccine against the virus.
Abstract: The goal of zero-shot recognition is to classify classes that have never been seen before, which requires building a bridge between seen and unseen classes through a semantic embedding space. Therefore, semantic embedding space learning plays an important role in zero-shot recognition. In existing works, the semantic embedding space is mainly defined by user-defined attribute vectors. However, the discriminative information included in the user-defined attribute vector is limited. In this paper, we propose to learn an extra latent attribute space automatically to produce a more generalized and discriminative semantic embedding space. To prevent the bias problem, both the user-defined attribute vector and the latent attribute space are optimized by adversarial learning with auto-encoders. We also propose to reconstruct semantic patterns produced by explanatory graphs, which can make the semantic embedding space more sensitive to useful semantic information and less sensitive to useless information. The proposed method is evaluated on the AwA2 and CUB datasets. The results show that our proposed method achieves superior performance.
Funding: supported by National Natural Science Foundation of China (No. 60804021, No. 60702063)
Abstract: An observer-based adaptive iterative learning control (AILC) scheme is developed for a class of nonlinear systems with unknown time-varying parameters and unknown time-varying delays. The linear matrix inequality (LMI) method is employed to design the nonlinear observer. The designed controller contains a proportional-integral-derivative (PID) feedback term in the time domain. The learning law for the unknown constant parameter is of differential-difference type, and the learning law for the unknown time-varying parameter is of difference type. It is assumed that the unknown delay-dependent uncertainty is nonlinearly parameterized. By constructing a Lyapunov-Krasovskii-like composite energy function (CEF), we prove the boundedness of all closed-loop signals and the convergence of the tracking error. A simulation example is provided to illustrate the effectiveness of the control algorithm proposed in this paper.
Funding: supported by the National Natural Science Foundation of China (60674090) and the Shandong Natural Science Foundation (ZR2017QF016)
Abstract: This paper explores the adaptive iterative learning control method in the control of fractional order systems for the first time. An adaptive iterative learning control (AILC) scheme is presented for a class of commensurate high-order uncertain nonlinear fractional order systems in the presence of disturbance. To facilitate the controller design, a sliding mode surface of tracking errors is designed by using sufficient conditions of linear fractional order systems. To relax the assumption of the identical initial condition in iterative learning control (ILC), a new boundary layer function is proposed by employing the Mittag-Leffler function. The uncertainty in the system is compensated for by utilizing a radial basis function neural network. Fractional order differential type updating laws and a difference type learning law are designed to estimate the unknown constant parameters and the time-varying parameter, respectively. The hyperbolic tangent function and a convergent series sequence are used to design a robust control term for the neural network approximation error and bounded disturbance, simultaneously guaranteeing the learning convergence along the iteration axis. The system output is proved to converge to a small neighborhood of the desired trajectory by constructing a Lyapunov-like composite energy function (CEF) containing a new integral type Lyapunov function, while keeping all the closed-loop signals bounded. Finally, a simulation example is presented to verify the effectiveness of the proposed approach.
Funding: supported by the National Basic Research Program of China (Grant No. 2014CB643702), the National Natural Science Foundation of China (Grant No. 51590880), the Knowledge Innovation Project of the Chinese Academy of Sciences (Grant No. KJZD-EW-M05), and the National Key Research and Development Program of China (Grant No. 2016YFB0700903)
Abstract: Data-mining techniques using machine learning are powerful and efficient for materials design, possessing great potential for discovering new materials with good characteristics. Here, this technique has been used for the composition design of La(Fe,Si/Al)_13-based materials, which are regarded as one of the most promising magnetic refrigerants in practice. Three prediction models are built by using a machine learning algorithm called gradient boosting regression tree (GBRT) to find the correlation between the Curie temperature (T_C), the maximum value of magnetic entropy change ((ΔS_M)_max), and chemical composition, all of which yield high accuracy in the prediction of T_C and (ΔS_M)_max. The coefficients of determination (R^2) for the three models are 0.96, 0.87, and 0.91. These results suggest that all of the models generalize well to untrained data, so they can not only provide suggestions for real experiments but also help us gain physical insights to find proper compositions for further magnetic refrigeration applications.
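For readers unfamiliar with the coefficient of determination (R^2) reported in this abstract, here is a minimal sketch of how the metric is computed from measured and predicted values. The function name `r_squared` and the sample numbers are illustrative assumptions, not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after using the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up Curie temperatures (K), measured versus predicted.
    measured = [190.0, 200.0, 210.0, 220.0]
    predicted = [192.0, 199.0, 208.0, 221.0]
    print(round(r_squared(measured, predicted), 3))
```

A value of 1 means the predictions match the measurements exactly, while 0 means the model does no better than predicting the mean.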
Funding: funded by the National Key Research and Development Program of China (2021YFD1200404), the Yangzhou University Interdisciplinary Research Foundation for Animal Science Discipline of Targeted Support (yzuxk202016), and the Project of Genetic Improvement for Agricultural Species (Dairy Cattle) of Shandong Province (2019LZGC011).
Abstract: Background: Breed identification is useful in a variety of biological contexts. Breed identification usually involves two stages, i.e., detection of breed-informative SNPs and breed assignment. Several methods have been proposed for both stages. However, which combination of these methods is optimal remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1,000 Bull Genomes Project, we compared the combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment with respect to different reference population sizes and different numbers of most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. Results: We found that all combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination performed best and was robust across all scenarios. We proposed to integrate the three breed-informative detection methods, named DFI, and to integrate three of the machine learning methods, KNN, SVM, and RF, named KSR. We found that the combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from using SNP chip data were only slightly lower than those from using sequence data in most cases. Conclusions: The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases. However, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
Abstract: Fine-grained image search is one of the most challenging tasks in computer vision; it aims to retrieve similar images at the fine-grained level for a given query image. The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered and dissimilar images are separated in the low-dimensional embedding space. Previous works primarily focused on defining local structure loss functions such as triplet loss and pairwise loss. However, these approaches take a long time to train and have poor accuracy. Additionally, representations learned through them tend to tighten up in the embedding space and lose generalizability to unseen classes. This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues. In the proposed work, class manifold learning is performed in which positive pairs are created with a noise insertion operation instead of tightening class clusters, and other instances within the same cluster are treated as negatives. Then a loss function is defined to penalize cases where the distance between instances of the same class becomes too small relative to the noise pair in that class in the embedding space. The proposed approach is validated on the CARS-196 and CUB-200 datasets and achieves better retrieval results (85.38% recall@1 for CARS-196 and 70.13% recall@1 for CUB-200) compared to other existing methods.
Funding: co-supported by the Aeronautical Science Foundation of China (Nos. 2018ZA52002, 2019ZA052011).
Abstract: With the availability of high-performance computing technology and the development of advanced numerical simulation methods, Computational Fluid Dynamics (CFD) is becoming more and more practical and efficient in engineering. As one of the representative high-precision algorithms, the high-order Discontinuous Galerkin Method (DGM) has not only attracted widespread attention from scholars in the CFD research community, but has also been developed extensively. However, when DGM is extended to high-speed aerodynamic flow field calculations, non-physical numerical Gibbs oscillations near shock waves often significantly affect the numerical accuracy and even cause calculation failure. Data-driven approaches based on machine learning techniques can be used to learn the characteristics of Gibbs noise, which motivates us to use them in high-speed DG applications. To achieve this goal, labeled data need to be generated in order to train the machine learning models. This paper proposes a new method for denoising modeling of the Gibbs phenomenon using a machine learning technique, the zero-shot learning strategy, to eliminate the need to acquire large amounts of CFD data. The model adopts a graph convolutional network combined with a graph attention mechanism to learn the denoising paradigm from synthetic Gibbs noise data and generalize to DGM numerical simulation data. Numerical simulation results show that the Gibbs denoising model proposed in this paper can suppress the numerical oscillation near shock waves in the high-order DGM. Our work automates the extension of DGM to high-speed aerodynamic flow field calculations with higher generalization and lower cost.
Abstract: In this study, we present a machine learning-based method to predict trace element concentrations from major and minor element concentration data using a legacy lithogeochemical database of magmatic rocks from the Karoo large igneous province (Gondwana Supercontinent). We demonstrate that a variety of trace elements, including most of the lanthanides, chalcophile, lithophile, and siderophile elements, can be predicted with excellent accuracy. This finding reveals that there are reliable, high-dimensional elemental associations that can be used to predict trace elements in a range of plutonic and volcanic rocks. Since the major and minor elements are used as predictors, prediction performance can be used as a direct proxy for geochemical anomalies. As such, our proposed method is suitable for prospective exploration by identifying anomalous trace element concentrations. Compared to multivariate compositional data analysis methods, the new method does not rely on assumptions of stoichiometric combinations of elements in the data to discover geochemical anomalies. Because we do not use multivariate compositional data analysis techniques (e.g. principal component analysis and combined use of major, minor and trace element data), we also show that log-ratio transforms do not increase the performance of the proposed approach and are unnecessary for algorithms that are not spatially aware in the feature space. Therefore, we demonstrate that high-dimensional elemental associations can be modelled in an automated manner through a data-driven approach and without assumptions of stoichiometry within the data. The approach proposed in this study can be used as a replacement for the multivariate compositional data analysis technique that is used for prospectivity mapping, or as a pre-processor to reduce the detection of false geochemical anomalies, particularly where the data is of variable quality.
Funding: This work is supported by the Sichuan Science and Technology Program (2021JDR0343) and the Project Fund of Chengdu Science and Technology Bureau (2019-YF09-00086-SN).
Abstract: Gasification of organic waste represents one of the most effective valorization pathways for renewable energy and resource recovery, but this process is affected by multiple factors such as temperature, feedstock, and steam content, making the prediction of products problematic. With the popularization and promotion of artificial intelligence such as machine learning (ML), traditional artificial neural networks have received more attention from researchers in the data science field, providing scientific and engineering communities with flexible and rapid prediction frameworks in the field of organic waste gasification. In this work, critical parameters including temperature, steam ratio, and feedstock during gasification of organic waste were reviewed in three scenarios, namely steam gasification, air gasification, and oxygen-enriched gasification, and the product distribution and the mechanisms involved were elaborated. Moreover, we presented the details of ML methods such as regression analysis, artificial neural networks, decision trees, and related methods, which are expected to revolutionize data analysis and modeling of the gasification of organic waste. Typical outputs including the syngas yield, composition, and HHVs were discussed to give a better understanding of the gasification process and ML application. This review focused on the combination of gasification and ML, and it is of immediate significance for the resource and energy utilization of organic waste.
Funding: This research was supported by the BK21 FOUR (Fostering Outstanding Universities for Research), the Ministry of Education (MOE, Korea), and the National Research Foundation of Korea (NRF).
Abstract: This paper presents a review of the ensemble learning models proposed for web services classification, selection, and composition. Web services are an evolving research area, and ensemble learning has become a hot spot for assessing the aspects of web services mentioned above. The proposed research aims to review the state-of-the-art approaches applied in the web services area. The literature on the research topic is examined using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) as a research method. The study reveals an increasing trend of using ensemble learning in the chosen papers within the last ten years. Naïve Bayes (NB), Support Vector Machine (SVM), and other classifiers were identified as widely explored in the selected studies. Core analysis of web services classification suggests that web services’ performance aspects can be investigated in future works. This paper also identified performance measuring metrics, including accuracy, precision, recall, and F-measure, that are widely used in the literature.
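The precision, recall, and F-measure metrics named in this abstract can be sketched for the binary case as follows. This is an illustrative implementation only; the function name `precision_recall_f1` and the label encoding (1 for the positive class) are assumptions made for the example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels: 1 = service belongs to the target category.
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Returning 0.0 when a denominator is zero is one common convention; other tools may raise a warning or return NaN instead.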
Abstract: A crucial task in hyperspectral image (HSI) taxonomy is exploring effective methodologies to effusively practice the 3-D and spectral data delivered by the statistics cube. For classification of images, 3-D data is adjudged in the phases of pre-cataloging, an assortment of a sample, classifiers, post-cataloging, and accurateness estimation. Lastly, a viewpoint on imminent examination directions for proceeding 3-D and spectral approaches is untaken. In topical years, sparse representation is acknowledged as a dominant classification tool to effectually labels deviating difficulties and extensively exploited in several imagery dispensation errands. Encouraged by those efficacious solicitations, sparse representation (SR) has likewise been presented to categorize HSIs and validated virtuous enactment. This research paper offers an overview of the literature on the classification of HSI technology and its applications. This assessment is centered on a methodical review of SR and support vector machine (SVM) grounded HSI taxonomy works and equates numerous approaches for this matter. We form an outline that splits the equivalent mechanisms into spectral aspects of systems, and spectral–spatial feature networks to methodically analyze the contemporary accomplishments in HSI taxonomy. Furthermore, cogitating the datum that accessible training illustrations in the remote distinguishing arena are generally appropriate restricted besides training neural networks (NNs) to necessitate an enormous integer of illustrations, we comprise certain approaches to increase taxonomy enactment, which can deliver certain strategies for imminent learnings on this issue. Lastly, numerous illustrative neural learning-centered taxonomy approaches are piloted on physical HSIs in our experimentations.
Funding: supported by the Science and Technology Development Fund of Macao (Grant No. 0079/2019/AMJ) and the National Key R&D Program of China (No. 2019YFE0111400).
Abstract: Urban sewer pipes are a vital infrastructure in modern cities, and their defects must be detected in time to prevent potential malfunctioning. In recent years, to relieve the manual efforts of human experts, models based on deep learning have been introduced to automatically identify potential defects. However, these models are insufficient in terms of dataset complexity, model versatility and performance. Our work addresses these issues with a multi-stage defect detection architecture using a composite backbone Swin Transformer. The model based on this architecture is trained using a more comprehensive dataset containing more classes of defects. Ablation studies on the composite backbone Swin Transformer, multi-stage detector, test-time data augmentation and model fusion modules reveal that they all contribute to the improvement of detection accuracy from different aspects. The model incorporating all these modules achieves a mean Average Precision (mAP) of 78.6% at an Intersection over Union (IoU) threshold of 0.5. This represents an improvement of 14.1% over the ResNet50 Faster Region-based Convolutional Neural Network (R-CNN) model and a 6.7% improvement over You Only Look Once version 6 (YOLOv6)-large, the best-performing of the YOLO methods. In addition, although direct comparison with other defect detection models for sewer pipes is infeasible due to the unavailability of their private datasets, our results are obtained from a more comprehensive dataset and have superior generalization capabilities.
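To make the "IoU threshold of 0.5" in this abstract concrete, the sketch below computes Intersection over Union for two axis-aligned boxes and applies the threshold to decide whether a detection counts as a match. The box format (x_min, y_min, x_max, y_max) and the function names are assumptions for illustration and are not taken from the paper's code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    # Hypothetical predicted and ground-truth defect boxes in pixel coordinates.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)), is_match((0, 0, 2, 2), (0, 0, 2, 1)))
```

Average Precision is then computed per class from the ranked list of matched and unmatched detections, and mAP is the mean over classes; that part is omitted here.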