Micro-expression (ME) recognition is a complex task that requires advanced techniques to extract informative features from facial expressions. Numerous deep neural networks (DNNs) with convolutional structures have been proposed. However, shallow convolutional neural networks often outperform deeper models in mitigating overfitting, particularly on small datasets. Still, many of these methods rely on a single feature for recognition, limiting their ability to extract highly effective features. To address this limitation, this paper introduces an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting algorithm (IDSSCNN-XgBoost) for ME recognition. The proposed method uses a dual-stream architecture in which motion vectors (temporal features) are extracted with the TV-L1 optical flow method and subtle changes (spatial features) are amplified via Eulerian Video Magnification (EVM). These features are processed by the IDSSCNN, with an attention mechanism applied to refine the extracted features. The outputs are then fused, concatenated, and classified using the XgBoost algorithm. This approach improves recognition accuracy by leveraging both temporal and spatial information, supported by the robust classification power of XgBoost. The proposed method is evaluated on three publicly available ME databases: the Chinese Academy of Sciences Micro-expression Database (CASME II), the Spontaneous Micro-expression Database (SMIC-HS), and Spontaneous Actions and Micro-Movements (SAMM). Experimental results indicate that the proposed model achieves outstanding results compared with recent models, with accuracies of 79.01%, 69.22%, and 68.99% on CASME II, SMIC-HS, and SAMM, and F1-scores of 75.47%, 68.91%, and 63.84%, respectively. The proposed method also offers operational efficiency and low computational time.
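For orientation only, the following minimal PyTorch sketch illustrates the general dual-stream idea described in this abstract: two shallow convolutional streams (one for optical-flow input, one for EVM-magnified frames) with a simple channel-attention step, whose concatenated features could then be handed to an XGBoost classifier. The `ShallowStream` name, layer sizes, and input shapes are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: dual-stream shallow CNN with channel attention, fused features for XGBoost.
import torch
import torch.nn as nn

class ShallowStream(nn.Module):
    """One shallow stream: two conv blocks plus an SE-style channel-attention step."""
    def __init__(self, in_ch, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.attn = nn.Sequential(nn.Linear(64, 16), nn.ReLU(),
                                  nn.Linear(16, 64), nn.Sigmoid())
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):
        f = self.conv(x).flatten(1)          # (N, 64)
        f = f * self.attn(f)                 # channel attention re-weighting
        return self.fc(f)                    # (N, feat_dim)

# Two streams: optical-flow input (2 channels) and EVM-magnified frames (3 channels), shapes assumed.
flow_stream, evm_stream = ShallowStream(2), ShallowStream(3)
flow, evm = torch.randn(8, 2, 64, 64), torch.randn(8, 3, 64, 64)
fused = torch.cat([flow_stream(flow), evm_stream(evm)], dim=1)   # (N, 256) concatenated features

# The fused features would then be classified with XGBoost, e.g.:
# import xgboost as xgb
# clf = xgb.XGBClassifier(n_estimators=200).fit(fused.detach().numpy(), labels)
```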
To directly construct the mapping between multiple state parameters and remaining useful life (RUL), and to reduce the interference of random error with prediction accuracy, this paper proposes an aeroengine RUL prediction model based on principal component analysis (PCA) and a one-dimensional convolutional neural network (1D-CNN). First, multiple state parameters corresponding to a large number of aeroengine cycles are collected and passed to PCA for dimensionality reduction, and principal components are extracted for subsequent time-series prediction. Second, the 1D-CNN model is constructed to learn the mapping between principal components and RUL directly. Multiple convolution and pooling operations are applied for deep feature extraction, enabling end-to-end RUL prediction for the aeroengine. Experimental results show that PCA extracts the most effective principal components from the multiple state parameters, and the long time series of state parameters can be mapped directly to RUL by the 1D-CNN, improving both the efficiency and the accuracy of RUL prediction. Compared with other traditional models, the proposed method also has lower prediction error and better robustness.
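A minimal sketch of the described pipeline, assuming generic sensor data and window sizes: PCA (scikit-learn) compresses the state parameters, then a small 1D-CNN (PyTorch) regresses RUL from sliding windows of the principal-component series. The 21-parameter input and 30-cycle window are placeholders.

```python
# Assumption-laden sketch: PCA over multi-parameter cycles, then a 1D-CNN RUL regressor.
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

raw = np.random.rand(1000, 21).astype(np.float32)          # 1000 cycles x 21 state parameters (hypothetical)
pcs = PCA(n_components=5).fit_transform(raw)                # (1000, 5) principal components

win = 30                                                    # each 30-cycle window maps to one RUL value
windows = np.stack([pcs[i:i + win].T for i in range(len(pcs) - win)]).astype(np.float32)  # (N, 5, 30)

rul_cnn = nn.Sequential(
    nn.Conv1d(5, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
    nn.Flatten(), nn.Linear(32, 1),                          # scalar RUL prediction per window
)
pred = rul_cnn(torch.from_numpy(windows))                    # (N, 1)
print(pred.shape)
```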
Ultrasonic guided waves are an attractive monitoring technique for large-scale structures but are vulnerable to changes in environmental and operational conditions (EOCs), which are inevitable in the normal inspection of civil and mechanical structures. This paper presents a robust guided-wave-based method for damage detection and localization under complex environmental conditions using singular value decomposition (SVD)-based feature extraction and a one-dimensional convolutional neural network (1D-CNN). After SVD-based feature extraction, a temporal robust damage index (TRDI) is obtained and the effect of EOCs is largely removed. Hence, even for signals with a very wide temperature range and low signal-to-noise ratios (SNRs), the final damage detection and localization accuracy remains a perfect 100%. Verification is conducted on two experimental datasets. The first consists of guided-wave signals collected from a thin aluminum plate with artificial noise, and the second is a publicly available dataset of guided-wave signals acquired on a composite plate at temperatures ranging from 20 °C to 60 °C. It is demonstrated that the proposed method can detect and localize damage accurately and rapidly, showing great potential for application under complex and unknown EOCs.
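The exact TRDI definition is not given in this abstract, so the sketch below only illustrates one generic SVD-based way to suppress an EOC-dominated component of baseline guided-wave signals and compute a residual-energy index; all dimensions and the `damage_index` helper are hypothetical.

```python
# Generic sketch (not the paper's exact TRDI): project out the dominant SVD subspace of
# baseline signals, then score new signals by the energy left in the residual.
import numpy as np

baseline = np.random.randn(50, 2000)           # 50 baseline guided-wave signals (hypothetical)
U, s, Vt = np.linalg.svd(baseline, full_matrices=False)
eoc_subspace = Vt[:3]                           # leading singular vectors capture EOC-driven variation

def damage_index(signal, subspace=eoc_subspace):
    """Energy remaining after projecting out the EOC-dominated subspace."""
    residual = signal - subspace.T @ (subspace @ signal)
    return float(np.sum(residual ** 2) / np.sum(signal ** 2))

print(damage_index(np.random.randn(2000)))
```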
Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying from visual information, primarily lip movements. In this study, we created a custom dataset for Indian English and organized the work into three main parts: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using mel-frequency cepstral coefficients (MFCCs), and classification was performed with a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified with a long short-term memory (LSTM) recurrent neural network. Finally, the two modalities were integrated using a deep convolutional network. Audio speech in Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, on testing data after 200 epochs. For visual speech recognition on the Indian English dataset, the training accuracy was 77.48% and the test accuracy was 76.19% after 60 epochs. After integration, the training and testing accuracies of audiovisual speech recognition on the Indian English dataset were 94.67% and 91.75%, respectively.
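A hedged sketch of the audio branch only: MFCC features (librosa) classified by a small 1D-CNN (PyTorch). The synthetic one-second signal, 13 MFCC coefficients, and 10 word classes are stand-in assumptions, not details from the paper.

```python
# Audio-branch sketch: MFCC extraction followed by a compact 1D-CNN classifier.
import numpy as np
import librosa
import torch
import torch.nn as nn

sr = 16000
y = np.random.randn(sr).astype(np.float32)              # one second of synthetic audio (placeholder)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)       # (13, frames)
x = torch.from_numpy(mfcc).float().unsqueeze(0)          # (1, 13, frames)

audio_cnn = nn.Sequential(
    nn.Conv1d(13, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
    nn.Flatten(), nn.Linear(64, 10),                      # 10 word classes (assumed)
)
logits = audio_cnn(x)                                     # (1, 10) class scores
print(logits.shape)
```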
A micro-expression lasts for a very short time and its intensity is very subtle. To address its low recognition rate, this paper proposes a new micro-expression recognition algorithm based on a three-dimensional convolutional neural network (3D-CNN), which can simultaneously extract two-dimensional features in the spatial domain and one-dimensional features in the time domain. The network structure is designed with the Keras deep learning framework, and dropout and the batch normalization (BN) algorithm are effectively combined with three-dimensional visual geometry group blocks (3D-VGG-Blocks) to reduce the risk of overfitting while improving training speed. To address the lack of samples in the dataset, two augmentation methods, image flipping and small-amplitude flipping, are used. The recognition rate on the dataset reaches 69.11%. Compared with the current international average micro-expression recognition rate of about 67%, the proposed algorithm has a clear advantage in recognition rate.
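Below is a small PyTorch sketch (rather than Keras, for consistency with the other examples here) of one 3D-VGG-style block combining 3-D convolutions, batch normalization, and dropout, in the spirit of the described design; the layer sizes, clip shape, and 5 classes are assumptions.

```python
# Sketch of a 3D-VGG-style block with BN and dropout, stacked into a tiny classifier.
import torch
import torch.nn as nn

def vgg3d_block(in_ch, out_ch, p_drop=0.5):
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch), nn.ReLU(),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch), nn.ReLU(),
        nn.MaxPool3d(kernel_size=(1, 2, 2)),     # pool spatially, keep the temporal length
        nn.Dropout3d(p_drop),                     # dropout to curb overfitting
    )

net = nn.Sequential(vgg3d_block(1, 16), vgg3d_block(16, 32),
                    nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 5))
clip = torch.randn(2, 1, 16, 64, 64)              # (N, C, frames, H, W), assumed clip shape
print(net(clip).shape)                             # torch.Size([2, 5])
```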
Emotion recognition from speech data is an active and emerging area of research that plays an important role in numerous applications, such as robotics, virtual reality, behavior assessment, and emergency call centers. Researchers have recently developed many techniques in this field to improve accuracy using various deep learning approaches, but the recognition rate is still not convincing. Our main aim is to develop a new technique that increases the recognition rate at a reasonable computational cost. In this paper, we propose a one-dimensional dilated convolutional neural network (1D-DCNN) for speech emotion recognition (SER) that uses hierarchical feature learning blocks (HFLBs) with a bidirectional gated recurrent unit (BiGRU). A one-dimensional CNN first enhances the speech signals via spectral analysis and extracts their hidden patterns, which are fed into a stack of one-dimensional dilated blocks called HFLBs. Each HFLB contains one dilated convolution layer (DCL), one batch normalization (BN) layer, and one leaky ReLU layer, extracting emotional features with a hierarchical correlation strategy. The learned emotional features are then fed into a BiGRU to adjust the global weights and capture temporal cues. The final state of the deep BiGRU is passed to a softmax classifier to produce the emotion probabilities. The proposed model was evaluated on three benchmark datasets, IEMOCAP, EMO-DB, and RAVDESS, achieving 72.75%, 91.14%, and 78.01% accuracy, respectively.
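A hedged PyTorch sketch of one HFLB (dilated Conv1d + BN + leaky ReLU) feeding a BiGRU classifier follows; the channel sizes, dilation rates, and four emotion classes are illustrative assumptions, and the softmax is left to the loss function as is conventional in PyTorch.

```python
# Sketch: stacked dilated conv blocks (HFLB-style) followed by a BiGRU emotion classifier.
import torch
import torch.nn as nn

def hflb(in_ch, out_ch, dilation):
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=3, dilation=dilation, padding=dilation),
        nn.BatchNorm1d(out_ch),
        nn.LeakyReLU(0.2),
    )

class DCNN_BiGRU(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.blocks = nn.Sequential(hflb(1, 32, 1), hflb(32, 64, 2), hflb(64, 64, 4))
        self.bigru = nn.GRU(64, 64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                        # x: (N, 1, samples)
        f = self.blocks(x).transpose(1, 2)        # (N, T, 64)
        _, h = self.bigru(f)                      # h: (2, N, 64), final states of both directions
        return self.head(torch.cat([h[0], h[1]], dim=1))   # logits; softmax applied in the loss

print(DCNN_BiGRU()(torch.randn(2, 1, 16000)).shape)        # torch.Size([2, 4])
```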
Integrated with sensors, processors, and radio frequency (RF) communication modules, an intelligent bearing can achieve autonomous perception and decision-making, guaranteeing safety and reliability during use. However, because of the resource limitations of the end device, the processors in an intelligent bearing cannot carry the computational load of deep learning models such as convolutional neural networks (CNNs), which involve a great number of multiplication operations. To minimize the computational cost of a conventional CNN, and based on the idea of AdderNet, this paper proposes a 1-D adder neural network with a wide first-layer kernel (WAddNN) suitable for bearing fault diagnosis. The proposed method uses the l1-norm distance between filters and input features as the output response, making the whole network almost free of multiplication. The model takes the original signal as input, uses a wide kernel in the first adder layer to extract features and suppress high-frequency noise, and then uses two layers of small kernels for nonlinear mapping. Experimental comparisons with CNN models of the same structure show that WAddNN achieves similar accuracy with significantly reduced computational cost. The proposed model provides a new fault diagnosis method for intelligent bearings with limited resources.
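The sketch below shows one way to express an AdderNet-style 1-D layer in PyTorch: the output response is the negative l1-norm distance between each filter and the corresponding input patch. The wide 64-sample first kernel and channel counts are assumptions; a tensor-library implementation like this still uses floating-point ops internally, so it only illustrates the computation, not the hardware saving.

```python
# Illustrative 1-D "adder" layer: response = -sum(|patch - filter|) over each sliding window.
import torch
import torch.nn as nn

class Adder1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size))
        self.kernel_size, self.stride = kernel_size, stride

    def forward(self, x):                                     # x: (N, C, L)
        patches = x.unfold(2, self.kernel_size, self.stride)  # (N, C, L_out, k)
        diff = patches.unsqueeze(1) - self.weight[None, :, :, None, :]
        return -diff.abs().sum(dim=(2, 4))                    # (N, out_ch, L_out)

# Wide first-layer kernel in the spirit of the described WAddNN (sizes assumed).
layer = Adder1d(in_ch=1, out_ch=16, kernel_size=64, stride=8)
print(layer(torch.randn(4, 1, 2048)).shape)                   # torch.Size([4, 16, 249])
```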
Pulse pile-up is a problem in nuclear spectroscopy and nuclear reaction studies that occurs when two pulses overlap and distort each other, degrading the quality of energy and timing information. Different methods, both digital and analogue, have been used for pile-up rejection, but some pile-up events may contain pulses of interest and need to be reconstructed. This paper proposes a new method for reconstructing pile-up events acquired with a neutron detector array (NEDA) using a one-dimensional convolutional autoencoder (1D-CAE). The datasets for training and testing the 1D-CAE are created from data acquired with the NEDA. The new pile-up signal reconstruction method is evaluated in terms of how similar the reconstructed signals are to the original ones. It is further analysed through neutron-gamma discrimination based on charge comparison, comparing the results obtained from the original and the reconstructed signals.
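A minimal PyTorch sketch of a 1-D convolutional autoencoder of the kind described: a strided Conv1d encoder and a transposed-convolution decoder returning a waveform of the same length. The 256-sample pulse length, kernel sizes, and channel counts are assumptions.

```python
# Sketch of a 1D convolutional autoencoder (1D-CAE) for pulse reconstruction.
import torch
import torch.nn as nn

class CAE1d(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, 9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, 9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, 9, stride=2, padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, 9, stride=2, padding=4, output_padding=1),
        )

    def forward(self, x):                        # x: (N, 1, 256) piled-up pulse
        return self.decoder(self.encoder(x))     # reconstructed pulse, same shape

model = CAE1d()
pileup = torch.randn(8, 1, 256)                  # placeholder waveforms
recon = model(pileup)
loss = nn.functional.mse_loss(recon, pileup)     # in practice trained against clean target pulses
print(recon.shape, float(loss))
```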
Effective features are essential for fault diagnosis. Due to the faint characteristics of a single line-to-ground (SLG) fault, fault line detection is a challenge in resonant grounding distribution systems. This paper proposes a novel fault line detection method using waveform fusion and a one-dimensional convolutional neural network (1-D CNN). After an SLG fault occurs, the first-half waves of the zero-sequence currents are collected and superimposed on each other to achieve waveform fusion. The distinguishing features of the fused waveforms are extracted by the 1-D CNN to determine whether the source of a fused waveform contains the fault line. The 1-D CNN output is then used to update a counter in order to identify the fault line. Given the lack of fault data in existing distribution systems, the proposed method needs only a small quantity of data for model training and fault line detection. In addition, the method is fault-tolerant: even if a few samples are misjudged, the fault line can still be detected correctly from the full set of 1-D CNN outputs. Experimental results verified that the proposed method works effectively under various fault conditions.
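One plausible reading of the fusion-plus-counter logic is sketched below: zero-sequence first-half waves from candidate feeders are superimposed pairwise, an (untrained, placeholder) 1-D CNN scores whether each fused waveform involves the fault line, and a per-line counter accumulates positive decisions. The pairwise pairing scheme, six feeders, and 100-sample window are assumptions, not the paper's exact procedure.

```python
# Assumed sketch of waveform fusion + 1-D CNN scoring + counter-based fault-line voting.
import numpy as np
import torch
import torch.nn as nn

currents = np.random.randn(6, 100).astype(np.float32)    # 6 feeders, first-half-wave samples (placeholder)
cnn = nn.Sequential(nn.Conv1d(1, 8, 7, padding=3), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
                    nn.Flatten(), nn.Linear(8, 2))        # untrained placeholder classifier

counter = np.zeros(len(currents), dtype=int)
for i in range(len(currents)):
    for j in range(i + 1, len(currents)):
        fused = torch.from_numpy(currents[i] + currents[j]).view(1, 1, -1)   # waveform fusion
        if cnn(fused).argmax(dim=1).item() == 1:          # class 1 = "contains the fault line"
            counter[i] += 1
            counter[j] += 1
suspect = int(np.argmax(counter))                          # line with the most positive votes
print(counter, suspect)
```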
Recognition of dynamic hand gestures in real time is a difficult task because the system can never know when or where a gesture starts and ends in a video stream. Many researchers have worked on vision-based gesture recognition because of its wide range of applications. This paper proposes a deep learning architecture based on the combination of a 3D Convolutional Neural Network (3D-CNN) and a Long Short-Term Memory (LSTM) network. The proposed architecture extracts spatial-temporal information from input video sequences while avoiding extensive computation. The 3D-CNN extracts spectral and spatial features, which are then passed to the LSTM network for classification. The proposed model is a lightweight architecture with only 3.7 million training parameters. It was evaluated on 15 classes from the publicly available 20BN-jester dataset and trained on 2000 video clips per class, split into 80% training and 20% validation sets. Accuracies of 99% and 97% were achieved on the training and testing data, respectively. We further show that combining a 3D-CNN with an LSTM gives superior results compared with MobileNetV2 + LSTM.
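A hedged PyTorch sketch of a 3D-CNN front end feeding an LSTM classifier follows; the clip shape, pooling schedule, hidden size, and 15 classes are illustrative assumptions rather than the authors' exact architecture.

```python
# Sketch: 3-D convolutions extract per-clip features, an LSTM classifies the temporal sequence.
import torch
import torch.nn as nn

class C3DLSTM(nn.Module):
    def __init__(self, n_classes=15):
        super().__init__()
        self.c3d = nn.Sequential(
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d((2, 2, 2)),
            nn.AdaptiveAvgPool3d((8, 1, 1)),             # 8 temporal feature steps
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                                # x: (N, 3, frames, H, W)
        f = self.c3d(x).squeeze(-1).squeeze(-1)           # (N, 32, 8)
        out, _ = self.lstm(f.transpose(1, 2))             # (N, 8, 64)
        return self.fc(out[:, -1])                        # logits from the last time step

print(C3DLSTM()(torch.randn(2, 3, 16, 64, 64)).shape)     # torch.Size([2, 15])
```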
Memristor-based neuromorphic computing shows great potential for high-speed and high-throughput signal processing applications, such as electroencephalogram (EEG) signal processing. Nonetheless, the size of one-transistor one-resistor (1T1R) memristor arrays is limited by device non-idealities, which prevents the hardware implementation of large and complex networks. In this work, we propose the depthwise separable convolution and bidirectional gated recurrent unit (DSC-BiGRU) network, a lightweight and highly robust hybrid neural network based on 1T1R arrays that efficiently processes EEG signals in the temporal, frequency, and spatial domains by combining DSC and BiGRU blocks. The network size is reduced and robustness is improved while classification accuracy is maintained. In simulation, the measured non-idealities of the 1T1R array are introduced into the network through statistical analysis. Compared with traditional convolutional networks, the number of network parameters is reduced by 95% and the classification accuracy is improved by 21% at a 95% array yield rate and 5% tolerable error. This work demonstrates that lightweight and highly robust networks based on memristor arrays hold great promise for applications that require low power consumption and high efficiency.
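As a software-level analogue only (the paper targets 1T1R memristor hardware), the sketch below expresses the DSC-BiGRU idea with PyTorch: depthwise separable 1-D convolutions (grouped conv plus pointwise conv) followed by a BiGRU. The 22 EEG channels, 1000-sample window, and 4 classes are assumptions.

```python
# Software analogue of DSC-BiGRU: depthwise separable 1-D conv blocks + bidirectional GRU.
import torch
import torch.nn as nn

def dsc1d(in_ch, out_ch, k=5):
    return nn.Sequential(
        nn.Conv1d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),   # depthwise convolution
        nn.Conv1d(in_ch, out_ch, 1),                                 # pointwise convolution
        nn.BatchNorm1d(out_ch), nn.ReLU(),
    )

class DSC_BiGRU(nn.Module):
    def __init__(self, eeg_channels=22, n_classes=4):
        super().__init__()
        self.dsc = nn.Sequential(dsc1d(eeg_channels, 32), nn.MaxPool1d(4), dsc1d(32, 64))
        self.bigru = nn.GRU(64, 32, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                        # x: (N, channels, samples)
        f = self.dsc(x).transpose(1, 2)
        _, h = self.bigru(f)                     # final hidden states of both directions
        return self.fc(torch.cat([h[0], h[1]], dim=1))

print(DSC_BiGRU()(torch.randn(2, 22, 1000)).shape)   # torch.Size([2, 4])
```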
To solve the problem of low accuracy in construction project duration prediction, this paper proposes a project risk prediction model based on an attention mechanism, a one-dimensional convolutional neural network (1D-CNN), and a BP neural network (the 1D-CNN-Attention-BP combined model). First, the literature analysis method is used to select the risk evaluation indicators of the construction project, and the attention mechanism is used to determine the weight of each risk factor in construction-period prediction. Then, the BP neural network is used to predict the project duration, and accuracy, cross-entropy loss, and F1 score are selected to comprehensively evaluate the performance of the 1D-CNN-Attention-BP combined model. The experimental results show that the duration risk prediction accuracy of the proposed model exceeds 90%, which meets the requirement of high-accuracy risk prediction for construction projects.
In recent years, wearable-device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which yields only basic features. The data captured by wearable sensors contain richer features that can be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can degrade data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not well suited to real-time environments. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. First, the required signals and sensor data are collected from standard databases. Wave features are then extracted from these signals, and the extracted wave features together with the sensor data serve as the input for recognizing human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by combining a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is used to fine-tune the network parameters and enhance recognition. An experimental evaluation against various deep learning networks and heuristic algorithms confirms the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with 95.36% accuracy, 95.25% recall, 95.48% specificity, and 95.47% precision. The results show that the developed model recognizes human actions effectively in less time, and it reduces computational complexity and overfitting through the optimization approach.
To reduce the risk of non-performing loans and losses, and to improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model with a 1DCNN-attention network and enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing and a stack of multiple hybrid modules. First, the enhanced data preprocessing combines standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and class imbalance but also removes redundant features while improving feature representation. Subsequently, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
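A sketch of a comparable preprocessing chain with standard scikit-learn and imbalanced-learn tools is shown below (SMOTE oversampling, standardization, RFE, PCA); the feature-construction and information-value steps are omitted, and the data, feature counts, and component counts are placeholders.

```python
# Sketch of a loan-data preprocessing chain: SMOTE -> standardization -> RFE -> PCA.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE

X = np.random.rand(500, 30)                      # hypothetical loan features
y = (np.random.rand(500) > 0.9).astype(int)      # imbalanced approval labels

X, y = SMOTE(random_state=0).fit_resample(X, y)                                # rebalance classes
X = StandardScaler().fit_transform(X)                                          # standardization
X = RFE(LogisticRegression(max_iter=1000), n_features_to_select=15).fit_transform(X, y)  # drop weak features
X = PCA(n_components=10).fit_transform(X)                                      # decorrelated final features
print(X.shape)
```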
Growing demand for seafood and reduced fishery harvests have driven intensive marine aquaculture in coastal regions, which may cause severe coastal water problems without adequate environmental management. Effective mapping of mariculture areas is essential for protecting coastal environments. However, because of their limited spatial extent and complex structures, it is still challenging for traditional methods to accurately extract mariculture areas from medium-spatial-resolution (MSR) images. To solve this problem, we propose the full-resolution cascade convolutional neural network (FRCNet), which maintains effective features throughout training, to identify mariculture areas in MSR images. Specifically, FRCNet uses a sequential full-resolution neural network as the first-level subnetwork and gradually aggregates higher-level subnetworks in a cascade. Meanwhile, a repeated fusion strategy allows features to receive information from different subnetworks simultaneously, producing rich and representative features. As a result, FRCNet can effectively recognize different kinds of mariculture areas in MSR images. Results show that FRCNet outperforms other classical and recently proposed methods. The developed method can provide valuable datasets for large-scale and intelligent modeling of marine aquaculture management and coastal zone planning.
Because the vibration signal of a rotating machine is one-dimensional and a large convolution kernel provides a larger receptive field, a one-dimensional large-kernel convolutional neural network (1DLCNN) is designed on the basis of the classical convolutional neural network model LeNet-5. Since the hyper-parameters of the 1DLCNN have a large impact on network performance, a genetic algorithm (GA) is used to optimize them; this GA-based optimization of the 1DLCNN parameters is named GA-1DLCNN. The experimental results show that the optimal network model obtained with the GA-1DLCNN method achieves 99.9% fault diagnosis accuracy, much higher than that of other traditional fault diagnosis methods. In addition, the 1DLCNN is compared with a one-dimensional small-kernel convolutional neural network (1DSCNN) and the classical two-dimensional convolutional neural network model. With input sample lengths of 128, 256, 512, 1024, and 2048, the final diagnostic accuracy results and the visual scatter plots show that the 1DLCNN performs best.
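A toy genetic-algorithm loop over two hyper-parameters (first-layer kernel size and channel count) is sketched below to make the GA-1DLCNN idea concrete; the `fitness` function is a stand-in for training the 1D-CNN and returning validation accuracy, and the population size, operators, and candidate values are assumptions.

```python
# Toy GA over 1D-CNN hyper-parameters: selection, one-point crossover, mutation.
import random

def fitness(kernel_size, channels):
    """Placeholder: in practice, train the 1D-CNN with these hyper-parameters and
    return the validation accuracy."""
    return -abs(kernel_size - 64) - abs(channels - 32) + random.random()

population = [(random.choice([16, 32, 64, 128]), random.choice([8, 16, 32, 64]))
              for _ in range(10)]
for generation in range(20):
    scored = sorted(population, key=lambda p: fitness(*p), reverse=True)
    parents = scored[:4]                                    # selection of the fittest
    children = []
    while len(children) < len(population) - len(parents):
        a, b = random.sample(parents, 2)
        child = [a[0], b[1]]                                # one-point crossover
        if random.random() < 0.2:                           # mutation
            child[0] = random.choice([16, 32, 64, 128])
        children.append(tuple(child))
    population = parents + children
print("best hyper-parameters:", max(population, key=lambda p: fitness(*p)))
```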