Funding: This work was supported by the Key Lab of Intelligent and Green Flexographic Printing under Grant ZBKT202301.
Abstract: Current spatio-temporal action detection methods are limited in their ability to extract and comprehend spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict actions and localize objects efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features for subsequent spatio-temporal modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the insufficiently constrained predictions of traditional methods. This loss function improves classification accuracy and brings the regressed positions closer to the ground-truth objects. The proposed model is evaluated on the popular UCF101-24 and JHMDB-21 datasets. Experimental results show that the proposed method achieves 81.52% on the Frame-mAP metric, surpassing existing methods.
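Abstract: To address the low accuracy of high-resistance fault detection, the long fault detection time, and the incomplete faulty-pole selection of traditional ultra-high-voltage direct current (UHVDC) protection, a relay protection fault detection method for UHVDC transmission lines is proposed, based on a long short-term memory (LSTM) recurrent neural network (RNN). First, the transient fault characteristics of the UHVDC transmission system are analyzed with the fast Fourier transform, and fault feature quantities are extracted as input data using phase-mode transformation and the wavelet transform. Second, the input data are fed into the LSTM-RNN for forward propagation so that the network learns deep representations of the system fault features, while backpropagation updates the network parameters. The deep features are then passed to a Softmax classifier, which identifies the fault as an external fault, a bus fault, or a line fault, classifies it as a positive-pole, negative-pole, or bipolar fault, and outputs the result. Finally, a UHVDC transmission model is built in the PSCAD/EMTDC simulation environment. The verification results show that the proposed method performs better in fault detection and faulty-pole selection for UHVDC transmission line relay protection. Compared with an artificial neural network, a convolutional neural network, and a support vector machine, it improves fault identification accuracy by 4.71%, 6.57%, and 9.32%, respectively.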
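The final classification stage described in the abstract above can be illustrated with a small sketch. It only shows how a Softmax layer turns raw scores into class probabilities for the two tasks the abstract names. The label lists, the example scores, and the helper names `softmax` and `classify` are assumptions for illustration, not the authors' implementation, and the LSTM-RNN that would produce the scores is omitted.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical label sets, taken from the two tasks named in the abstract.
ZONE_LABELS = ["external fault", "bus fault", "line fault"]
POLE_LABELS = ["positive-pole fault", "negative-pole fault", "bipolar fault"]

def classify(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up scores standing in for the network's deep features.
zone, zone_p = classify([0.2, -1.0, 2.5], ZONE_LABELS)
pole, pole_p = classify([1.8, 0.1, -0.4], POLE_LABELS)
print(zone, round(zone_p, 3), "|", pole, round(pole_p, 3))
```

Keeping the zone and pole decisions as two separate Softmax outputs is one possible reading of the abstract; a single joint classifier over all combinations would be another.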
Abstract: This study takes PetroChina Company Limited as an example and investigates stock price prediction using an RNN model and an LSTM model. Because the RNN model suffers from vanishing or exploding gradients, it has significant shortcomings when processing long sequences of stock price data. It struggles to capture long-term dependencies in the price sequence and performs poorly on data containing long-term trends and seasonal changes. The LSTM model is therefore introduced. With its input gate, forget gate, and output gate, the LSTM model addresses the long-term dependency problem: it can selectively remember or forget information and thus handle long-sequence data effectively. The experimental results confirm that the LSTM model not only tracks the real trend of the stock price closely but also outperforms the RNN model on all model evaluation indicators. In conclusion, the LSTM model performs well in predicting PetroChina's stock price and is better suited to stock prediction tasks than the RNN model.
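The gate mechanism mentioned in the abstract above can be made concrete with a minimal sketch of one LSTM time step. It uses the standard LSTM cell equations on scalar values so that each gate is easy to follow. The parameter values, the `lstm_step` helper, and the price sequence are made up for illustration and are not taken from the study.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to (input weight, recurrent weight, bias).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the new candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate cell content
    c = f * c_prev + i * g            # additive update eases gradient flow
    h = o * math.tanh(c)
    return h, c

# Hypothetical, untrained parameters shared by all gates.
params = {name: (0.5, 0.5, 0.0)
          for name in ("forget", "input", "output", "candidate")}

h, c = 0.0, 0.0
for price in [0.10, 0.12, 0.11, 0.15]:  # made-up normalized prices
    h, c = lstm_step(price, h, c, params)
print(h, c)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is how the cell can carry information across long sequences.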
Funding: National Natural Science Foundation of China (Nos. 61673017, 61403398) and Natural Science Foundation of Shaanxi Province (Nos. 2017JM6077, 2018ZDXM-GY-039).
Abstract: Based on the characteristics of road features, an Encoder-Decoder deep semantic segmentation network is designed for road extraction from remote sensing images. First, because road targets are rich in local detail but simple in semantic features, an Encoder-Decoder network with shallow layers and high resolution is designed to improve the representation of detail information. Second, because roads occupy only a small proportion of a remote sensing image, the cross-entropy loss function is improved to address the imbalance between positive and negative samples during training. Experiments on large road extraction datasets show that the proposed method achieves a recall of 83.9%, a precision of 82.5%, and an F1-score of 82.9%, and can extract road targets from remote sensing images completely and accurately. The Encoder-Decoder network designed in this paper performs well on the road extraction task and requires less manual intervention, so it has good application prospects.
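The abstract above does not say how the cross-entropy loss was improved. A common way to counter a shortage of positive (road) pixels is to weight the positive term, and the sketch below assumes that approach purely for illustration. It also shows how precision, recall, and F1-score are computed from binary predictions. The function names, the weight choice, and the toy data are hypothetical.

```python
import math

def weighted_bce(probs, labels, pos_weight):
    """Mean binary cross-entropy with the positive term scaled by pos_weight."""
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def precision_recall_f1(preds, labels):
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy "image" of 10 pixels in which only 2 are road (label 1).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.1, 0.2, 0.1, 0.6, 0.1, 0.1, 0.2, 0.1, 0.4, 0.9]

# Assumed heuristic: weight positives by the negative-to-positive ratio.
pos_weight = labels.count(0) / labels.count(1)
print(weighted_bce(probs, labels, 1.0), weighted_bce(probs, labels, pos_weight))

preds = [1 if p >= 0.5 else 0 for p in probs]
print(precision_recall_f1(preds, labels))  # (0.5, 0.5, 0.5)
```

With the weight applied, the missed road pixel (probability 0.4) contributes four times as much to the loss as it would unweighted, which pushes training to pay more attention to the rare class.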
Abstract: The growth of multimedia content has caused a massive increase in network traffic for video streaming, which demands solutions that preserve the user's Quality of Experience (QoE). 360-degree videos have taken users by storm. However, users focus on only a part of a 360-degree video, known as the viewport. Despite the immense hype, 360-degree video has an unpleasant side effect related to viewport prediction: viewers can feel uncomfortable because the user's viewport must be pre-fetched in advance. Ideally, bandwidth consumption can be minimized if the user's motion is known in advance. Based on this problem definition, we propose an Encoder-Decoder based Long Short-Term Memory (LSTM) model to capture the non-linear relationship between past and future viewport positions more accurately. The model takes transformed data rather than the direct input to predict future user movement. This prediction model is then combined with a rate adaptation approach that assigns bitrates to the tiles of 360-degree video frames under a given network capacity. Hence, the proposed work aims to improve system performance when QoE parameters are jointly optimized. Experiments were carried out and compared with existing work to demonstrate the performance of the proposed model. Finally, the experimental implementation of the proposed work provides higher user QoE than its competitors.
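The rate adaptation step in the abstract above (assigning bitrates to tiles under a capacity limit) can be illustrated with a simple greedy sketch. The abstract does not describe the actual algorithm, so everything here is an assumption: the `allocate_bitrates` function, the bitrate ladder, the per-tile viewing probabilities (which a viewport predictor would supply), and the greedy policy itself.

```python
def allocate_bitrates(view_prob, ladder, capacity):
    """Greedy tile-rate assignment under a total capacity budget.

    Every tile starts at the lowest rung of the bitrate ladder. Tiles are
    then visited from most to least likely to be viewed, and each one is
    upgraded as far as the remaining budget allows.
    """
    levels = [0] * len(view_prob)
    used = ladder[0] * len(view_prob)
    if used > capacity:
        raise ValueError("capacity cannot cover the lowest bitrate on all tiles")

    order = sorted(range(len(view_prob)), key=lambda t: view_prob[t], reverse=True)
    for t in order:
        while levels[t] + 1 < len(ladder):
            step = ladder[levels[t] + 1] - ladder[levels[t]]
            if used + step > capacity:
                break
            levels[t] += 1
            used += step
    return [ladder[level] for level in levels], used

# Hypothetical inputs: 4 tiles, a 3-rung ladder in Mbps, 9 Mbps of capacity.
view_prob = [0.05, 0.60, 0.30, 0.05]
ladder = [1, 2, 4]
rates, used = allocate_bitrates(view_prob, ladder, capacity=9)
print(rates, used)  # [2, 4, 2, 1] 9
```

The tile most likely to fall inside the viewport receives the top bitrate first, and the leftover budget trickles down to less likely tiles. A jointly optimized QoE objective, as the abstract mentions, would replace this heuristic with an explicit optimization.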
Funding: Fundamental Research Funds for the Central Universities (Grant No. FRF-TP-19-006A3).
Abstract: Heart disease is a common and high-risk condition that seriously threatens people's health. In the era of the Internet of Things (IoT), smart medical devices are of great practical value to medical workers and patients because they can assist in diagnosing diseases. Research on real-time diagnosis and classification algorithms for arrhythmia can therefore help improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to account for the influence of time-series features on the classification results. It is trained and tested on the MIT-BIH arrhythmia database. In addition, a Generative Adversarial Network (GAN) is adopted as a data balancing method to address the data imbalance problem. The simulation results show that, for inter-patient arrhythmia classification, the hybrid model combining the CNN and the Encoder-Decoder model achieves the best classification accuracy, reaching 94.05%. It performs especially well on supraventricular ectopic beats (class S) and fusion beats (class F).
Funding: Supported by the National Natural Science Foundation of China (No. 41906169) and the PLA Academy of Military Sciences.
Abstract: Noise reduction analysis of signals is essential for modern underwater acoustic detection systems. Traditional noise reduction techniques gradually lose efficacy because the target signal is masked by biological and natural noise in the marine environment. A feature extraction method that combines time-frequency spectrograms with deep learning can effectively separate noise from target signals. A fully convolutional encoder-decoder neural network (FCEDN) is proposed to address noise reduction in underwater acoustic signals. During denoising, the time-domain waveform of the underwater acoustic signal is converted into a wavelet low-frequency analysis recording spectrogram to preserve as many signal characteristics as possible. The FCEDN is built to learn, at each time level, the spectrogram mapping between noise and target signals. Transposed convolutions are introduced to transform the spectrogram features of the signals into listenable audio files. Evaluated on the ShipsEar dataset, the proposed method increases SNR and SI-SNR by 10.02 dB and 9.5 dB, respectively.
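The two metrics reported in the abstract above can be written out as a short sketch. It uses the commonly used definitions of signal-to-noise ratio (SNR) and scale-invariant SNR (SI-SNR) on time-domain samples. The abstract does not state which exact variants were used, and the test signal below is synthetic.

```python
import math

def energy(x):
    return sum(v * v for v in x)

def zero_mean(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def snr_db(clean, estimate):
    """SNR in dB: clean-signal energy over the energy of the residual."""
    residual = [e - c for c, e in zip(clean, estimate)]
    return 10.0 * math.log10(energy(clean) / energy(residual))

def si_snr_db(clean, estimate):
    """Scale-invariant SNR in dB.

    The estimate is projected onto the clean signal; whatever is left over
    counts as noise, so rescaling the estimate does not change the score.
    """
    c, e = zero_mean(clean), zero_mean(estimate)
    scale = sum(a * b for a, b in zip(e, c)) / energy(c)
    target = [scale * v for v in c]
    residual = [a - b for a, b in zip(e, target)]
    return 10.0 * math.log10(energy(target) / energy(residual))

# Synthetic example: a sine tone plus a small deterministic disturbance.
clean = [math.sin(2 * math.pi * n / 16) for n in range(64)]
noisy = [c + 0.1 * math.cos(7 * n) for n, c in enumerate(clean)]
louder = [2.0 * v for v in noisy]  # same content, doubled amplitude

print(snr_db(clean, noisy), snr_db(clean, louder))
print(si_snr_db(clean, noisy), si_snr_db(clean, louder))
```

Doubling the amplitude of the estimate sharply lowers plain SNR but leaves SI-SNR unchanged, which is why SI-SNR is often reported alongside SNR for denoising systems whose output gain is arbitrary.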
Abstract: Tunnel boring machines (TBMs) have been widely utilised in tunnel construction due to their high efficiency and reliability. Accurately predicting TBM performance can improve project time management, cost control, and risk management. This study aims to use deep learning to develop real-time models for predicting the penetration rate (PR). The models are built using data from the Changsha Metro project, and their performance is evaluated using unseen data from the Zhengzhou Metro project. In one-step forecasting, the predicted penetration rate follows the trend of the measured penetration rate in both training and testing. The autoregressive integrated moving average (ARIMA) model is compared with the recurrent neural network (RNN) model. The results show that univariate models, which consider only the historical penetration rate, perform better than multivariate models that take into account multiple geological and operational parameters (GEO and OP). Next, an RNN variant combining the time series of penetration rate with the last-step geological and operational parameters is developed, and it performs better than the other models. A sensitivity analysis shows that the penetration rate is the most important parameter, while the other parameters have a smaller impact on time series forecasting. It is also found that smoothed data are easier to predict with high accuracy. Nevertheless, over-simplified data can lose the real characteristics of the time series. In conclusion, the RNN variant can accurately predict the next-step penetration rate, and data smoothing is crucial in time series forecasting. This study provides guidance for TBM performance forecasting in engineering practice.
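The data preparation implied by the abstract above (smoothing the series, then forecasting one step ahead from the recent history) can be sketched as follows. The abstract does not name the smoothing method or the look-back length, so the trailing moving average, the window sizes, the helper names, and the penetration-rate values are all assumptions for illustration.

```python
def moving_average(series, window):
    """Trailing moving average.

    The first window-1 outputs average only the points seen so far.
    """
    out = []
    running = 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window]  # drop the point leaving the window
        out.append(running / min(i + 1, window))
    return out

def one_step_pairs(series, lookback):
    """Build (history, next value) training pairs for one-step forecasting."""
    return [(series[i - lookback:i], series[i])
            for i in range(lookback, len(series))]

# Made-up penetration-rate readings (e.g. mm/min).
pr = [10, 14, 9, 15, 11, 16]

smoothed = moving_average(pr, window=3)
pairs = one_step_pairs(pr, lookback=3)
print(smoothed)
print(pairs[0])  # ([10, 14, 9], 15)
```

A larger window gives a smoother and more predictable series but, as the abstract cautions, it also removes real fluctuations, so the window size trades accuracy on paper against fidelity to the measured signal.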