Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computational cost make them challenging to deploy on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise separable convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational cost. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that processes the input feature map with parallel multi-resolution branches in order to extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
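As a rough illustration of why depthwise separable and dilated convolutions are cheap, the sketch below counts the weights of a standard convolution against a depthwise-plus-pointwise pair and computes the spatial extent of a dilated kernel. The layer sizes (3×3 kernel, 64 input channels, 128 output channels, dilation rate 2) are assumptions chosen for illustration and do not come from the paper; biases are ignored and the function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print("standard:", std, "separable:", sep)
print("fraction of weights saved:", round(1 - sep / std, 4))
print("extent of a 3x3 kernel at dilation 2:", effective_kernel_size(3, 2))
```

Dilation spreads the same weights over a wider window, so the larger field of view in this sketch comes with no additional parameters.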
Funding: This work was funded by the foundation of the Liaoning Educational Committee under Grant No. 2019LNJC03.
Abstract: As the use of facial attributes continues to expand, research into facial age estimation is also developing. Because face images are easily affected by factors such as illumination and occlusion, estimating age from faces is challenging. In view of the complexity of the environment and the limited computing ability of devices, this paper proposes a face age estimation algorithm based on a lightweight convolutional neural network. Building on the Soft Stagewise Regression Network (SSR-Net) and facial images, the paper employs the Center Symmetric Local Binary Pattern (CSLBP) method to obtain a feature image and then combines the face image and the feature image as the network input. Adding feature images to the convolutional neural network can improve accuracy and increase the robustness of the model. Experimental results on the IMDB-WIKI and MORPH 2 datasets show that the proposed lightweight convolutional neural network reduces model complexity and increases the accuracy of face age estimation.
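The CSLBP feature image mentioned above can be sketched as follows. This is a minimal version of the commonly used center-symmetric LBP definition (8 neighbours at radius 1, four opposing pairs, one bit per pair); the abstract does not state the neighbourhood, the threshold, or the bit order, so those choices and the function name `cs_lbp` are assumptions.

```python
def cs_lbp(image, threshold=0):
    """Return the CS-LBP code image for a 2D list of grayscale values.

    Border pixels are skipped, so the output is 2 rows and 2 columns smaller.
    """
    # Four center-symmetric neighbour pairs at radius 1: (dy, dx) and its opposite.
    pairs = [((0, 1), (0, -1)),
             ((1, 1), (-1, -1)),
             ((1, 0), (-1, 0)),
             ((1, -1), (-1, 1))]
    height, width = len(image), len(image[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            code = 0
            for bit, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
                diff = image[y + dy1][x + dx1] - image[y + dy2][x + dx2]
                if diff > threshold:      # set the bit when the pair differs enough
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


# Tiny hypothetical 3 x 3 patch, brighter on the right and at the bottom.
patch = [[10, 10, 50],
         [10, 20, 50],
         [40, 40, 90]]
print(cs_lbp(patch))  # [[7]]
```

Each code lies in the range 0 to 15, so the resulting feature image could be stacked with the grayscale face image as an extra input channel, in the spirit of the combination described in the abstract.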
Funding: This work was financially supported by the Basic Public Welfare Research Project of Zhejiang Province (Grant No. LGN20E050007).
Abstract: In viticulture, there is an increasing demand for automatic winter grapevine pruning devices. Detecting pruning locations in vineyard images is a necessary task for such devices and can be automated with computer vision methods. In this study, a novel 2D grapevine winter pruning location detection method is proposed for automatic winter pruning with a Y-shaped cultivation system. The method consists of four steps. First, the vineyard image is segmented by thresholding the two-times-red-minus-green-minus-blue (2R−G−B) channel and the S channel. Second, the grapevine skeleton is extracted with the Improved Enhanced Parallel Thinning Algorithm (IEPTA). Third, the structure of each grapevine is found by judging the angle and distance relationships between branches. Fourth, bounding boxes are obtained from these grapevines, and a pre-trained MobileNetV3_small×0.75 classifies each bounding box to find the pruning location. In the detection experiment, the method achieved a precision of 98.8% and a recall of 92.3% for bud detection, an accuracy of 83.4% for pruning location detection, and a total time of 0.423 s. The results indicate that the proposed 2D pruning location detection method has decent robustness and high precision and could guide automatic devices to winter prune efficiently.
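The first step (colour thresholding) might look roughly like the sketch below. The abstract names the 2R−G−B channel and the S channel but gives neither the threshold values nor how the two tests are combined, so the thresholds, the logical AND, and the function name `segment_vine` are all assumptions made for illustration.

```python
import colorsys


def segment_vine(pixels, exr_threshold=40, s_threshold=0.3):
    """Return a binary mask: 1 where a pixel passes both thresholds.

    pixels is a 2D list of (R, G, B) tuples with 8-bit values.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            excess_red = 2 * r - g - b  # the 2R-G-B channel
            # Saturation (S) from the HSV colour space, in the range 0..1.
            _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            keep = excess_red > exr_threshold and s > s_threshold
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask


# Hypothetical pixels: brownish cane, grey sky, green leaf.
row = [(150, 90, 60), (128, 128, 128), (60, 160, 60)]
print(segment_vine([row]))  # [[1, 0, 0]]
```

In this toy example only the brownish pixel survives, because it is both red-dominant and saturated; real vineyard images would need thresholds tuned on data.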
Funding: National Natural Science Foundation of China (Grant No. 61601034); National Natural Science Foundation of China (Grant No. 31871525); Promotion and Innovation of Beijing Academy of Agriculture and Forestry Sciences.
Abstract: Automated recognition of insect categories, which is currently performed mainly by agricultural experts, is a challenging problem that has received increasing attention in recent years. The goal of this research is to develop an intelligent recognition system based on deep neural networks that recognizes garden insects and can be conveniently deployed on mobile terminals. State-of-the-art lightweight convolutional neural networks (such as SqueezeNet and ShuffleNet) reach the same accuracy as classical convolutional neural networks such as AlexNet with fewer parameters, thereby not only requiring less communication across servers during distributed training but also being more feasible to deploy on mobile terminals and other hardware with limited memory. In this research, we combine the rich details of low-level network features with the rich semantic information of high-level network features to construct feature maps with richer semantic information, which effectively improves the SqueezeNet model at a small computational cost. In addition, we developed offline insect recognition software that can be deployed on the mobile terminal to address the lack of network access and the time-delay problems in the field. Experiments demonstrate that the proposed method is promising for recognition while remaining within a limited computational budget, and that it delivers a much higher recognition accuracy of 91.64% with less training time than other classical convolutional neural networks. We have also verified that the improved SqueezeNet model achieves 2.3% higher accuracy than the original model on the open insect dataset IP102.
Funding: Supported by National Key Research and Development Program sub-topics [2018YFF0213606-03 (Mu Y., Hu T.L., Gong H., Li S.J. and Sun Y.H.), http://www.most.gov.cn]; Jilin Province Science and Technology Development Plan focuses on research and development projects [20200402006NC (Mu Y., Hu T.L., Gong H. and Li S.J.), http://kjt.jl.gov.cn]; Science and technology support project for key industries in southern Xinjiang [2018DB001 (Gong H. and Li S.J.), http://kjj.xjbt.gov.cn]; Key technology R&D project of Changchun Science and Technology Bureau of Jilin Province [21ZGN29 (Mu Y., Bao H.P., Wang X.B.), http://kjj.changchun.gov.cn].
Abstract: In the field of agricultural information, the identification and prediction of rice leaf disease have always been a focus of research, and deep learning (DL) is currently a hot research topic in pattern recognition. Developing high-efficiency, high-quality, and low-cost automatic identification methods for rice diseases that can replace human inspection is an important technical means of dealing with the current situation. This paper focuses on the problem of the huge number of parameters in Convolutional Neural Network (CNN) models and proposes a recognition model that combines a multi-scale convolution module with a neural network model based on Visual Geometry Group (VGG). The accuracy and loss on the training set and the test set are used to evaluate the performance of the model. The test accuracy of this model is 97.1%, an increase of 5.87% over VGG. Furthermore, the memory requirement is 26.1M, only 1.6% of that of VGG. Experimental results show that this model performs better in terms of accuracy, recognition speed, and memory size.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 61905201).
Abstract: The field of finance relies heavily on cybersecurity to safeguard its systems and clients from harmful software. Identifying malicious code within financial software is vital for protecting both the financial system and individual clients. Nevertheless, present detection models are limited in their ability to identify malicious code and its variants, and they carry a large number of parameters. To overcome these obstacles, we introduce a lightweight model for classifying malicious code families, based on Ghost-DenseNet-SE. This model integrates the Ghost module, DenseNet, and the squeeze-and-excitation (SE) channel attention mechanism. It replaces the standard convolutional layer in DenseNet with the Ghost module, thereby reducing the model’s size and increasing recognition speed. Additionally, the channel attention mechanism assigns distinct weights to feature channels, facilitating the extraction of key characteristics of malicious code and improving detection precision. Experimental results on the Malimg dataset indicate that the model attained an accuracy of 99.14% in classifying malicious code families, surpassing AlexNet (97.8%) and the Visual Geometry Group network (VGGNet) (96.16%). The proposed model has fewer parameters, which lowers model complexity while improving classification accuracy, making it a valuable tool for categorizing malicious code.
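To make the channel attention step concrete, here is a minimal pure-Python sketch of a generic squeeze-and-excitation block: global average pooling per channel, two fully connected layers with ReLU and sigmoid, then channel-wise rescaling. The weights `w1` and `w2` are hand-picked rather than learned, biases are omitted, and the function name `se_block` is hypothetical; how the paper wires SE into Ghost-DenseNet is not described in the abstract and is not modelled here.

```python
import math


def se_block(feature_maps, w1, w2):
    """Apply squeeze-and-excitation to a list of C feature maps (2D lists).

    w1 has shape (C_reduced, C) and w2 has shape (C, C_reduced); biases omitted.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [1 / (1 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w2]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [[[v * wt for v in r] for r in fm]
              for fm, wt in zip(feature_maps, weights)]
    return scaled, weights


# Two hypothetical 2 x 2 channels and hand-picked (not learned) weights.
maps = [[[1, 1], [1, 1]], [[3, 3], [3, 3]]]
w1 = [[0.5, 0.5]]        # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]     # expand back to 2 channel weights
scaled, weights = se_block(maps, w1, w2)
print([round(w, 4) for w in weights])
```

With these toy weights the first channel is emphasised and the second is suppressed, which is the kind of per-channel reweighting the abstract refers to.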
Abstract: As the field of autonomous driving evolves, real-time semantic segmentation has become a crucial part of computer vision tasks. However, most existing methods use lightweight convolution to reduce the computational effort, resulting in lower accuracy. To address this problem, we construct TBANet, a network with an encoder-decoder structure for efficient feature extraction. In the encoder part, the TBA module is designed to extract details and the ETBA module is used to learn semantic representations in a high-dimensional space. In the decoder part, we design a combination of multiple upsampling methods to aggregate features with less computational overhead. We validate the efficiency of TBANet on the Cityscapes dataset. It achieves 75.1% mean Intersection over Union (mIoU) with only 2.07 million parameters and can reach 90.3 frames per second (FPS).
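For readers unfamiliar with the metric, mean Intersection over Union can be computed from a confusion matrix as in the sketch below. The 2×2 matrix is made-up data used only to exercise the function, and `mean_iou` is a hypothetical name; it is not related to the reported Cityscapes result.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: ground truth, cols: prediction)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp     # pixels wrongly labelled c
        denom = tp + fp + fn
        if denom > 0:   # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Made-up pixel counts for a two-class problem.
confusion = [[8, 2],
             [1, 9]]
print(round(mean_iou(confusion), 4))
```

Per class, IoU is the number of true positives divided by the union of predicted and ground-truth pixels; mIoU averages that ratio over the classes that occur.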
Funding: Supported by the National Natural Science Foundation of China (52001340), the Henan Province Science and Technology Key Research Project (242102110332), and the Henan Province Teaching Reform Project (2022SYJXLX087).
Abstract: To address the issues of slow diagnostic speed, low accuracy, and poor generalization performance in traditional rolling bearing fault diagnosis methods, we propose a rolling bearing fault diagnosis method based on Markov Transition Field (MTF) image encoding combined with a lightweight convolutional neural network that integrates a Convolutional Block Attention Module (CBAM-LCNN). Specifically, we first use the Markov Transition Field to convert the original one-dimensional vibration signals of rolling bearings into two-dimensional images. Then, we construct a lightweight convolutional neural network incorporating the convolutional attention module (CBAM-LCNN). Finally, the two-dimensional images obtained from MTF mapping are fed into the CBAM-LCNN network for image feature extraction and fault diagnosis. We validate the effectiveness of the proposed method on the bearing fault datasets from Guangdong University of Petrochemical Technology’s multi-stage centrifugal fan and Case Western Reserve University. Experimental results show that, compared to other advanced baseline methods, the proposed rolling bearing fault diagnosis method offers faster diagnostic speed and higher diagnostic accuracy. In addition, we conducted experiments on the Xi’an Jiaotong University rolling bearing dataset, achieving excellent results in bearing fault diagnosis. These results validate the strong generalization performance of the proposed method. The method presented in this paper not only effectively diagnoses faults in rolling bearings but also serves as a reference for fault diagnosis in other equipment.
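The signal-to-image step can be sketched with a plain Markov Transition Field encoder. This follows the usual MTF construction (quantile binning, first-order transition probabilities, then an N×N image whose entry (i, j) is the probability of moving from the bin of sample i to the bin of sample j). The number of bins, the rank-based handling of ties, and the function name `markov_transition_field` are assumptions; the paper's own settings are not given in the abstract.

```python
def markov_transition_field(signal, n_bins=4):
    """Encode a 1D signal as an N x N Markov Transition Field (list of lists)."""
    n = len(signal)
    # 1. Quantile binning: rank each sample and map its rank to one of n_bins bins.
    order = sorted(range(n), key=lambda i: signal[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n
    # 2. Count first-order transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1
    # 3. Row-normalise the counts into transition probabilities.
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    # 4. MTF[i][j] is the transition probability from the bin of x_i to the bin of x_j.
    return [[probs[bins[i]][bins[j]] for j in range(n)] for i in range(n)]


# Hypothetical alternating signal: low, high, low, high, ...
mtf = markov_transition_field([0.0, 1.0] * 3, n_bins=2)
for row in mtf:
    print(row)
```

For a real vibration signal the resulting matrix would typically be resized and treated as a single-channel image before being passed to a CNN.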
Funding: Project supported by the Directorate of Advanced Studies, Research & Technological Development, University of Engineering and Technology Taxila (No. UET/ASRTD/RG-1002-3).
Abstract: Automated analysis of sports video summarization is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective video summarization framework based on shot classification and replay detection for field sports videos. Accurate shot classification is necessary to structure the input video for further processing, i.e., key event or replay detection. Therefore, we present a lightweight convolutional neural network based method for shot classification. We then analyze each shot for replay detection and specifically detect the successive batch of logo transition frames that identify the replay segments in the sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train an extreme learning machine to classify frames as replay or non-replay. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.