Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of the existing works on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval approaches are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Purpose: To detect small diagnostic signals such as lung nodules in chest radiographs, radiologists magnify a region of interest using linear interpolation methods. However, such methods tend to generate over-smoothed images with artifacts that can make interpretation difficult. The purpose of this study was to investigate the effectiveness of super-resolution methods for improving the image quality of magnified chest radiographs. Materials and Methods: A total of 247 chest X-rays were sampled from the JSRT database, then divided into 93 training cases without nodules and 154 test cases with lung nodules. We first trained two types of super-resolution methods, sparse-coding super-resolution (ScSR) and super-resolution convolutional neural network (SRCNN). With the trained super-resolution methods, a high-resolution image was then reconstructed from a low-resolution image that had been down-sampled from the original test image. We compared the image quality of the super-resolution methods and the linear interpolations (nearest-neighbor and bilinear interpolation). For quantitative evaluation, we measured two image quality metrics: peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). For comparative evaluation of the super-resolution methods, we measured the computation time per image. Results: The PSNRs and SSIMs for the ScSR and SRCNN schemes were significantly higher than those of the linear interpolation methods. Conclusion: Super-resolution methods provide significantly better image quality than linear interpolation methods for magnified chest radiograph images. Of the two tested schemes, the SRCNN scheme processed the images fastest; thus, SRCNN could be clinically superior for processing radiographs in terms of both image quality and processing speed.
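As a hedged illustration of the quantitative evaluation described above, the sketch below (Python with scikit-image; the function and scale factor are our stand-ins, not the study's code) down-samples an image, re-magnifies it with bilinear interpolation, and scores the result with PSNR and SSIM:

```python
# Hypothetical evaluation sketch: compare a magnified image against the
# original using the two metrics reported in the study (PSNR and SSIM).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from skimage.transform import resize

def evaluate_magnification(original, scale=2):
    """Down-sample, re-magnify with bilinear interpolation, and score."""
    h, w = original.shape
    low_res = resize(original, (h // scale, w // scale), anti_aliasing=True)
    magnified = resize(low_res, (h, w), order=1)  # order=1 -> bilinear
    psnr = peak_signal_noise_ratio(original, magnified, data_range=1.0)
    ssim = structural_similarity(original, magnified, data_range=1.0)
    return psnr, ssim

# Usage with a random stand-in for a normalized chest radiograph:
image = np.random.rand(256, 256)
print(evaluate_magnification(image))
```

The same loop, run once per interpolation or super-resolution scheme, yields the per-method PSNR/SSIM comparison the abstract reports.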
The Advanced Geosynchronous Radiation Imager (AGRI) is a mission-critical instrument for the Fengyun series of satellites. AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands, enabling the detection of highly variable aerosol optical depth (AOD). Quantitative retrieval of AOD has hitherto been challenging, especially over land. In this study, an AOD retrieval algorithm is proposed that combines deep learning and transfer learning. The algorithm uses core concepts from both the Dark Target (DT) and Deep Blue (DB) algorithms to select features for the machine learning (ML) algorithm, allowing for AOD retrieval at 550 nm over both dark and bright surfaces. The algorithm consists of two steps: ① a baseline deep neural network (DNN) with skip connections is developed using 10 min Advanced Himawari Imager (AHI) AODs as the target variable, and ② sun-photometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters. Out-of-station validation shows that the retrieved AOD attains high accuracy, characterized by a coefficient of determination (R²) of 0.70, a mean bias error (MBE) of 0.03, and a percentage of data within the expected error (EE) of 70.7%. A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm, together with the surface reflectance at 650 nm, constitute the two largest sources of uncertainty affecting the retrieval. In a case study of monitoring an extreme aerosol event, the AGRI AOD is found to capture the detailed temporal evolution of the event. This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD for monitoring extreme pollution events.
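A minimal sketch of the two-step scheme, under the assumption of a small fully connected regressor (layer sizes and feature count are placeholders, not the paper's architecture): a skip-connected DNN is pre-trained against AHI AODs, then all weights are fine-tuned on station AODs at a much lower learning rate:

```python
# Not the authors' code: a toy skip-connected regressor plus the
# pre-train / fine-tune optimizer split used in transfer learning.
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, x):
        return torch.relu(x + self.fc(x))  # residual (skip) connection

class AODNet(nn.Module):
    def __init__(self, n_features=16, hidden=128):  # n_features is a guess
        super().__init__()
        self.inp = nn.Linear(n_features, hidden)
        self.blocks = nn.Sequential(*[SkipBlock(hidden) for _ in range(4)])
        self.out = nn.Linear(hidden, 1)  # AOD at 550 nm
    def forward(self, x):
        return self.out(self.blocks(torch.relu(self.inp(x))))

model = AODNet()
# Step 1: pre-train on satellite (AHI) AODs with a standard regression loss.
pretrain_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# Step 2: fine-tune on sun-photometer AODs with a much smaller learning
# rate so the pre-trained solution is largely preserved.
finetune_opt = torch.optim.Adam(model.parameters(), lr=1e-5)
```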
Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. A cultural heritage image is a fine-grained image because, in most cases, the images share a high degree of similarity. Using classification techniques alone, distinguishing cultural heritage architecture may be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research is a retrieval model that can handle incremental streams of new categories while maintaining its past performance on old categories and not losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task over classes. Incremental learning for new classes was conducted to reduce the re-training process; in this step, the original classes are not necessary for re-training, which we call an adaptive deep learning technique. Cultural heritage, in the case of Thai archaeological site architecture, was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images using images of Thai archaeological site architecture from world heritage provinces in Thailand, which share a similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can address the high degree of similarity among categories and the high degree of dissimilarity within a specific category. The proposed method for retrieving the correct image from a database delivers an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.
In recent days, image retrieval has become a tedious process as image databases have grown much larger. The introduction of Machine Learning (ML) and Deep Learning (DL) has made this process more manageable. In these approaches, pair-wise label similarity is used to find matching images in the database, but this method suffers from the limited discriminative power of the produced codes and weak handling of misclassified images. To overcome these problems, a novel triplet-based label that incorporates a context-spatial similarity measure is proposed. A Point Attention Based Triplet Network (PABTN) is introduced to learn codes with maximum discriminative ability. To improve ranking performance, correlating resolutions for classification, triplet labels based on the findings, a spatial-attention mechanism with Region Of Interest (ROI), and a new triplet cross-entropy loss that limits trial-information loss are used. The experimental results show that the proposed technique achieves better mean Reciprocal Rank (mRR) and mean Average Precision (mAP) on the CIFAR-10 and NUS-WIDE datasets.
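The triplet idea at the core of such a network can be sketched as follows (PyTorch; the toy encoder and margin are our assumptions, not the PABTN architecture): embeddings of an anchor and a same-label positive are pulled together while a different-label negative is pushed away:

```python
# Hedged sketch of triplet training: the encoder and margin are
# illustrative, not the paper's exact model or loss.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # toy CIFAR-10-sized encoder
triplet_loss = nn.TripletMarginLoss(margin=1.0)

anchor   = torch.randn(8, 3, 32, 32)   # query images
positive = torch.randn(8, 3, 32, 32)   # same-label images
negative = torch.randn(8, 3, 32, 32)   # different-label images

loss = triplet_loss(embed(anchor), embed(positive), embed(negative))
loss.backward()  # gradients shape the embedding space for retrieval
```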
The use of massive image databases has increased drastically over the past few years due to the evolution of multimedia technology. Image retrieval has become one of the vital tools in image processing applications. Content-Based Image Retrieval (CBIR) has been widely used in varied applications, but the results produced by a single image feature are not satisfactory, so multiple image features are often used to attain better results. However, fast and effective searching for relevant images in a database remains a challenging task. A previous system combined feature extraction using the color auto-correlogram, Rotation-Invariant Uniform Local Binary Patterns (RULBP), and local energy; however, it does not provide significant results in terms of recall and precision, and its computational complexity is high. To handle these issues, the Gray Level Co-occurrence Matrix (GLCM) with a Deep Learning based Enhanced Convolutional Neural Network (DLECNN) is proposed in this work. The proposed framework includes noise reduction using histogram equalization, feature extraction using the GLCM, similarity matching using the Hierarchical and Fuzzy c-Means (HFCM) algorithm, and image retrieval using the DLECNN algorithm. Histogram equalization is used for image enhancement, yielding an enhanced image with a uniform histogram. The GLCM method is then used to extract features such as shape, texture, color, annotations, and keywords. The HFCM similarity measure computes the query image vector's similarity index against every database image. To enhance the performance of this image retrieval approach, the DLECNN algorithm is proposed to retrieve more accurate image features. The proposed GLCM+DLECNN algorithm provides better results, with higher accuracy, precision, recall, and F-measure at lower complexity. From the experimental results, it is clearly observed that the proposed system provides efficient image retrieval for a given query image.
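The GLCM feature-extraction step might look like the following scikit-image sketch (the distances, angles, and four texture properties are illustrative choices; the full pipeline also needs the HFCM matching and DLECNN retrieval stages):

```python
# Illustrative GLCM texture features, assuming 8-bit grayscale input.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_uint8):
    # Co-occurrence matrix at distance 1 for four orientations.
    glcm = graycomatrix(gray_uint8, distances=[1],
                        angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

image = (np.random.rand(64, 64) * 255).astype(np.uint8)
print(glcm_features(image).shape)  # 4 properties x 4 angles = 16 features
```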
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded color images. Thus, there have been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch but did not produce high evaluation metrics, so we rebuilt it from the VGG19 pre-trained deep learning model to generate higher evaluation metrics. Afterward, three different distance metrics combined with the sigmoid activation function were tested to find the most accurate method for measuring the similarities among the retrieved images; the highest evaluation scores were obtained using the cosine distance metric. Moreover, the Graphics Processing Unit (GPU) was used to run the code instead of the Central Processing Unit (CPU), which further optimized execution by expediting both training and retrieval. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 for classification and 0.99 for retrieval.
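A minimal sketch of the retrieval branch, assuming an ImageNet-pretrained VGG19 backbone and global average pooling (our simplification; the trained Siamese head is omitted), scored with the cosine similarity that performed best in this study:

```python
# Hedged sketch: VGG19 convolutional features for two images, compared
# with cosine similarity. Training and the sigmoid head are omitted.
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()

def embed(x):
    with torch.no_grad():
        f = backbone(x)                            # (N, 512, H', W') feature maps
    return F.adaptive_avg_pool2d(f, 1).flatten(1)  # (N, 512) vectors

a, b = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
similarity = F.cosine_similarity(embed(a), embed(b))  # in [-1, 1]
print(similarity.item())
```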
Apricot has a long history of cultivation and has many varieties and types. Traditional variety identification methods are time-consuming and labor-intensive, posing grand challenges to apricot resource management. Tool development in this regard will help researchers quickly identify variety information. This study photographed apricot fruits outdoors and indoors and constructed a dataset that can precisely classify the fruits using a U-net model (F-score: 99%), which helps to obtain the fruit's size, shape, and color features. Meanwhile, a variety search engine was constructed that can search for and identify varieties in the database according to the above features. In addition, a mobile and web application (ApricotView) was developed, and this construction mode can also be applied to other varieties of fruit trees. Additionally, we collected four difficult-to-identify seed datasets and used the VGG16 model for training, reaching an accuracy of 97%, which provides an important basis for ApricotView. To address the data-collection difficulties bottlenecking apricot phenomics research, we developed the first apricot database platform of its kind (ApricotDIAP, http://apricotdiap.com/) to accumulate, manage, and publicize scientific data on apricot.
Given one specific image, it would be quite significant if one could simply retrieve all pictures that fall into a similar category of images. However, traditional methods tend to achieve high-quality retrieval by utilizing adequate learning instances, ignoring the extraction of the image's essential information, which leads to difficulty in retrieving similar-category images using only one reference image. Aiming to solve this problem, we propose a refined sparse-representation-based similar-category image retrieval model. On the one hand, saliency detection and multi-level decomposition help take salient and spatial information into consideration more fully. On the other hand, the cross mutual sparse coding model aims to extract the image's essential features to the maximum extent possible. Finally, we set up a database containing a large number of multi-source images. Extensive comparative experiments show that our method retrieves similar-category images effectively. Moreover, ablation experiments show that nearly all procedures play their respective roles.
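A hedged illustration of the sparse-coding ingredient (scikit-learn; this is a generic dictionary-learning step, not the paper's cross mutual model): descriptors are encoded as sparse combinations of learned dictionary atoms:

```python
# Generic sparse coding over a learned dictionary; data are stand-ins.
import numpy as np
from sklearn.decomposition import DictionaryLearning

features = np.random.rand(200, 64)          # stand-in image descriptors
dico = DictionaryLearning(n_components=32, transform_algorithm="lasso_lars",
                          transform_alpha=0.1, max_iter=20, random_state=0)
codes = dico.fit_transform(features)        # sparse codes, shape (200, 32)
print(np.mean(codes != 0))                  # fraction of non-zero entries
```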
The exponential increase in data over the past few years, particularly in images, has led to more complex content as visual representation has become the new norm. E-commerce and similar platforms maintain large image catalogues of their products. Searching and retrieving similar images in image databases remains a challenge, even though several image retrieval techniques have been proposed over the decade. Most of these techniques work well when querying general image databases, but they often fail on domain-specific image databases, especially datasets with low intraclass variance. This paper proposes a domain-specific image similarity search engine based on a fused deep learning network. The network comprises an improved object localization module, a classification module to narrow down search options, and a feature extraction and similarity calculation module. The network features both an offline stage for indexing the dataset and an online stage for querying. The dataset used to evaluate the performance of the proposed network is a custom domain-specific dataset of cosmetics packaging gathered from various online platforms. The proposed method addresses the intraclass variance problem with more precise object localization and the introduction of top-result reranking based on object contours. Finally, quantitative and qualitative experimental results are presented, showing improved image similarity search performance.
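The offline/online split described above can be sketched with plain NumPy (embeddings are random stand-ins for the network's feature vectors): the catalogue is embedded and normalized once, and each query is answered by cosine nearest neighbours:

```python
# Sketch of an offline index plus online cosine-similarity querying.
import numpy as np

# Offline stage: embed the catalogue and L2-normalize for cosine search.
catalogue = np.random.rand(10_000, 256).astype(np.float32)
index = catalogue / np.linalg.norm(catalogue, axis=1, keepdims=True)

def query(embedding, top_k=5):
    """Online stage: rank the catalogue by cosine similarity."""
    q = embedding / np.linalg.norm(embedding)
    scores = index @ q
    best = np.argsort(-scores)[:top_k]
    return best, scores[best]

print(query(np.random.rand(256).astype(np.float32)))
```

The contour-based reranking step the abstract mentions would then reorder `best` before results are returned.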
The recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, seem to be a promising future for smart devices. NLP plays an important role in industrial applications such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image, providing one of the most significant and detailed descriptions that is both syntactically and semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages: encoding and decoding. At the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Search Architecture Network (NASNet) model, which represents the input data appropriately by encoding it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The application of the HGS and COA algorithms helps accomplish proper parameter tuning for the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. A widespread comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models.
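The encoder-decoder shape of such a captioning system can be sketched as below (PyTorch; the tiny convolutional encoder, vocabulary size, and two-layer LSTM are placeholders for the paper's HGS-tuned NASNet and COA-tuned deeper LSTM):

```python
# Schematic encoder-decoder captioner: image -> fixed-length vector -> tokens.
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=5000, feat_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(16, feat_dim))
        self.embed = nn.Embedding(vocab_size, feat_dim)
        self.decoder = nn.LSTM(feat_dim, feat_dim, num_layers=2, batch_first=True)
        self.head = nn.Linear(feat_dim, vocab_size)

    def forward(self, image, tokens):
        ctx = self.encoder(image).unsqueeze(1)           # fixed-length vector
        x = torch.cat([ctx, self.embed(tokens)], dim=1)  # prepend image context
        out, _ = self.decoder(x)
        return self.head(out)                            # next-token logits

model = CaptionModel()
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 5000, (2, 10)))
print(logits.shape)  # (2, 11, 5000)
```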
(Aim) COVID-19 is an ongoing infectious disease. It had caused more than 107.45 million confirmed cases and 2.35 million deaths as of 11 February 2021. Traditional computer vision methods have achieved promising results on automatic smart diagnosis. (Method) This study aims to propose a novel deep learning method that can obtain better performance. We use the pseudo-Zernike moment (PZM), derived from the Zernike moment, as the extracted feature. Two settings are introduced: (i) image plane over the unit circle; and (ii) image plane inside the unit circle. Afterward, we use a deep-stacked sparse autoencoder (DSSAE) as the classifier. Besides, multiple-way data augmentation is chosen to overcome overfitting; it is based on Gaussian noise, salt-and-pepper noise, speckle noise, horizontal and vertical shear, rotation, Gamma correction, and random translation and scaling. (Results) 10 runs of 10-fold cross-validation show that our PZM-DSSAE method achieves a sensitivity of 92.06%±1.54%, a specificity of 92.56%±1.06%, a precision of 92.53%±1.03%, and an accuracy of 92.31%±1.08%. Its F1 score, MCC, and FMI reach 92.29%±1.10%, 84.64%±2.15%, and 92.29%±1.10%, respectively. The AUC of our model is 0.9576. (Conclusion) We demonstrate that "image plane over unit circle" obtains better results than "image plane inside the unit circle." Besides, the proposed PZM-DSSAE model is better than eight state-of-the-art approaches.
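The multiple-way augmentation listed above can be approximated with scikit-image primitives, as in this hedged sketch (noise levels, angle ranges, and shear/translation/scale amounts are our guesses, not the paper's settings):

```python
# Hedged multi-way augmentation sketch built from skimage/numpy primitives.
import numpy as np
from skimage.util import random_noise
from skimage.transform import rotate, AffineTransform, warp
from skimage import exposure

def augment(img, rng):
    return [random_noise(img, mode="gaussian"),
            random_noise(img, mode="s&p"),
            random_noise(img, mode="speckle"),
            rotate(img, angle=rng.uniform(-15, 15)),
            exposure.adjust_gamma(img, gamma=rng.uniform(0.7, 1.4)),
            warp(img, AffineTransform(shear=0.2).inverse),            # shear
            warp(img, AffineTransform(translation=(5, -5)).inverse),  # shift
            warp(img, AffineTransform(scale=(1.1, 1.1)).inverse)]     # scale

rng = np.random.default_rng(0)
augmented = augment(np.random.rand(128, 128), rng)
print(len(augmented))  # 8 augmented variants per input image
```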
In this paper, a discriminative structured dictionary learning algorithm is presented. To enhance the dictionary's discriminative power, the reconstruction error, classification error, and inhomogeneous representation error are integrated into the objective function. The proposed approach learns a single structured dictionary and a linear classifier jointly. The learned dictionary encourages samples from the same class to have similar sparse codes, and samples from different classes to have dissimilar sparse codes. The solution to the objective function is obtained by employing a feature-sign search algorithm and the Lagrange dual method. Experimental results on three public databases demonstrate that the proposed approach outperforms several recently proposed dictionary learning techniques for classification.
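A hedged way to write such an objective (our notation, not necessarily the paper's exact formulation), with training data $X$, dictionary $D$, sparse codes $A$, label matrix $H$, linear classifier $W$, and a mask $Q$ selecting coefficients on atoms assigned to other classes:

```latex
\min_{D,W,A}\ \underbrace{\|X - DA\|_F^2}_{\text{reconstruction error}}
\;+\; \alpha\,\underbrace{\|H - WA\|_F^2}_{\text{classification error}}
\;+\; \beta\,\underbrace{\|Q \odot A\|_F^2}_{\text{inhomogeneous representation error}}
\;+\; \lambda\,\|A\|_1
```

The $\ell_1$ term enforces sparsity; minimizing over $A$ with $D$ and $W$ fixed is the subproblem a feature-sign search algorithm solves, while the Lagrange dual handles the dictionary update.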
In this paper, a two-level Bregman method with graph regularized sparse coding is presented for highly undersampled magnetic resonance image reconstruction. The graph regularized sparse coding is incorporated into a two-level Bregman iterative procedure that enforces the sampled-data constraints in the outer level and updates the dictionary and sparse representation in the inner level. Graph regularized sparse coding and simple dictionary updating applied in the inner minimization make the proposed algorithm converge within a relatively small number of iterations. Experimental results demonstrate that the proposed algorithm can consistently reconstruct both simulated MR images and real MR data efficiently, and outperforms the current state-of-the-art approaches in terms of visual comparisons and quantitative measures.
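The outer/inner structure can be sketched generically as residual-feedback (Bregman) iterations; this is a schematic only, with the graph-regularized sparse coding and dictionary update abstracted into a caller-supplied inner solver:

```python
# Schematic two-level Bregman loop for compressed sensing; the paper's
# inner step (graph-regularized sparse coding + dictionary update) is
# abstracted into `inner_solve`.
import numpy as np

def two_level_bregman(y, A, At, inner_solve, n_outer=10):
    """y: undersampled data, A/At: sampling operator and its adjoint."""
    x = At(y)                      # initial reconstruction
    b = np.zeros_like(y)           # Bregman (residual feedback) variable
    for _ in range(n_outer):
        x = inner_solve(x, At(y + b))   # inner level: sparse-code fit
        b = b + (y - A(x))              # outer level: add back the residual
    return x

# Toy usage with an identity "sampling" operator and a smoothing inner step:
A = At = lambda v: v
smooth = lambda x, target: 0.5 * (x + target)
print(two_level_bregman(np.ones(4), A, At, smooth))
```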
Over recent years, Convolutional Neural Networks (CNN) have improved performance on practically every image-based task, including Content-Based Image Retrieval (CBIR). Nevertheless, since CNN features are sensitive to orientation, training a CBIR system to detect and correct the rotation angle is complex. While it is possible to construct rotation-invariant features by hand, retrieval accuracy will be low because hand engineering only creates low-level features, while deep learning methods build high-level and low-level features simultaneously. This paper presents a novel approach that combines a deep learning orientation angle detection (OAD) model with the CBIR feature extraction model to correct the rotation angle of any image. This offers a unique construction of a rotation-invariant CBIR system that handles CNN features that are not rotation invariant. This research also proposes a further study of how a rotation-invariant deep CBIR can recover images from the dataset in real time. The final results of this system show significant improvement compared with a default CNN feature extraction model without the OAD.
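A sketch of the detect-then-correct idea (the one-layer angle regressor below is an untrained stand-in for the OAD model, and the ResNet-18 backbone is our choice of feature extractor, not necessarily the paper's):

```python
# Hedged sketch: predict the rotation angle, undo it, then extract features.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF
from torchvision import models

angle_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1))  # stand-in OAD
backbone = models.resnet18(weights=None).eval()
extract = nn.Sequential(*list(backbone.children())[:-1])  # drop the classifier

def rotation_invariant_features(img):
    angle = angle_net(img).item()           # predicted rotation in degrees
    corrected = TF.rotate(img, -angle)      # undo the predicted rotation
    with torch.no_grad():
        return extract(corrected).flatten(1)

print(rotation_invariant_features(torch.randn(1, 3, 224, 224)).shape)  # (1, 512)
```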
The encoding aperture snapshot spectral imaging system, based on compressive sensing theory, can be regarded as an encoder that efficiently obtains compressed two-dimensional spectral data, which is then decoded into three-dimensional spectral data by deep neural networks. However, training the deep neural networks requires a large amount of clean data that is difficult to obtain. To address the problem of insufficient training data for deep neural networks, a self-supervised hyperspectral denoising neural network based on neighborhood sampling is proposed. This network is integrated into a deep plug-and-play framework to achieve self-supervised spectral reconstruction. The study also examines the impact of different noise degradation models on the final reconstruction quality. Experimental results demonstrate that the self-supervised learning method enhances the average peak signal-to-noise ratio by 1.18 dB and improves the structural similarity by 0.009 compared with the supervised learning method. Additionally, it achieves better visual reconstruction results.
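Neighborhood sampling for self-supervised denoising can be sketched as follows (in the spirit of neighbor-subsampling schemes such as Neighbor2Neighbor; the two-layer denoiser is a placeholder): two sub-images drawn from adjacent pixels of the same noisy frame serve as the network's input and target, so no clean data is needed:

```python
# Hedged neighbor-subsampling sketch: train a denoiser on two sub-images
# taken from adjacent pixels of a single noisy frame.
import torch
import torch.nn as nn

def neighbor_subsample(noisy):
    """Split each 2x2 cell of a (N, C, H, W) image into two neighbours."""
    g1 = noisy[..., 0::2, 0::2]   # top-left pixel of every 2x2 block
    g2 = noisy[..., 0::2, 1::2]   # its horizontal neighbour
    return g1, g2

denoiser = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 1, 3, padding=1))

noisy = torch.rand(4, 1, 64, 64)
inp, target = neighbor_subsample(noisy)
loss = nn.functional.mse_loss(denoiser(inp), target)  # no clean data needed
loss.backward()
```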
Deep learning has transformed computational imaging, but traditional pixel-based representations limit their ability to capture continuous multiscale object features. Addressing this gap, we introduce a local conditional neural field (LCNF) framework, which leverages a continuous neural representation to provide flexible object representations. LCNF's unique capabilities are demonstrated in solving the highly ill-posed phase retrieval problem of multiplexed Fourier ptychographic microscopy. Our network, termed neural phase retrieval (NeuPh), enables continuous-domain resolution-enhanced phase reconstruction, offering scalability, robustness, accuracy, and generalizability that outperform existing methods. NeuPh integrates a local conditional neural representation and a coordinate-based training strategy. We show that NeuPh can accurately reconstruct high-resolution phase images from low-resolution intensity measurements. Furthermore, NeuPh consistently applies continuous object priors and effectively eliminates various phase artifacts, demonstrating robustness even when trained on imperfect datasets. Moreover, NeuPh improves accuracy and generalization compared with existing deep learning models. We further investigate a hybrid training strategy combining both experimental and simulated datasets, elucidating the impact of domain shift between experiment and simulation. Our work underscores the potential of the LCNF framework in solving complex large-scale inverse problems, opening up new possibilities for deep-learning-based imaging techniques.
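The coordinate-based core of such a framework can be sketched as a small MLP that maps a continuous coordinate plus a local condition vector to a phase value (all sizes below are our placeholders; the encoder producing the condition vectors is omitted):

```python
# Minimal coordinate-based neural-field sketch echoing the LCNF idea of
# continuous object representation; not the NeuPh architecture.
import torch
import torch.nn as nn

class LocalConditionalField(nn.Module):
    def __init__(self, cond_dim=32, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 + cond_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))  # phase at (x, y)

    def forward(self, coords, condition):
        return self.mlp(torch.cat([coords, condition], dim=-1))

field = LocalConditionalField()
coords = torch.rand(1024, 2) * 2 - 1        # query points in [-1, 1]^2
condition = torch.randn(1024, 32)           # local features from an encoder
phase = field(coords, condition)            # continuous-domain prediction
print(phase.shape)  # (1024, 1)
```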