Funding: This paper was supported by the National Natural Science Foundation of China (Nos. 61977063 and 61872020). The authors thank all the patients for providing their MRI images and the School of Biomedical Engineering at Southern Medical University, China, for providing the brain tumor data set. We appreciate Dr. Fenfen Li, Wenzhou Eye Hospital, Wenzhou Medical University, China, for her support with clinical consulting and language editing.
Abstract: Automatic segmentation and classification of brain tumors are of great importance to clinical treatment. However, they are challenging due to the varied and small morphology of the tumors. In this paper, we propose a multitask multiscale residual attention network (MMRAN) to simultaneously solve the problem of accurately segmenting and classifying brain tumors. The proposed MMRAN is based on U-Net, and a parallel branch is added at the end of the encoder as the classification network. First, we propose a novel multiscale residual attention module (MRAM) that aggregates contextual features and combines channel attention and spatial attention more effectively, and we add it to the shared parameter layer of MMRAN. Second, we propose a dynamic weight training method that can improve model performance while minimizing the need for multiple experiments to determine the optimal weights for each task. Finally, prior knowledge of brain tumors is added to the postprocessing of segmented images to further improve the segmentation accuracy. We evaluated MMRAN on a brain tumor data set containing meningioma, glioma, and pituitary tumors. In terms of segmentation performance, our method achieves Dice, Hausdorff distance (HD), mean intersection over union (MIoU), and mean pixel accuracy (MPA) values of 80.03%, 6.649 mm, 84.38%, and 89.41%, respectively. In terms of classification performance, our method achieves accuracy, recall, precision, and F1-score of 89.87%, 90.44%, 88.56%, and 89.49%, respectively. Compared with other networks, MMRAN performs better in segmentation and classification, which significantly aids medical professionals in brain tumor management. The code and data set are available at https://github.com/linkenfaqiu/MMRAN.
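The Dice and intersection-over-union scores reported above are both overlap ratios between a predicted mask and a reference mask. The following is a minimal sketch of how they can be computed for one binary mask pair, assuming NumPy is available; the function name `dice_and_iou` and the toy masks are illustrative and are not taken from the MMRAN repository.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treated as perfect agreement (a convention, not a rule).
        return 1.0, 1.0
    dice = 2.0 * intersection / total
    iou = intersection / union
    return float(dice), float(iou)

# Toy 2x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared pixels.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
dice, iou = dice_and_iou(pred, target)
```

A "mean" IoU such as the MIoU quoted above is usually the average of this per-class IoU over classes; whether the paper averages over classes or over images is not stated in the abstract.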
Funding: Supported by MRC, UK (MC_PC_17171); Royal Society, UK (RP202G0230); BHF, UK (AA/18/3/34220); Hope Foundation for Cancer Research, UK (RM60G0680); GCRF, UK (P202PF11); Sino-UK Industrial Fund, UK (RP202G0289); LIAS, UK (P202ED10, P202RE969); Data Science Enhancement Fund, UK (P202RE237); Fight for Sight, UK (24NN201); Sino-UK Education Fund, UK (OP202006); and BBSRC, UK (RM32G0178B8).
Abstract: Aim: This study aims to establish an artificial intelligence model, ThyroidNet, to accurately diagnose thyroid nodules using deep learning techniques. Methods: A novel deep learning method, ThyroidNet, is introduced and evaluated for the localization and classification of thyroid nodules. First, we propose the multitask TransUnet, which combines the TransUnet encoder and decoder with multitask learning. Second, we propose the DualLoss function, tailored to the thyroid nodule localization and classification tasks. It balances the learning of the two tasks to help improve the model's generalization ability. Third, we introduce strategies for augmenting the data. Finally, we present a novel deep learning model, ThyroidNet, to accurately detect thyroid nodules. Results: ThyroidNet was evaluated on private datasets and compared with other existing methods, including U-Net and TransUnet. Experimental results show that ThyroidNet outperformed these methods in localizing and classifying thyroid nodules, with accuracy improvements of 3.9% and 1.5%, respectively. Conclusion: ThyroidNet significantly improves the clinical diagnosis of thyroid nodules and supports medical image analysis tasks. Future research directions include optimization of the model structure, expansion of the dataset size, reduction of computational complexity and memory requirements, and exploration of additional applications of ThyroidNet in medical image analysis.
Funding: Supported by the National Natural Science Foundation of China under grants No. 61379028 and No. 61671483; the Natural Science Foundation of Hubei Province under grant No. 2016CFA089; and the Fundamental Research Funds for the Central Universities, South-Central University for Nationalities, under grant No. CZY19003.
Abstract: Because the signal-to-noise ratio (SNR) differs across subchannels, the bit error rate (BER) performance of hybrid precoding based on singular value decomposition (SVD) degrades. In this paper, we propose a multi-task learning based precoding network (PN) model to solve the BER loss problem caused by SVD-based hybrid precoding under imperfect channel state information (CSI). Specifically, we first generate a dataset comprising incomplete-CSI input channel matrices and the corresponding output labels to train the PN model. The output labels are designed based on uniform channel decomposition (UCD), which decomposes the channel into multiple subchannels with the same gain, while the vertical Bell Labs layered space-time (V-BLAST) signal processing technique is combined to eliminate the interference among the subchannels. Then, the PN model is trained to design the analog and digital precoding/combining matrices simultaneously. Simulation results show that the proposed scheme has only a negligible gap in spectrum efficiency compared with fully digital precoding, while achieving better BER performance than SVD-based hybrid precoding.
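The motivation in this abstract, unequal subchannel gains under SVD, can be illustrated numerically. The sketch below draws a hypothetical 4x4 Rayleigh-fading channel (the size, seed, and variable names are assumptions for illustration), lists the SVD subchannel gains, and computes their geometric mean. The geometric mean is the common per-subchannel gain produced by the geometric mean decomposition, to which UCD is closely related; how the paper constructs its UCD labels is not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x4 complex channel with unit-variance Rayleigh-fading entries.
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)

# SVD subchannel gains are the singular values, returned in descending order.
singular_values = np.linalg.svd(H, compute_uv=False)
svd_gains_db = 20 * np.log10(singular_values)

# A decomposition with equal subchannel gains (as in the geometric mean
# decomposition) assigns every subchannel the geometric mean of these values.
equal_gain = np.exp(np.mean(np.log(singular_values)))

print("SVD gains (dB):", np.round(svd_gains_db, 2))
print("Equal gain (dB):", round(float(20 * np.log10(equal_gain)), 2))
```

The weakest SVD subchannel sits below the equal gain, which is why its error rate tends to dominate the overall BER when all subchannels carry the same modulation.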
Abstract: Because data are complex, interpreting patterns and extracting information are difficult; machine learning is therefore used to teach machines how to handle data more efficiently. As datasets grow, various organizations now apply machine learning applications and algorithms. Many industries apply machine learning to extract relevant information for analysis purposes. Many scholars, mathematicians, and programmers have carried out research and applied several machine learning approaches to find solutions to problems. In this paper, we present a general review of machine learning, including various machine learning techniques. These techniques can be applied to different fields such as image processing, data mining, and predictive analysis. The paper aims to review machine learning techniques and algorithms. The research methodology is based on qualitative analysis, in which the literature on machine learning is reviewed.
Abstract: To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving, capable of performing drivable area segmentation, lane detection, and traffic object detection. First, in the encoding stage, features are extracted, and the Generalized Efficient Layer Aggregation Network (GELAN) is utilized to enhance feature extraction and gradient flow. Second, in the decoding stage, specialized detection heads are designed: the drivable area segmentation head employs DySample to expand feature maps, and the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Last, the Minimum Point Distance IoU (MPDIoU) loss function is employed to compute the matching degree between traffic object detection boxes and predicted boxes, facilitating model training adjustments. Experimental results on the BDD100K dataset demonstrate that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, respectively. The detection performance surpasses that of other single-task or multi-task algorithm models.
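The MPDIoU loss mentioned above penalizes the IoU by the squared distances between the two boxes' top-left corners and bottom-right corners, normalized by the squared image diagonal. The sketch below follows the commonly cited MPDIoU definition for axis-aligned boxes in `(x1, y1, x2, y2)` form; the function name and the example boxes are illustrative, and how GDMNet integrates the loss is not specified in the abstract.

```python
def mpdiou_loss(box_a, box_b, img_w, img_h):
    """MPDIoU loss for two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distances between matching corners.
    d1_sq = (bx1 - ax1) ** 2 + (by1 - ay1) ** 2  # top-left corners
    d2_sq = (bx2 - ax2) ** 2 + (by2 - ay2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou

# Identical boxes give zero loss; a 10-pixel horizontal shift gives a positive loss.
loss_same = mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 640, 480)
loss_shifted = mpdiou_loss((10, 10, 50, 50), (20, 10, 60, 50), 640, 480)
```

Unlike plain IoU, the corner-distance terms stay informative when the boxes do not overlap, so the loss keeps growing as disjoint boxes move further apart.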
Funding: This research was supported by the National Key Research and Development Program [2018YFF0213606-03 (Mu, Y., Hu, T.L., Gong, H., Li, S.J., and Sun, Y.H.) http://www.most.gov.cn]; the Jilin Province Science and Technology Development Plan focusing on research and development projects [20200402006NC (Mu, Y., Hu, T.L., Gong, H., and Li, S.J.) http://kjt.jl.gov.cn]; the science and technology support project for key industries in southern Xinjiang [2018DB001 (Gong, H. and Li, S.J.) http://kjj.xjbt.gov.cn]; and the key technology R&D project of Changchun Science and Technology Bureau of Jilin Province [21ZGN29 (Mu, Y., Bao, H.P., and Wang, X.B.) http://kjj.changchun.gov.cn].
Abstract: To accurately segment architectural features in high-resolution remote sensing images, a semantic segmentation method based on U-net multi-task learning is proposed. First, a boundary distance map was generated from the building ground-truth map of the remote sensing image. The remote sensing image and its ground-truth map were used as the input to the U-net network, and a building ground prediction layer was added at the end of the U-net network. Based on the ResNet network, a multi-task network with a boundary distance prediction layer was built. Experiments on the ISPRS aerial remote sensing image building and feature annotation data set show that, compared with the fully convolutional network combined with the multi-layer perceptron method, the intersection over union (IoU) of the VGG16 network, VGG16 + boundary prediction, ResNet50, and the method in this paper increased by 5.15%, 6.946%, 6.41%, and 7.86%, respectively. The accuracy of the networks increased to 94.71%, 95.39%, 95.30%, and 96.10%, respectively, which resulted in high-precision extraction of building features.
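A boundary distance map of the kind described above can be derived from a binary building mask. The sketch below is one plausible construction, not necessarily the paper's: it marks building pixels that touch the background as boundary pixels, then stores for every pixel the Euclidean distance to the nearest boundary pixel, optionally truncated. The function name, the 4-neighbour boundary rule, and the `truncate` parameter are assumptions; the brute-force distance search is only suitable for small tiles.

```python
import numpy as np

def boundary_distance_map(mask, truncate=None):
    """Distance from every pixel to the nearest building-boundary pixel."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so building pixels on the image edge count as boundary.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    all_neighbours_fg = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    # Boundary = building pixel with at least one background 4-neighbour.
    boundary = mask & ~all_neighbours_fg
    by, bx = np.nonzero(boundary)
    if by.size == 0:
        raise ValueError("mask contains no building pixels")
    yy, xx = np.indices(mask.shape)
    # Brute force: distance from each pixel to every boundary pixel, then the minimum.
    dist = np.sqrt((yy[..., None] - by) ** 2 + (xx[..., None] - bx) ** 2).min(axis=-1)
    if truncate is not None:
        dist = np.minimum(dist, truncate)
    return dist

# Toy 5x5 tile with a 3x3 building in the centre.
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
dist = boundary_distance_map(mask)
```

Regressing such a map as an auxiliary task gives the network a dense signal about where building edges lie, which is the usual reason for pairing it with a segmentation head.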
Funding: Supported by the National Key Research and Development Program of China (2016YFC1306300); the National Natural Science Foundation of China (Grant Nos. 61633018, 81622025, and 81471731); and the Beijing Municipal Commission of Health and Family Planning (PXM2019_026283_000002).
Abstract: Previous studies have shown that amnestic mild cognitive impairment (aMCI) involves morphological abnormalities in multiple regions, reflected in measures including cortical thickness, sulcus depth, surface area, gray matter volume, Jacobian metric, and average curvature. Each measure has a unique neuropathological and genetic meaning. However, most existing methods simply average or concatenate these measures when constructing classifiers, which may include redundant information and ignore the relationships among them. In this study, we treat each measure as a task in our multitask learning framework. Because the correlation between tasks is not known in advance, we use a robust multitask feature learning (rMTFL) method to select a group of features among correlated measures and, at the same time, provide additional information by identifying outlier tasks. Then, we train one SVM classifier per measure and input the selected features of each measure into the corresponding classifier. Finally, we use an ensemble classification strategy that combines the results of these classifiers based on their accuracy to make the final prediction. We use leave-one-out cross-validation to evaluate the proposed method on 46 aMCI patients and 52 normal controls (NC). The results show that the rMTFL algorithm is superior to the group lasso method and that, among the multidimensional surface measures, average curvature is the outlier task.
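The final ensemble step, combining per-measure classifiers "based on the accuracy", can be sketched as an accuracy-weighted vote. The abstract does not give the exact weighting rule, so the version below is an assumption: each classifier votes +1 (aMCI) or -1 (NC), and its vote is weighted by its validation accuracy. The function name and the numbers are illustrative.

```python
def weighted_vote(predictions, accuracies):
    """Combine +1/-1 predictions using each classifier's accuracy as its weight."""
    score = sum(weight * label for label, weight in zip(predictions, accuracies))
    # Ties are resolved in favour of the positive class (an arbitrary choice).
    return 1 if score >= 0 else -1

# Three per-measure classifiers: one accurate classifier can outvote two weak ones.
final_label = weighted_vote([1, -1, -1], [0.9, 0.4, 0.4])
```

With equal weights the same call reduces to a plain majority vote, so the weighting only matters when the classifiers disagree and differ in reliability.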
Funding: Supported by the National Natural Science Foundation of China (No. 11471292).
Abstract: This paper deals with Hermite learning, which aims to obtain the target function from samples of both function values and gradient values. Error analysis is conducted for these algorithms by means of approaches from convex analysis in the framework of multi-task vector learning, and improved learning rates are derived.
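For orientation, a regularized least-squares scheme commonly used in the Hermite learning literature is written out below. It is given as an assumption about the general setting, not as the exact algorithm analyzed in this paper. Here $\{(x_i, y_i, \tilde{y}_i)\}_{i=1}^{m}$ are the samples, with $y_i$ a function value and $\tilde{y}_i$ a gradient value at $x_i$, $\mathcal{H}_K$ is a reproducing kernel Hilbert space with kernel $K$, and $\lambda > 0$ is a regularization parameter.

```latex
f_{\mathbf{z},\lambda}
  = \arg\min_{f \in \mathcal{H}_K}
    \left\{
      \frac{1}{m} \sum_{i=1}^{m}
      \left[ \bigl( y_i - f(x_i) \bigr)^2
           + \bigl\| \tilde{y}_i - \nabla f(x_i) \bigr\|^2 \right]
      + \lambda \, \| f \|_K^2
    \right\}
```

Fitting the function value and each partial derivative can be read as separate but coupled regression tasks, which is how a multi-task vector learning framework enters the analysis.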
Funding: Supported by the National Natural Science Foundation of China (No. 61976247).
Abstract: Aspect-based sentiment analysis (ABSA) consists of two subtasks: aspect term extraction and aspect sentiment prediction. Existing methods handle the two subtasks one after the other in a pipeline, which causes problems in performance and in real-world application. This study investigates end-to-end ABSA and proposes a novel multitask multiview network (MTMVN) architecture. Specifically, the architecture takes the unified ABSA as the main task and the two subtasks as auxiliary tasks. The representation obtained from the branch network of the main task is regarded as the global view, whereas the representations of the two subtasks are considered two local views with different emphases. Through multitask learning, the main task is facilitated by additional accurate aspect boundary information and sentiment polarity information. By enhancing the correlations between the views under the idea of multiview learning, the representation of the global view can be optimized to improve the overall performance of the model. Experimental results on three benchmark datasets show that the proposed method outperforms existing pipeline and end-to-end methods, demonstrating the superiority of the MTMVN architecture.
Funding: Partially supported by the National Key Research and Development Program of China (No. 2018YFC0832101); the National Natural Science Foundation of China (Nos. 71802068, 61922073, and U20A20229); the financial support of Tianjin University (No. 2020XSC-0019); and the support of the USTC-CMB Joint Laboratory of Artificial Intelligence.
Abstract: Intelligent Financial Advisors (IFAs) in online financial applications (apps) have brought new life to personal investment by providing appropriate and high-quality portfolios for users. In real-world scenarios, identifying potential clients, i.e., users who are willing to purchase the portfolios, is a crucial issue for IFAs. Thus, extracting useful information from various user characteristics and predicting users' purchase inclination are urgent. However, two critical problems encountered in practice make this prediction task challenging: sample selection bias and data sparsity. In this study, we formalize a potential conversion relationship, i.e., user → activated user → client, and decompose this relationship into three related tasks. Then, we propose a Multitask Feature Extraction Model (MFEM), which leverages useful information contained in these related tasks and learns them jointly, thereby solving the two problems simultaneously. In addition, we design a two-stage feature selection algorithm to select highly relevant user features efficiently and accurately from a very large number of user feature fields. Finally, we conduct extensive experiments on a real-world dataset provided by a well-known fintech bank. Experimental results clearly demonstrate the effectiveness of MFEM.
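One natural reading of the conversion chain user → activated user → client is a chain-rule factorization of the purchase probability. The identity below is a sketch under that assumption; the abstract does not state which three tasks MFEM actually defines. Here $x$ denotes a user's features, $a = 1$ means the user becomes activated, and $c = 1$ means the user becomes a client, with the chain implying that only activated users can become clients.

```latex
\underbrace{P(c = 1 \mid x)}_{\text{user} \to \text{client}}
  = \underbrace{P(a = 1 \mid x)}_{\text{user} \to \text{activated user}}
    \times
    \underbrace{P(c = 1 \mid a = 1, x)}_{\text{activated user} \to \text{client}}
```

Under this reading, the two quantities defined over all users can be trained on the full user population rather than only on activated users, which is the usual way such factorizations reduce sample selection bias and data sparsity.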
Funding: Financial support was provided by the National Natural Science Foundation of China (Grant Nos. 12174115, 91836103, and 11834003).
Abstract: Orbital angular momentums (OAMs) greatly enhance the channel capacity in free-space optical communication. However, demodulating superposed OAMs to recognize them separately is always difficult, especially as more OAMs are multiplexed. In this work, we report the direct recognition of multiplexed fractional OAM modes, without separating them, at a resolution of 0.1 and with high accuracy, using a multi-task deep learning (MTDL) model, which has not been reported before. Specifically, two-mode, four-mode, and eight-mode superposed OAM beams, experimentally generated with a hologram carrying both phase and amplitude information, are well recognized by a suitable MTDL model. Two applications in information transmission are presented: the first is 256-ary OAM shift keying via multiplexed fractional OAMs; the second is OAM division multiplexed information transmission at eightfold speed. The encouraging results will expand the capacity of future free-space optical communication.
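The figure 256 is consistent with eight multiplexed modes each being either present or absent, since 2^8 = 256. The sketch below illustrates that reading as a byte-to-mode-set mapping; it is an assumption about the keying scheme, and the eight fractional mode values in `MODES` are hypothetical placeholders on a 0.1 grid, not the values used in the experiment.

```python
# Hypothetical fractional OAM mode values on a 0.1 grid (placeholders only).
MODES = [1.1, 1.5, 2.3, 2.7, 3.2, 3.6, 4.4, 4.8]

def encode_byte(value):
    """Map a byte (0..255) to the list of OAM modes switched on for that symbol."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    return [mode for bit, mode in enumerate(MODES) if (value >> bit) & 1]

def decode_modes(active_modes):
    """Recover the byte from the set of recognized modes."""
    active = set(active_modes)
    return sum(1 << bit for bit, mode in enumerate(MODES) if mode in active)

symbol = encode_byte(0b00000101)   # bits 0 and 2 set -> first and third modes on
recovered = decode_modes(symbol)
```

In this picture the recognition model plays the role of `decode_modes`: it must report which of the eight modes are present in one superposed beam, which is a multi-label decision and fits a multi-task formulation.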
Funding: Supported by the National Natural Science Foundation of China (Nos. 61876212 and 1733007); Zhejiang Laboratory, China (No. 2019NB0AB02); and the Hubei Province College Students Innovation and Entrepreneurship Training Program, China (No. S202010487058).
Abstract: A panoptic driving perception system is an essential part of autonomous driving. A high-precision, real-time perception system can assist the vehicle in making reasonable decisions while driving. We present a panoptic driving perception network, YOLOP (you only look once for panoptic), to perform traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders to handle the specific tasks. Our model performs extremely well on the challenging BDD100K dataset, achieving state-of-the-art results on all three tasks in terms of accuracy and speed. In addition, we verify the effectiveness of our multi-task learning model for joint training via ablation studies. To the best of our knowledge, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device, the Jetson TX2 (23 FPS), while maintaining excellent accuracy. To facilitate further research, the source code and pre-trained models are released at https://github.com/hustvl/YOLOP.