Funding: Supported by National Natural Science Foundation of China (60496322), Natural Science Foundation of Beijing (4083034), and Scientific Research Common Program of Beijing Municipal Commission of Education (KM200610005020).
Funding: The research project is partially supported by the National Natural Science Foundation of China under Grant Nos. 62072015, U19B2039, and U1811463, and by the National Key R&D Program of China (No. 2018YFB1600903).
Abstract: PM2.5 concentration prediction is of great significance to environmental protection and human health. Achieving accurate prediction of PM2.5 concentration has become an important research task. However, PM2.5 pollutants can spread in the earth's atmosphere, causing mutual influence between different cities. To effectively capture the air pollution relationship between cities, this paper proposes a novel spatiotemporal model combining a graph attention neural network (GAT) and a gated recurrent unit (GRU), named GAT-GRU, for PM2.5 concentration prediction. Specifically, GAT is used to learn the spatial dependence of PM2.5 concentration data in different cities, and GRU is used to extract the temporal dependence of the long-term data series. The proposed model integrates the learned spatio-temporal dependencies to capture long-term complex spatio-temporal features. Considering that air pollution is related to the meteorological conditions of a city, knowledge acquired from meteorological data is used in the model to enhance PM2.5 prediction performance. The input of the GAT-GRU model consists of PM2.5 concentration data and meteorological data. To verify the effectiveness of the proposed GAT-GRU prediction model, this paper designs experiments on real-world datasets and compares the model with other baselines. Experimental results show that our model achieves excellent performance in PM2.5 concentration prediction.
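To make the GAT-plus-GRU combination concrete, the following minimal PyTorch sketch applies a single-head graph attention layer to the city graph at each time step and feeds the resulting per-city sequences to a GRU. The layer sizes, the single attention head, and the final linear head are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a GAT + GRU spatio-temporal predictor (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over a fixed city adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        # x: (num_cities, in_dim); adj: (num_cities, num_cities), should include self-loops
        h = self.W(x)                                            # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),      # [h_i || h_j] for every pair
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))              # (N, N) attention logits
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(e, dim=-1)                         # attention over neighbours
        return F.elu(alpha @ h)                                  # aggregated city features

class GATGRU(nn.Module):
    """Apply GAT per time step, then a GRU over the resulting sequence of each city."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.gat = GraphAttentionLayer(in_dim, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)                     # next-step PM2.5 per city

    def forward(self, x, adj):
        # x: (T, num_cities, in_dim) -- PM2.5 plus meteorological features per time step
        spatial = torch.stack([self.gat(x[t], adj) for t in range(x.size(0))])  # (T, N, H)
        seq = spatial.permute(1, 0, 2)                           # (N, T, H): one sequence per city
        _, h_n = self.gru(seq)
        return self.head(h_n[-1]).squeeze(-1)                    # (N,) predicted PM2.5
```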
Funding: National Natural Science Foundation of China, Grant/Award Numbers: 62106009, 62276010; R&D Program of Beijing Municipal Education Commission, Grant/Award Numbers: KM202210005030, KZ202210005009.
Abstract: Inferring causal protein signalling networks from human immune system cell data is a promising approach to unravel the underlying tissue signalling biology and dysfunction in diseased cells, and it has attracted considerable attention within the bioinformatics field. Recently, Bayesian network (BN) techniques have gained significant popularity in inferring causal protein signalling networks from multiparameter single-cell data. However, current BN methods may exhibit high computational complexity and ignore interactions among protein signalling molecules from different single cells. This paper presents a novel BN method for learning causal protein signalling networks based on a parallel discrete artificial bee colony algorithm, named PDABC. Specifically, PDABC is a score-based BN method that utilises the parallel artificial bee colony algorithm to search for the globally optimal causal protein signalling network with the highest discrete K2 metric. Experimental results on several simulated datasets, as well as on a previously published multi-parameter fluorescence-activated cell sorter dataset, indicate that PDABC surpasses the existing state-of-the-art methods in terms of performance and computational efficiency.
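The discrete K2 metric that a score-based search such as PDABC maximises is a standard decomposable BN score (Cooper and Herskovits). The sketch below computes its logarithm for a candidate structure from complete discrete data; the data layout (a list of dictionaries of state indices) and the function names are our own assumptions.

```python
# Log K2 score of a discrete Bayesian network structure (standard formula, illustrative layout).
import math
from collections import Counter
from itertools import product

def log_k2_score_node(data, node, parents, arities):
    """log K2 score of one node given its parent set.

    data:    list of complete samples, each a dict {variable: state index in 0..arity-1}
    node:    name of the scored variable
    parents: list of parent variable names
    arities: dict {variable: number of states}
    """
    r = arities[node]
    counts = Counter()                          # N_ijk: (parent config, child state) -> count
    for sample in data:
        j = tuple(sample[p] for p in parents)
        counts[(j, sample[node])] += 1

    score = 0.0
    for j in product(*(range(arities[p]) for p in parents)):   # yields () when parents is empty
        n_ij = sum(counts[(j, k)] for k in range(r))
        score += math.lgamma(r) - math.lgamma(n_ij + r)        # log[(r-1)! / (N_ij + r - 1)!]
        score += sum(math.lgamma(counts[(j, k)] + 1) for k in range(r))  # log prod_k N_ijk!
    return score

def log_k2_score(data, structure, arities):
    """Decomposable network score: sum of per-node K2 scores; structure maps node -> parent list."""
    return sum(log_k2_score_node(data, v, pa, arities) for v, pa in structure.items())
```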
Funding: Supported by National Natural Science Foundation of China (60533030) and Beijing Natural Science Foundation (4061001).
Abstract: Owing to its insensitivity to illumination and pose variations, 3D face recognition has attracted increasing attention. Many key problems remain to be solved in this area, such as 3D face representation and effective multi-feature fusion. In this paper, a novel 3D face recognition algorithm is proposed and its performance is demonstrated on the BJUT-3D face database. The algorithm selects the principal components of face surface properties and of the relative relation matrix as facial representation features, and a similarity metric is defined for each feature. A feature fusion strategy is then proposed: a linearly weighted scheme based on Fisher linear discriminant analysis. Finally, the presented algorithm is tested on the BJUT-3D face database, and it is concluded that the performance of the algorithm and its fusion strategy is satisfactory.
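A linearly weighted score fusion driven by a Fisher criterion can be sketched as follows: each feature channel receives a weight proportional to the Fisher discriminant ratio of its genuine versus impostor similarity scores. This is only an illustration of the fusion idea; the exact weighting used in the paper is not reproduced, and all names are ours.

```python
# Illustrative Fisher-criterion weighted fusion of per-feature similarity scores.
import numpy as np

def fisher_weights(genuine_scores, impostor_scores):
    """One weight per feature channel from the Fisher discriminant ratio of its scores.

    genuine_scores, impostor_scores: arrays of shape (num_pairs, num_features) holding the
    per-feature similarity scores of same-person and different-person comparisons.
    """
    mu_g, mu_i = genuine_scores.mean(axis=0), impostor_scores.mean(axis=0)
    var_g, var_i = genuine_scores.var(axis=0), impostor_scores.var(axis=0)
    ratio = (mu_g - mu_i) ** 2 / (var_g + var_i + 1e-12)   # between- over within-class scatter
    return ratio / ratio.sum()                             # normalise to a convex combination

def fused_similarity(scores, weights):
    """Linearly weighted fusion of the per-feature similarity scores of one comparison."""
    return float(np.dot(scores, weights))
```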
Funding: Supported by the National Natural Science Foundation of China (Nos. 60825203, 60973056, 60973057, and U0935004), the National Technology Support Project (2007BAH13B01), the Beijing Municipal Natural Science Foundation (4102009), the Scientific Research Common Program of Beijing Municipal Commission of Education (KM200710005023), and PHR (IHLB).
Abstract: Among the human users of the Internet of Things, the hearing-impaired are a special group of people for whom normal forms of information expression, such as voice and video, are inaccessible, and most of them have some difficulty in understanding information in text form. The hearing-impaired are accustomed to receiving information expressed in sign language. For this situation, a new information expression form for the Internet of Things oriented toward the hearing-impaired is proposed in this paper, based on sign language video synthesis. Under the sign synthesis framework, three modules are necessary: constructing the database, searching for appropriate sign language video units and transition units, and generating interpolated frames. With this method, text information can be transformed into sign language expression for the hearing-impaired.
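The three-module pipeline can be outlined as in the sketch below: gloss-to-clip lookup in the sign video database, concatenation of the selected video units, and linearly interpolated transition frames between consecutive units. All function names and the simple blending scheme are illustrative assumptions, not the paper's method.

```python
# Schematic text-to-sign synthesis pipeline: database lookup, unit selection, transitions.
import numpy as np

def lookup_units(gloss_sequence, video_db):
    """Module 2: fetch one pre-recorded sign clip (a list of frames) per gloss, if available."""
    return [video_db[g] for g in gloss_sequence if g in video_db]

def interpolate_transition(last_frame, first_frame, num_frames=5):
    """Module 3: linearly blended transition frames between two consecutive sign units."""
    alphas = np.linspace(0.0, 1.0, num_frames + 2)[1:-1]       # interior blending weights
    return [((1 - a) * last_frame + a * first_frame).astype(last_frame.dtype) for a in alphas]

def synthesize(gloss_sequence, video_db):
    """Concatenate sign units, inserting interpolated transition frames between them."""
    units = lookup_units(gloss_sequence, video_db)
    output = []
    for i, clip in enumerate(units):
        if i > 0:
            output.extend(interpolate_transition(output[-1], clip[0]))
        output.extend(clip)
    return output
```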
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61772049, 61632006, 61876012, U19B2039, and 61906011) and the Beijing Natural Science Foundation of China (No. 4202003).
Abstract: Three-dimensional (3D) reconstruction of shapes is an important research topic in the fields of computer vision, computer graphics, pattern recognition, and virtual reality. Existing 3D reconstruction methods usually suffer from two bottlenecks: (1) they involve multiple manually designed stages, which can lead to cumulative errors, and they can hardly learn semantic features of 3D shapes automatically; (2) they depend heavily on the content and quality of images, as well as on precisely calibrated cameras. As a result, it is difficult to improve the reconstruction accuracy of those methods. 3D reconstruction methods based on deep learning overcome both of these bottlenecks by automatically learning semantic features of 3D shapes from low-quality images using deep networks. However, while these methods have various architectures, in-depth analyses and comparisons of them are so far unavailable. We present a comprehensive survey of 3D reconstruction methods based on deep learning. First, based on the architectures of different deep learning models, we divide these methods into four types: recurrent neural network, deep autoencoder, generative adversarial network, and convolutional neural network based methods, and we analyze the corresponding methodologies carefully. Second, we investigate in detail four representative databases commonly used by the above methods. Third, we give a comprehensive comparison of 3D reconstruction methods based on deep learning, which consists of the results of different methods on the same database, the results of each method on different databases, and the robustness of each method with respect to the number of views. Finally, we discuss future development of 3D reconstruction methods based on deep learning.
Funding: Jian-Feng Cai is partially supported by the National Science Foundation of the USA (No. DMS-1418737).
Abstract: In image restoration, we usually assume that the underlying image has a good sparse approximation under a certain system. The wavelet tight frame system has been proven to be such an efficient system for sparsely approximating piecewise smooth images, and it has therefore been widely used in many practical image restoration problems. However, images from different scenarios are so diverse that no static wavelet tight frame system can sparsely approximate all of them well. To overcome this, Cai et al. (Appl Comput Harmon Anal 37:89–105, 2014) recently proposed a method that derives a data-driven tight frame adapted to the specific input image, leading to a better sparse approximation. The data-driven tight frame has been applied successfully to image denoising and CT image reconstruction. In this paper, we extend this data-driven tight frame construction method to multi-channel images. We construct a discrete tight frame system for each channel and assume that their sparse coefficients have a joint sparsity. The multi-channel data-driven tight frame construction scheme is applied to joint color and depth image reconstruction. Experimental results show that the proposed approach performs better than state-of-the-art joint color and depth image reconstruction approaches.
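As a rough illustration of the alternating construction, the sketch below performs group hard-thresholding of per-channel patch coefficients (the joint-sparsity assumption) followed by an orthogonal, Procrustes-type filter update for each channel. It deliberately simplifies the tight-frame constraint of Cai et al. to an orthogonal patch dictionary, so it illustrates the idea rather than reproducing their algorithm; all names and shapes are assumptions.

```python
# Much-simplified alternating scheme: joint hard-thresholding + orthogonal filter update.
import numpy as np

def joint_hard_threshold(coeffs, lam):
    """Keep a coefficient position only if its l2 norm across channels exceeds lam.

    coeffs: array of shape (num_channels, num_filters, num_patches)
    """
    group_norm = np.linalg.norm(coeffs, axis=0, keepdims=True)
    return np.where(group_norm > lam, coeffs, 0.0)

def update_filters(patches, coeffs):
    """Per-channel orthogonal filter update via the SVD (orthogonal Procrustes solution)."""
    filters = []
    for ch in range(patches.shape[0]):
        u, _, vt = np.linalg.svd(coeffs[ch] @ patches[ch].T, full_matrices=False)
        filters.append(u @ vt)          # argmin_W ||coeffs - W @ patches||_F over orthogonal W
    return np.stack(filters)

def learn_frames(patches, num_iters=20, lam=0.1):
    """patches: (num_channels, patch_dim, num_patches); returns per-channel filter banks."""
    c, d, _ = patches.shape
    filters = np.stack([np.eye(d) for _ in range(c)])
    for _ in range(num_iters):
        coeffs = np.einsum('cij,cjn->cin', filters, patches)    # analysis coefficients
        coeffs = joint_hard_threshold(coeffs, lam)              # enforce joint sparsity
        filters = update_filters(patches, coeffs)
    return filters
```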
Funding: Supported by the National Natural Science Foundation of China (Nos. 61227004, 61370120, 61390510, 61300065, and 61402024), the Beijing Municipal Natural Science Foundation, China (No. 4142010), the Beijing Municipal Commission of Education, China (No. km201410005013), and the Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality, China.
Abstract: We propose a framework for hand articulation detection from a monocular depth image using curvature scale space (CSS) descriptors. We extract the hand contour from an input depth image, and obtain the fingertips and finger-valleys of the contour using the local extrema of a modified CSS map of the contour. Then we recover the undetected fingertips according to the local change of depths of points in the interior of the contour. Compared with traditional appearance-based approaches using either angle detectors or convex hull detectors, the modified CSS descriptor extracts the fingertips and finger-valleys more precisely since it is more robust to noisy or corrupted data; moreover, the local extrema of depths recover the fingertips of bending fingers well, while traditional appearance-based approaches hardly work without matching hand models. Experimental results show that our method captures the hand articulations more precisely than three state-of-the-art appearance-based approaches.
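The CSS-based detection step can be illustrated as follows: the contour coordinates are smoothed by Gaussians of increasing scale, curvature is computed from the smoothed derivatives, and local curvature extrema at a chosen scale are taken as fingertip and finger-valley candidates. The scale, thresholds, and orientation convention are assumptions; the paper's modified CSS map is not reproduced here.

```python
# Sketch of a curvature scale space (CSS) map on a closed contour and curvature-extrema picking.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def contour_curvature(x, y, sigma):
    """Curvature of a closed contour (x, y) smoothed by a Gaussian of scale sigma."""
    xu  = gaussian_filter1d(x, sigma, order=1, mode='wrap')
    yu  = gaussian_filter1d(y, sigma, order=1, mode='wrap')
    xuu = gaussian_filter1d(x, sigma, order=2, mode='wrap')
    yuu = gaussian_filter1d(y, sigma, order=2, mode='wrap')
    return (xu * yuu - yu * xuu) / np.power(xu ** 2 + yu ** 2, 1.5)

def css_map(x, y, sigmas):
    """Stack curvature profiles over scales: one row per sigma, one column per contour point."""
    return np.stack([contour_curvature(x, y, s) for s in sigmas])

def candidate_points(x, y, sigma=8.0, kappa_thresh=0.05):
    """Local curvature extrema at one scale; assumes a counter-clockwise contour so that
    maxima correspond to fingertip candidates and minima to finger-valley candidates."""
    kappa = contour_curvature(x, y, sigma)
    prev, nxt = np.roll(kappa, 1), np.roll(kappa, -1)
    tips    = np.where((kappa > prev) & (kappa > nxt) & (kappa >  kappa_thresh))[0]
    valleys = np.where((kappa < prev) & (kappa < nxt) & (kappa < -kappa_thresh))[0]
    return tips, valleys
```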
Funding: The first author is supported by the National Natural Science Foundation of China (Grant Nos. U0935004, 11071031, 11001037, and 10801024) and the Fundamental Research Funds for the Central Universities (Grant Nos. DUT10ZD112 and DUT10JS02); the second author is supported by the 973 Program (2011CB302703), the National Natural Science Foundation of China (Grant Nos. U0935004, 60825203, 61033004, 60973056, 60973057, and 61003182), and the Beijing Natural Science Foundation (4102009). We thank the referees for valuable suggestions which improved the presentation of this paper.
Abstract: A piecewise algebraic curve is a curve determined by the zero set of a bivariate spline function. In this paper, we propose the Cayley-Bacharach theorem for continuous piecewise algebraic curves over cross-cut triangulations. We show that, if two continuous piecewise algebraic curves of degrees m and n respectively meet at mnT distinct points over a cross-cut triangulation, where T denotes the number of cells of the triangulation, then any continuous piecewise algebraic curve of degree m + n - 2 containing all but one of these points also contains the last point.
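For reference, the theorem stated in the abstract can be written out as follows; the spline-space notation $S^{0}_{d}(\Delta)$ for continuous piecewise polynomials of degree $d$ and the zero-set notation $Z(\cdot)$ are ours.

```latex
% Restatement of the theorem from the abstract, with our own notation.
Let $\Delta$ be a cross-cut triangulation with $T$ cells, and let $f \in S^{0}_{m}(\Delta)$ and
$g \in S^{0}_{n}(\Delta)$ be bivariate splines whose piecewise algebraic curves $Z(f)$ and $Z(g)$
meet at exactly $mnT$ distinct points. Then every continuous piecewise algebraic curve $Z(h)$
with $h \in S^{0}_{m+n-2}(\Delta)$ that contains all but one of these $mnT$ points must contain
the remaining point as well.
```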
Funding: This research project was partially supported by the National Natural Science Foundation of China (Grant Nos. 62072015, U19B2039, and U1811463) and the National Key R&D Program of China (Grant No. 2018YFB1600903).
Abstract: Crowd counting provides an important foundation for public security and urban management. Due to the existence of small targets and large density variations in crowd images, crowd counting is a challenging task. Mainstream methods usually apply convolutional neural networks (CNNs) to regress a density map, which requires annotations of individual persons and counts. Weakly-supervised methods can avoid detailed labeling and only require counts as image-level annotations, but existing methods fail to achieve satisfactory performance because a global perspective field and multi-level information are usually ignored. We propose a weakly-supervised method, DTCC, which effectively combines multi-level dilated convolution and transformer methods to realize end-to-end crowd counting. Its main components include a recursive swin transformer and a multi-level dilated convolution regression head. The recursive swin transformer combines a pyramid visual transformer with a fine-tuned recursive pyramid structure to capture deep multi-level crowd features, including global features. The multi-level dilated convolution regression head includes multi-level dilated convolutions and a linear regression head for the feature extraction module. This module can capture both low- and high-level features simultaneously to enhance the receptive field. In addition, two regression head fusion mechanisms realize dynamic and mean fusion counting. Experiments on four well-known benchmark crowd counting datasets (UCF_CC_50, ShanghaiTech, UCF_QNRF, and JHU-Crowd++) show that DTCC achieves results superior to other weakly-supervised methods and comparable to fully-supervised methods.
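A hedged sketch of a multi-level dilated-convolution regression head for count-only supervision is given below: parallel dilated 3x3 convolutions enlarge the receptive field at several rates, their outputs are concatenated, globally pooled, and regressed to a single count per image. The channel sizes and dilation rates are assumptions, and the recursive swin transformer backbone is abstracted away as a generic feature extractor; this is not the DTCC implementation.

```python
# Illustrative multi-level dilated regression head for weakly-supervised (count-only) counting.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCountHead(nn.Module):
    def __init__(self, in_channels, mid_channels=64, dilations=(1, 2, 3, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, mid_channels, kernel_size=3,
                          padding=d, dilation=d),            # same spatial size at every rate
                nn.ReLU(inplace=True))
            for d in dilations])
        self.regressor = nn.Linear(mid_channels * len(dilations), 1)

    def forward(self, features):
        # features: (batch, in_channels, H, W) from a transformer/CNN backbone
        multi_level = torch.cat([b(features) for b in self.branches], dim=1)
        pooled = multi_level.mean(dim=(2, 3))                 # global average pooling
        return self.regressor(pooled).squeeze(-1)             # (batch,) predicted counts

# Example with count-only supervision (no density maps), using random backbone features.
head = DilatedCountHead(in_channels=256)
feats = torch.randn(2, 256, 24, 24)
loss = F.l1_loss(head(feats), torch.tensor([153.0, 410.0]))
```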