Funding: Project supported by the National Natural Science Foundation of China (No. 61071129) and the Science and Technology Department of Zhejiang Province, China (Nos. 2008C21088, 2011R10035, and 2011R09003-06)
Abstract: The 3780-point FFT is a main component of the time domain synchronous OFDM (TDS-OFDM) system and the key technology in the Chinese Digital Multimedia/TV Broadcasting-Terrestrial (DMB-T) national standard. Since 3780 is not a power of 2, the classical radix-2 or radix-4 FFT algorithm cannot be applied directly. Hence, the Winograd Fourier transform algorithm (WFTA) and the Good-Thomas prime factor algorithm (PFA) are used to implement the 3780-point FFT processor. However, the structure based on WFTA and PFA has a large computational complexity and requires many DSPs in hardware implementation. In this paper, a novel 3780-point FFT processor scheme is proposed, in which a 60×63 iterative WFTA architecture with different mapping methods is introduced to replace the PFA architecture, and an optimized CoOrdinate Rotation DIgital Computer (CORDIC) module is used for the twiddle factor multiplications. Compared to the traditional scheme, the proposed 3780-point FFT processor scheme reduces the number of multiplications by 45% at the cost of a 1% increase in the number of additions. All DSPs are replaced by the optimized CORDIC module and ROM. Simulation results show that the proposed 3780-point FFT processing scheme satisfies the requirements of the DMB-T standard and is an efficient architecture for the TDS-OFDM system.
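Illustrative sketch (not the authors' architecture): 60 and 63 share the factor 3, so a 60×63 split cannot use the twiddle-free Good-Thomas index mapping, which requires coprime factors. A Cooley-Tukey style split with twiddle factors between the two stages applies instead, which is presumably where the twiddle factor multiplications mentioned in the abstract arise. The names `dft` and `fft_two_factor` are hypothetical. The naive inner DFTs stand in for the WFTA modules, and floating-point `cmath.exp` stands in for the CORDIC module.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used only as a small building block."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_two_factor(x, n1, n2):
    """Compute an N = n1*n2 point DFT as two stages joined by twiddle factors.

    Input index  n = n2*a + b   with a in [0, n1), b in [0, n2)
    Output index k = c + n1*d   with c in [0, n1), d in [0, n2)
    The factors n1 and n2 do not need to be coprime.
    """
    n = n1 * n2
    assert len(x) == n
    # Stage 1: n2 DFTs of length n1 over the decimated sequences x[b::n2].
    inner = [dft(x[b::n2]) for b in range(n2)]  # inner[b][c]
    out = [0j] * n
    for c in range(n1):
        # Twiddle multiplication by W_N^(b*c) between the two stages.
        col = [
            inner[b][c] * cmath.exp(-2j * cmath.pi * b * c / n)
            for b in range(n2)
        ]
        # Stage 2: one DFT of length n2 for each c.
        for d, value in enumerate(dft(col)):
            out[c + n1 * d] = value
    return out


# Usage: a single complex tone at bin 5 of a 3780-point transform.
N = 3780
tone = [cmath.exp(2j * cmath.pi * 5 * n / N) for n in range(N)]
spectrum = fft_two_factor(tone, 60, 63)
peak_bin = max(range(N), key=lambda k: abs(spectrum[k]))
```

In this sketch the two stages cost 63 transforms of length 60 plus 60 transforms of length 63, with 3780 twiddle multiplications in between. Those multiplications are the part a rotation unit such as CORDIC could take over in hardware.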
Funding: Supported by the National Natural Science Foundation of China (No. 61571390) and the Fundamental Research Funds for the Central Universities, China (No. 2016QNA5004)
Abstract: Robust object tracking has been an important and challenging research area in the field of computer vision for decades. With the increasing popularity of affordable depth sensors, range data is widely used in visual tracking for its ability to provide robustness to varying illumination and occlusions. In this paper, a novel RGBD and sparse learning based tracker is proposed. The range data is integrated into the sparse learning framework in three respects. First, an extra depth view is added to the color-image-based visual features as an independent view for robust appearance modeling. Then, a special occlusion template set is designed to supplement the existing dictionary for handling various occlusion conditions. Finally, a depth-based occlusion detection method is proposed to efficiently determine an accurate time for the template update. Extensive experiments on both the KITTI and Princeton data sets demonstrate that the proposed tracker outperforms state-of-the-art tracking algorithms, including both sparse learning and RGBD based methods.
Abstract: Tracking multiple people under occlusion and across cameras is a challenging problem. Furthermore, the cameras in this study are used to extend the field of view, in contrast to setups in which the cameras share the same field of view. Such correspondence between multiple cameras is a burgeoning research subject in the area of computer vision. This paper addresses the problems of tracking multiple people who pass from one camera to another and segmenting people under occlusion using probabilistic models. The probabilistic models are composed of a blob model, a motion model, and a color model, which make the most of the spatial, motion, and color information. First, we present a color model that uses maximum likelihood estimation based on non-parametric kernel density estimation. Second, we introduce a blob model based on mean shift, which segments the body into many regions according to the color of each person in order to spatially localize the color features corresponding to the way people are dressed. Clothes can be any mixture of colors. Third, we bring forward a motion model based on statistical probability, which indicates the movement position of the same person between two successive frames in a single camera. Finally, we unify the three models into a general probabilistic model and obtain a maximum likelihood probability image, which is used to segment the foreground region under occlusion and to match people across multiple cameras.
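Illustrative sketch (an assumption-laden toy, not the authors' model): a color model built on non-parametric kernel density estimation can be pictured as scoring a pixel against stored RGB samples of each person and picking the label with the highest likelihood. The Gaussian kernel, the bandwidth value, and the names `kde_likelihood` and `classify` are all hypothetical choices made for this example.

```python
import math


def kde_likelihood(pixel, samples, bandwidth=10.0):
    """Gaussian kernel density estimate of p(pixel | person) from RGB samples."""
    # Normalising constant of an isotropic 3-D Gaussian kernel.
    norm = (2 * math.pi) ** 1.5 * bandwidth ** 3
    total = 0.0
    for sample in samples:
        dist_sq = sum((p - q) ** 2 for p, q in zip(pixel, sample))
        total += math.exp(-dist_sq / (2 * bandwidth ** 2))
    return total / (len(samples) * norm)


def classify(pixel, models):
    """Return the label whose sample set gives the pixel the highest likelihood."""
    return max(models, key=lambda name: kde_likelihood(pixel, models[name]))


# Usage with two hypothetical people described by a few clothing samples each.
models = {
    "red_shirt": [(200, 20, 20), (210, 30, 25)],
    "blue_shirt": [(20, 30, 200), (25, 25, 190)],
}
label = classify((205, 25, 22), models)
```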
Funding: Supported by the National Natural Science Foundation of China (Nos. 60502006, 60534070, and 90820306) and the Science and Technology Plan of Zhejiang Province, China (No. 2007C21007)
Abstract: Omnidirectional imaging sensors have been used in more and more applications where a very large field of view is required. In this paper, we investigate the unwrapping, epipolar geometry, and stereo rectification issues for omnidirectional vision when the particular mirror model and the camera parameters are unknown a priori. First, the omnidirectional camera is calibrated under the Taylor model, and the parameters related to this model are obtained. In order to make the classical computer vision algorithms of conventional perspective cameras applicable, the ring omnidirectional image is unwrapped into two kinds of panoramas: cylinder and cuboid. Then the epipolar geometry of an arbitrary camera configuration is analyzed, and the essential matrix is deduced with its properties being indicated for ring images. After that, a simple stereo rectification method based on the essential matrix and the conformal mapping is proposed. Simulations and real data experimental results illustrate that our methods are effective for the omnidirectional camera under the constraint of a single viewpoint.
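Illustrative sketch (a plain polar unwrap, not the calibrated Taylor-model mapping of the paper): unwrapping a ring image into a cylindrical panorama can be pictured as sampling the ring along circles of decreasing radius, one circle per panorama row. The image centre, the inner and outer radii, nearest-neighbour sampling, and the name `unwrap_ring` are assumptions made for this example.

```python
import math


def unwrap_ring(img, cx, cy, r_in, r_out, width, height):
    """Unwrap a ring-shaped image (list of rows) into a height x width panorama.

    Panorama column u maps to the angle 2*pi*u/width, and panorama row v maps
    to a radius that runs linearly from r_out (top row) to r_in (bottom row).
    """
    rows, cols = len(img), len(img[0])
    pano = [[0] * width for _ in range(height)]
    for v in range(height):
        radius = r_out - (r_out - r_in) * v / max(height - 1, 1)
        for u in range(width):
            theta = 2 * math.pi * u / width
            # Nearest-neighbour lookup in the source ring image.
            x = int(round(cx + radius * math.cos(theta)))
            y = int(round(cy + radius * math.sin(theta)))
            if 0 <= x < cols and 0 <= y < rows:
                pano[v][u] = img[y][x]
    return pano


# Usage on a synthetic image whose pixel value equals its distance to the centre.
size, centre = 101, 50
ring = [
    [int(round(math.hypot(x - centre, y - centre))) for x in range(size)]
    for y in range(size)
]
panorama = unwrap_ring(ring, centre, centre, r_in=10, r_out=40, width=90, height=31)
```

With this synthetic input every panorama row should be nearly constant, which makes the mapping easy to check by eye.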
Funding: Supported by the National Natural Science Foundation of China (Nos. 60802013 and 61072081), the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2009ZX01033-001-007), and the China Postdoctoral Science Foundation (No. 20110491804)
Abstract: This paper deals with a novel stereo algorithm that can generate accurate dense disparity maps in real time. The algorithm employs an effective cross-based variable support aggregation strategy within a scanline optimization framework. Rather than matching intensities directly, the use of adaptive support aggregation allows for precisely handling weakly textured regions as well as depth discontinuities. To improve the disparity results with global reasoning, we reformulate the energy function on a tree structure over the whole 2D image area, as opposed to dynamic programming of individual scanlines. By applying both intra- and inter-scanline optimizations, the algorithm reduces the typical 'streaking' artifact while maintaining high computational efficiency. The experimental results are evaluated on the Middlebury stereo dataset, showing that our approach is among the best of all real-time approaches. We implement the algorithm on a commodity graphics card with CUDA architecture, running at about 35 frames/s for a typical stereo pair with a resolution of 384×288 and 16 disparity levels.
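Illustrative sketch (the per-scanline baseline that the abstract contrasts against, not the proposed tree-structured method): dynamic programming on a single scanline picks one disparity per pixel by minimising a matching cost plus a penalty on disparity changes between neighbours. The absolute-difference cost, the linear penalty, the border cost, and the name `scanline_dp` are assumptions made for this example. No cross-based aggregation is performed.

```python
def scanline_dp(left, right, max_disp, penalty=2.0, border_cost=255.0):
    """Disparities for one scanline via dynamic programming (Viterbi style)."""
    n = len(left)
    levels = max_disp + 1

    def cost(x, d):
        # Absolute intensity difference; pixels with no match get border_cost.
        return abs(left[x] - right[x - d]) if x - d >= 0 else border_cost

    acc = [[0.0] * levels for _ in range(n)]   # accumulated path cost
    back = [[0] * levels for _ in range(n)]    # best predecessor disparity
    for d in range(levels):
        acc[0][d] = cost(0, d)
    for x in range(1, n):
        for d in range(levels):
            prev = min(
                range(levels),
                key=lambda p: acc[x - 1][p] + penalty * abs(d - p),
            )
            back[x][d] = prev
            acc[x][d] = cost(x, d) + acc[x - 1][prev] + penalty * abs(d - prev)
    # Backtrack from the cheapest final state.
    d = min(range(levels), key=lambda p: acc[n - 1][p])
    disp = [0] * n
    for x in range(n - 1, -1, -1):
        disp[x] = d
        d = back[x][d]
    return disp


# Usage: the left scanline is the right scanline shifted by two pixels.
right_line = [10, 50, 20, 80, 30, 90, 40, 70, 60, 15, 95, 25]
left_line = [10, 10] + right_line[:-2]
disparities = scanline_dp(left_line, right_line, max_disp=3)
```

Because each scanline is solved independently in this baseline, neighbouring rows can disagree, which is the source of the 'streaking' artifact the abstract mentions.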