In large-scale, grid-connected wind power operations, it is important to establish an accurate probability distribution model for wind farm fluctuations. In this study, a wind power fluctuation modeling method is proposed based on the method of moving average and an adaptive nonparametric kernel density estimation (NPKDE) method. Firstly, the method of moving average is used to reduce the fluctuation of the sampled wind power component, and the probability characteristics of the model are then determined based on the NPKDE. Secondly, the model is improved adaptively and is then solved using constraint-order optimization. The simulation results show that this method has better accuracy and applicability than modeling methods based on traditional parameter estimation, and solves the local adaptation problem of the traditional NPKDE.
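The two-stage pipeline described above — smooth the sampled power with a moving average, then estimate the density of the smoothed component with a KDE — can be sketched in a few lines. This is an illustrative sketch only: the bandwidth is fixed rather than adaptive as in the paper's NPKDE, and the window width and data values are invented for the example.

```python
import math

def moving_average(xs, w):
    """Trailing moving average with window width w (shorter at the start)."""
    out = []
    for i in range(len(xs)):
        win = xs[max(0, i - w + 1):i + 1]
        out.append(sum(win) / len(win))
    return out

def gaussian_kde(samples, h):
    """Fixed-bandwidth Gaussian KDE; the paper's adaptive NPKDE varies h."""
    n = len(samples)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

power = [1.0, 1.4, 0.9, 1.2, 1.1, 1.5, 1.3]   # toy sampled wind power values
smooth = moving_average(power, 3)              # step 1: reduce fluctuation
f = gaussian_kde(smooth, 0.2)                  # step 2: model the distribution
```

The returned `f` is the estimated density of the smoothed component; in the paper, the bandwidth would then be tuned adaptively rather than held at 0.2.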
Previous research has identified specific areas of frequent tropical cyclone activity in the North Atlantic basin. This study examines long-term and decadal spatio-temporal patterns of Atlantic tropical cyclone frequencies from 1944 to 2009, and analyzes categorical and decadal centroid patterns using kernel density estimation (KDE) and centrographic statistics. Results corroborate previous research suggesting that the Bermuda-Azores anticyclone plays an integral role in the direction of tropical cyclone tracks. Other teleconnections such as the North Atlantic Oscillation (NAO) may also have an impact on tropical cyclone tracks, but at a different temporal resolution. Results expand on existing knowledge of the spatial trends of tropical cyclones based on storm category and time through the use of spatial statistics. Overall, the location of peak frequency varies by tropical cyclone category, with stronger storms being more concentrated in narrow regions of the southern Caribbean Sea and Gulf of Mexico, while weaker storms occur in a much larger area that encompasses much of the Caribbean Sea, Gulf of Mexico, and Atlantic Ocean off the east coast of the United States. Additionally, the decadal centroids of tropical cyclone tracks have oscillated over a large area of the Atlantic Ocean for much of recorded history. Data collected since 1944 can be analyzed confidently to reveal these patterns.
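The centrographic side of the analysis reduces, at its simplest, to computing the mean center of track points per decade and comparing the centers across decades. A minimal sketch (the coordinates below are invented for illustration, not real track data):

```python
def mean_center(points):
    """Centrographic mean center of a set of (lon, lat) track points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Hypothetical track points for two decades (lon, lat)
decade_a = [(-60.0, 15.0), (-70.0, 20.0), (-80.0, 25.0)]
decade_b = [(-55.0, 18.0), (-65.0, 24.0), (-75.0, 30.0)]

ca, cb = mean_center(decade_a), mean_center(decade_b)
drift = (cb[0] - ca[0], cb[1] - ca[1])   # decadal centroid shift
```

Plotting such per-decade centers (plus dispersion measures such as standard distance) is what reveals the oscillation of track centroids the abstract describes.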
The study of the estimation of conditional extreme quantiles in incomplete data frameworks is of growing interest. In particular, the estimation of the extreme value index in a censorship framework has been the purpose of many investigations when finite-dimensional covariate information has been considered. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate (i.e. valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator, and a comparison with two simple estimation strategies is provided.
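For intuition, the unconditional, uncensored version of a Weissman-type estimator is easy to state: estimate the tail index with the Hill estimator from the k largest order statistics, then extrapolate beyond the sample range. The sketch below shows only this classical form — the paper's estimator additionally handles a functional covariate and right-censoring, which are omitted here; the grid of Pareto quantiles is invented test data.

```python
import math

def weissman_quantile(samples, k, p):
    """Weissman extreme-quantile estimate for a heavy-tailed sample:
    gamma_hat is the Hill estimator over the k largest observations, and
    q(p) = X_(n-k) * (k / (n * p)) ** gamma_hat extrapolates beyond the data."""
    xs = sorted(samples)
    n = len(xs)
    pivot = xs[n - k - 1]                                        # (n-k)-th order statistic
    gamma = sum(math.log(x / pivot) for x in xs[n - k:]) / k     # Hill estimator
    return pivot * (k / (n * p)) ** gamma

# Deterministic Pareto(gamma=1) quantile grid as stand-in data
pareto = [1.0 / (1.0 - (i + 0.5) / 100) for i in range(100)]
q = weissman_quantile(pareto, k=10, p=0.001)   # quantile beyond the sample range
```

Since p = 0.001 lies beyond the empirical range of 100 points, the estimate exceeds the sample maximum, which is exactly the extrapolation the Weissman construction provides.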
There has been a vast amount of work on background modeling to detect moving objects. Two recent reviews [1,2] showed that the kernel density estimation (KDE) method and the Gaussian mixture model (GMM) perform about equally best among possible background models. For KDE, the selection of kernel functions and their bandwidths greatly influences the performance. There have been few attempts to compare the adequacy of kernel functions for KDE. In this paper, we evaluate the performance of various functions for KDE. The functions tested include almost every one cited in the literature, and a new function, the Laplacian of Gaussian (LoG), is also introduced for comparison. All tests were done on real videos with varying background dynamics, and the results were analyzed both qualitatively and quantitatively. The effect of different bandwidths was also investigated.
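Swapping kernel functions in and out of a KDE is mechanically simple, which is what makes this kind of comparison practical. The sketch below evaluates a few standard kernel profiles plus a LoG profile at one point; the pixel-intensity samples and bandwidth are invented, and the exact LoG normalization used in the paper may differ.

```python
import math

# Candidate kernel profiles K(u); the KDE at x is (1/(n*h)) * sum K((x - x_i)/h).
kernels = {
    "gaussian":     lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi),
    "epanechnikov": lambda u: 0.75 * (1 - u * u) if abs(u) < 1 else 0.0,
    "laplace":      lambda u: 0.5 * math.exp(-abs(u)),
    # Laplacian of Gaussian profile: it takes negative values, so the
    # resulting estimate is not a true probability density.
    "log":          lambda u: (1 - u * u) * math.exp(-0.5 * u * u),
}

def kde(samples, h, kernel):
    n = len(samples)
    return lambda x: sum(kernel((x - s) / h) for s in samples) / (n * h)

pixels = [0.30, 0.32, 0.28, 0.31]   # toy background intensity samples for one pixel
density_at = {name: kde(pixels, 0.05, K)(0.30) for name, K in kernels.items()}
```

In a background-subtraction setting, a new observation would be flagged as foreground when its estimated density falls below a threshold, so the kernel choice directly shapes the detection boundary.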
Let X be a d-dimensional random vector with unknown density function f(z) = f(z_1, ..., z_d), and let f_n be the nearest neighbor estimator of f proposed by Loftsgaarden and Quesenberry (1965). In this paper, we establish the law of the iterated logarithm of f_n for the general case of d ≥ 1, which gives the exact pointwise strong convergence rate of f_n.
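The Loftsgaarden–Quesenberry estimator puts mass k/n inside the smallest ball around x containing k sample points. A one-dimensional sketch (so the "ball" of radius r has length 2r; the sample points and k are invented):

```python
def knn_density_1d(samples, k):
    """Loftsgaarden-Quesenberry nearest-neighbor density estimate for d = 1:
    f_n(x) = k / (n * 2 * r_k(x)), where r_k(x) is the distance from x
    to its k-th nearest sample point."""
    n = len(samples)
    def f(x):
        r = sorted(abs(x - s) for s in samples)[k - 1]   # k-th nearest distance
        return k / (n * 2.0 * r)
    return f

f = knn_density_1d([0.0, 0.5, 1.0, 1.5, 2.0], k=2)
```

Unlike a fixed-bandwidth KDE, the effective window 2·r_k(x) here adapts to the local sample density, widening where data are sparse.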
Monitoring sensors in complex engineering environments often record abnormal data, leading to significant positioning errors. To reduce the influence of abnormal arrival times, we introduce an innovative, outlier-robust localization method that integrates kernel density estimation (KDE) with damped linear correction to enhance the precision of microseismic/acoustic emission (MS/AE) source positioning. Our approach systematically addresses abnormal arrival times through a three-step process: initial location by 4-arrival combinations, elimination of outliers based on three-dimensional KDE, and refinement using a linear correction with an adaptive damping factor. We validate our method through lead-breaking experiments, demonstrating over a 23% improvement in positioning accuracy with a maximum error of 9.12 mm (relative error of 15.80%), outperforming four existing methods. Simulations under various system errors, outlier scales, and outlier ratios substantiate our method's superior performance. Field blasting experiments also confirm the practical applicability, with an average positioning error of 11.71 m (relative error of 7.59%), compared to 23.56, 66.09, 16.95, and 28.52 m for the other methods. This research is significant as it enhances the robustness of MS/AE source localization when confronted with data anomalies. It also provides a practical solution for real-world engineering and safety monitoring applications.
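The KDE-based elimination step amounts to scoring each candidate source location by the density of all candidates and discarding the low-density ones. A one-dimensional stand-in for the paper's three-dimensional step (the candidate coordinates, bandwidth, and cutoff fraction are all invented):

```python
import math

def kde_filter(points, h=1.0, frac=0.5):
    """Keep candidate locations lying in high-density regions of a Gaussian
    KDE built over all candidates; drop low-density outliers."""
    def dens(x):
        return sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)
    cutoff = frac * max(dens(p) for p in points)
    return [p for p in points if dens(p) >= cutoff]

# Four consistent initial locations plus one produced by an abnormal arrival
candidates = [10.0, 10.2, 9.9, 10.1, 25.0]
kept = kde_filter(candidates)
```

Only the surviving candidates would then feed the damped linear-correction refinement, which is the step that actually pins down the source coordinates.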
As key nodes for their surrounding areas, metro stations are closely connected with the surrounding urban space and develop cooperatively with it. Different types of metro stations differ in land use and functional positioning. Based on POI data, this paper mainly used Thiessen polygons, kernel density analysis, and correlation analysis to classify the stations of Beijing Metro Line 7. It then analyzed the commercial metro stations in detail, including the distribution characteristics of commercial metro stations on Line 7.
In real-world applications, datasets frequently contain outliers, which can hinder the generalization ability of machine learning models. Bayesian classifiers, a popular supervised learning method, rely on accurate probability density estimation for classifying continuous datasets. However, achieving precise density estimation with datasets containing outliers poses a significant challenge. This paper introduces a Bayesian classifier that utilizes optimized robust kernel density estimation to address this issue. Our proposed method enhances the accuracy of probability density estimation by mitigating the impact of outliers on the training sample's estimated distribution. Unlike the conventional kernel density estimator, our robust estimator can be seen as a weighted kernel mapping summary for each sample. This kernel mapping performs the inner product in a Hilbert space, allowing the kernel density estimate to be viewed as the average of the samples' mappings in the Hilbert space under a reproducing kernel. M-estimation techniques are used to obtain accurate mean values and to solve for the weights. Meanwhile, complete cross-validation is used as the objective function in the search for the optimal bandwidth, which determines the estimator. Harris Hawks Optimization is applied to this objective function to improve the estimation accuracy. The experimental results show that it outperforms other optimization algorithms in convergence speed and objective function value during the bandwidth search. The optimal robust kernel density estimator achieves better fitness than the traditional kernel density estimator when the training data contain outliers. The naïve Bayes classifier with optimal robust kernel density estimation improves generalization in classification with outliers.
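The core robustness idea — downweight samples that the M-estimation step flags as outlying, then build a weighted KDE — can be sketched with a simple Huber-type reweighting. This is a simplified stand-in for the paper's RKHS/M-estimation scheme, not its actual algorithm; the tuning constant, bandwidth, and data are invented.

```python
import math

def robust_kde(samples, h, c=1.5, iters=20):
    """Weighted Gaussian KDE whose weights come from a Huber-style
    M-estimation fixed point: points far from the current weighted mean
    are downweighted, shrinking the outliers' contribution to the density."""
    w = [1.0] * len(samples)
    for _ in range(iters):
        m = sum(wi * s for wi, s in zip(w, samples)) / sum(w)
        w = [1.0 if abs(s - m) <= c else c / abs(s - m) for s in samples]
    z = sum(w)
    coef = 1.0 / (z * h * math.sqrt(2 * math.pi))
    def f(x):
        return coef * sum(wi * math.exp(-0.5 * ((x - s) / h) ** 2)
                          for wi, s in zip(w, samples))
    return f, w

data = [0.0, 0.2, -0.1, 0.1, 8.0]   # 8.0 is an outlier
f, w = robust_kde(data, 0.3)
```

In a naïve Bayes classifier, one such per-class density replaces the usual Gaussian class-conditional model; the paper's contribution is choosing the bandwidth by complete cross-validation via Harris Hawks Optimization rather than the fixed value used here.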
In order to improve the performance of the probability hypothesis density (PHD) algorithm based on the particle filter (PF) in terms of number estimation and state extraction of multiple targets, a new probability hypothesis density filter algorithm based on marginalized particles and kernel density estimation is proposed, which utilizes the idea of the marginalized particle filter to enhance the estimating performance of the PHD. The state variables are decomposed into linear and nonlinear parts. The particle filter is adopted to predict and estimate the nonlinear states of multiple targets after dimensionality reduction, while the Kalman filter is applied to estimate the linear parts under the linear Gaussian condition. Embedding the information of the linear states into the estimated nonlinear states helps to reduce the estimation variance and improve the accuracy of target number estimation. Mean-shift kernel density estimation, which inherently searches for peak values via adaptive gradient-ascent iteration, is introduced to cluster particles and extract target states; it is independent of the target number and can converge to the local peak positions of the PHD distribution while avoiding errors due to inaccuracy in modeling and parameter estimation. Experiments show that the proposed algorithm can obtain higher tracking accuracy when using fewer sampling particles and has lower computational complexity compared with the PF-PHD.
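The mean-shift extraction step is the easiest piece to isolate: each iteration moves the current point to the kernel-weighted average of the particles, climbing to a local peak of the weighted density regardless of how many targets (peaks) exist. A one-dimensional sketch with invented particle positions and uniform weights:

```python
import math

def mean_shift_mode(particles, weights, start, h=0.5, iters=50):
    """Mean-shift ascent with a Gaussian kernel: repeatedly replace x by the
    kernel-weighted average of the particles; converges to a local density
    peak near the starting point, independent of the number of peaks."""
    x = start
    for _ in range(iters):
        k = [wi * math.exp(-0.5 * ((x - p) / h) ** 2)
             for wi, p in zip(weights, particles)]
        x = sum(ki * p for ki, p in zip(k, particles)) / sum(k)
    return x

# Two particle clusters, i.e. two targets
particles = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
weights = [1.0] * 6
m1 = mean_shift_mode(particles, weights, start=0.0)   # climbs to the peak near 1
m2 = mean_shift_mode(particles, weights, start=6.0)   # climbs to the peak near 5
```

In the PHD filter the weights would be the particle weights, so each converged mode is an extracted target state.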
In this paper, we propose a new method for image retrieval that combines collage error in the fractal domain and Hu moment invariants with a statistical method, variable-bandwidth kernel density estimation (KDE). The proposed method is called CHK (KDE of collage error and Hu moments), and it is tested on the Vistex texture database with 640 natural images. Experimental results show that the average retrieval rate (ARR) reaches 78.18%, which demonstrates that the proposed method performs better than using either feature alone, as well as the commonly used histogram method, in both retrieval rate and retrieval time.
Let {X_n, n ≥ 1} be a strictly stationary sequence of random variables, which are either associated or negatively associated, and let f(·) be their common density. In this paper, the author shows a central limit theorem for a kernel estimate of f(·) under certain regularity conditions.
A kernel density estimator is proposed when the data are subject to censorship in the multivariate case. The asymptotic normality, strong convergence, and the asymptotically optimal bandwidth minimizing the mean square error of the estimator are studied.
Let (X, Y) be an R^d × R^1 valued random vector, let (X_1, Y_1), ..., (X_n, Y_n) be a random sample drawn from (X, Y), and let E|Y| < ∞. The regression function m(x) = E(Y|X = x) for x ∈ R^d is estimated by the kernel estimate m_n(x) = Σ_{i=1}^n Y_i K((x − X_i)/h_n) / Σ_{i=1}^n K((x − X_i)/h_n), where h_n is a positive number depending upon n only and K is a given nonnegative function on R^d. In this paper, we study the L_p convergence rate of the kernel estimate m_n(x) of m(x) under suitable conditions, and improve and extend the results of Wei Lansheng.
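This kernel regression estimate (the Nadaraya–Watson form) is a locally weighted average of the Y_i, with weights K((x − X_i)/h_n). A minimal sketch with a Gaussian choice of K and invented data, for d = 1:

```python
import math

def kernel_regression(xy, h):
    """Kernel regression estimate m_n(x) = sum_i Y_i * K((x - X_i)/h) /
    sum_i K((x - X_i)/h), here with a Gaussian K; the abstract allows any
    nonnegative K on R^d."""
    def m(x):
        k = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi, _ in xy]
        return sum(ki * yi for ki, (_, yi) in zip(k, xy)) / sum(k)
    return m

pairs = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]   # toy sample from Y = X
m = kernel_regression(pairs, h=0.3)
```

Since the toy data lie on the line y = x, the estimate at a point equidistant from symmetric neighbors recovers the line value exactly; the L_p convergence rate studied in the paper quantifies how fast such estimates approach m(x) as n grows and h_n shrinks.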
We prove that the density function of the gradient of a sufficiently smooth function, obtained via a random variable transformation of a uniformly distributed random variable, is increasingly closely approximated by the normalized power spectrum of the function as the free parameter tends to its limit. The frequencies act as gradient histogram bins. The result is shown using the stationary phase approximation and standard integration techniques, and requires proper ordering of limits. We highlight a relationship with the well-known characteristic function approach to density estimation, and detail why our result is distinct from this method. Our framework for computing the joint density of gradients is extremely fast and straightforward to implement, requiring a single Fourier transform operation without explicitly computing the gradients.
Traditionally, it is widely accepted that measurement error obeys the normal distribution. However, in this paper a new idea is proposed: the error in digitized data, a major derived data source in GIS, does not obey the normal distribution but a p-norm distribution with a determinate parameter. Assuming that the error is random and has the same statistical properties, the probability density functions of the normal distribution, the Laplace distribution, and the p-norm distribution are derived based on the arithmetic mean axiom, the median axiom, and the p-median axiom, respectively, which shows that the normal distribution is only one of these distributions and not the only one. Based on this idea, distribution fitness tests such as the skewness and kurtosis coefficient tests, the Pearson chi-square (χ²) test, and the Kolmogorov test are conducted for digitized data. The results show that the error in map digitization obeys a p-norm distribution whose parameter is close to 1.60. A least p-norm estimation and the least squares estimation of digitized data are further analyzed, showing that the least p-norm adjustment is better than the least squares adjustment for digitized data processing in GIS.
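The p-norm (generalized error) family interpolates between the Laplace (p = 1) and normal (p = 2) densities, which is why a fitted parameter near 1.60 sits strictly between the two classical error models. A sketch of one standard parameterization of its density (the exact normalization used in the paper may differ):

```python
import math

def p_norm_pdf(x, mu=0.0, sigma=1.0, p=1.6):
    """Density of a p-norm (generalized error) distribution:
    f(x) = p**(1 - 1/p) / (2 * sigma * Gamma(1/p)) * exp(-|x - mu|**p / (p * sigma**p)).
    p = 2 recovers the normal density, p = 1 the Laplace density;
    the paper's fitted digitization-error parameter is close to 1.60."""
    c = p ** (1 - 1 / p) / (2 * sigma * math.gamma(1 / p))
    return c * math.exp(-abs(x - mu) ** p / (p * sigma ** p))
```

Checking the two limiting cases (normal at p = 2, Laplace at p = 1) is a quick way to validate the normalization constant.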
In this paper, we study the strong uniform consistency of kernel estimates, with random window width, of a density function and its derivatives, under the condition that the sequence {X_n} of samples consists of identically distributed Φ-mixing random variables.
The application of frequency distribution statistics to data provides an objective means to assess the nature of the data distribution and the viability of numerical models that are used to visualize and interpret data. Two commonly used tools are kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KDX to facilitate the visualization of data and the use of these numerical tools.
The accurate estimation of road traffic states can provide decision support for travelers and traffic managers. In this work, an algorithm based on kernel k-nearest neighbor (kernel-KNN) matching of road traffic spatial characteristics is presented to estimate road traffic states. Firstly, representative road traffic state data were extracted to establish the reference sequences of road traffic running characteristics (RSRTRC). Secondly, the spatial road traffic state data sequence was selected and the kernel function was constructed, with which the spatial road traffic data sequence could be mapped into a high-dimensional feature space. Thirdly, the referenced and current spatial road traffic data sequences were extracted and the Euclidean distances between them in the feature space were obtained. Finally, the road traffic states were estimated from weighted averages of the selected k road traffic states, which corresponded to the nearest Euclidean distances. Several typical links in Beijing were adopted for case studies. The final experimental results show that the accuracy of this algorithm for estimating speed and volume is 95.27% and 91.32%, respectively, which proves that this road traffic state estimation approach based on kernel-KNN matching of road traffic spatial characteristics is feasible and can achieve high accuracy.
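The final estimation step — take the k historical state sequences nearest to the current one and form a distance-weighted average of their known states — can be sketched directly. This sketch works in the raw feature space rather than the kernel-induced high-dimensional space the paper uses, and all sequences and speeds are invented:

```python
import math

def kernel_knn_estimate(current, history, k=3):
    """Estimate the current traffic state as the inverse-distance-weighted
    average of the k historical records whose feature sequences are nearest
    to the current sequence (Euclidean distance; the paper measures distance
    in a kernel-mapped feature space instead)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(history, key=lambda rec: dist(current, rec[0]))[:k]
    weights = [1.0 / (dist(current, seq) + 1e-9) for seq, _ in ranked]
    return sum(wi * state for wi, (_, state) in zip(weights, ranked)) / sum(weights)

# (feature sequence, known speed in km/h) -- hypothetical reference records
history = [([30.0, 28.0], 29.0), ([31.0, 29.0], 30.0),
           ([60.0, 62.0], 61.0), ([10.0, 12.0], 11.0)]
speed = kernel_knn_estimate([30.5, 28.5], history, k=2)
```

Replacing `dist` with a kernel-based distance is what turns this plain KNN matcher into the kernel-KNN method the abstract describes.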
A kernel-type estimator of the quantile function Q(p) = inf{t : F(t) ≥ p}, 0 ≤ p ≤ 1, is proposed based on the kernel smoother when the data are subject to random truncation. Bahadur-type representations of the kernel smooth estimator are established, and from these representations the authors show that the estimator is strongly consistent, asymptotically normal, and weakly convergent.
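The basic kernel-smoothing idea behind such estimators is to average the empirical quantile function Q_n(t) with kernel weights concentrated around p, rather than reading off a single order statistic. The sketch below shows only this smoothing step for complete data — the random-truncation adjustment, which is the paper's actual setting, is omitted; the grid resolution and bandwidth are invented.

```python
import math

def kernel_quantile(samples, p, h=0.1):
    """Kernel-smoothed quantile: average the empirical quantile function
    Q_n(t) over a grid of t with Gaussian weights K((t - p)/h)."""
    xs = sorted(samples)
    n = len(xs)
    grid = [(i + 0.5) / 200 for i in range(200)]
    w = [math.exp(-0.5 * ((t - p) / h) ** 2) for t in grid]
    q = [xs[min(n - 1, int(t * n))] for t in grid]   # empirical Q_n(t)
    return sum(wi * qi for wi, qi in zip(w, q)) / sum(w)

q50 = kernel_quantile([1, 2, 3, 4, 5, 6, 7, 8, 9], 0.5)   # smoothed median
```

Smoothing makes the estimator a continuous function of p, which is what the Bahadur-type representations in the paper are proved for.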
In this paper, two kernel density estimators are introduced and investigated. In order to reduce bias, we subtract an estimated bias term from the ordinary kernel density estimator. The second proposed density estimator is a geometric extrapolation of the first bias-reduced estimator. Theoretical properties such as bias, variance, and mean squared error are investigated for both estimators. To observe their finite-sample performance, a Monte Carlo simulation study based on small to moderately large samples is presented.
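Geometric bias reduction exploits the fact that the leading KDE bias is O(h²): combining estimates at two bandwidths multiplicatively cancels that term. The sketch below uses the classical Terrell–Scott style combination f_h^{4/3} · f_{2h}^{-1/3}, which may differ in detail from the paper's second estimator; the data and bandwidth are invented.

```python
import math

def kde(samples, h, x):
    """Plain Gaussian KDE evaluated at x."""
    n = len(samples)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2)
               for s in samples) / (n * h * math.sqrt(2 * math.pi))

def geometric_extrapolation(samples, h, x):
    """Geometric bias reduction: since bias(f_h) = O(h^2), the combination
    f_h**(4/3) * f_(2h)**(-1/3) cancels the leading bias term."""
    return kde(samples, h, x) ** (4 / 3) * kde(samples, 2 * h, x) ** (-1 / 3)

data = [-0.1, 0.0, 0.1]                 # tightly clustered toy sample
base = kde(data, 0.5, 0.0)              # plain estimate at the mode
g = geometric_extrapolation(data, 0.5, 0.0)
```

At a mode the plain KDE is biased downward (f'' < 0), so the extrapolated estimate exceeds the plain one there — the direction of correction the bias-reduction construction predicts.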
Funding: Supported by the Science and Technology Project of the State Grid Corporation of China "Research on Active Development Planning Technology and Comprehensive Benefit Analysis Method for Regional Smart Grid Comprehensive Demonstration Zone" and the National Natural Science Foundation of China (51607104).
Funding: Research supported by the National Natural Science Foundation of China.
Funding: Supported by the National Key Research and Development Program for Young Scientists (No. 2021YFC2900400), the Postdoctoral Fellowship Program of the China Postdoctoral Science Foundation (No. GZB20230914), the National Natural Science Foundation of China (No. 52304123), the China Postdoctoral Science Foundation (No. 2023M730412), and the Chongqing Outstanding Youth Science Foundation Program (No. CSTB2023NSCQ-JQX0027).
Funding: Beijing Municipal Social Science Foundation (22GLC062).
Funding: Project (61101185) supported by the National Natural Science Foundation of China; Project (2011AA1221) supported by the National High Technology Research and Development Program of China.
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. NS2012093).
文摘In this paper, we propose a new method that combines collage error in fractal domain and Hu moment invariants for image retrieval with a statistical method - variable bandwidth Kernel Density Estimation (KDE). The proposed method is called CHK (KDE of Collage error and Hu moment) and it is tested on the Vistex texture database with 640 natural images. Experimental results show that the Average Retrieval Rate (ARR) can reach into 78.18%, which demonstrates that the proposed method performs better than the one with parameters respectively as well as the commonly used histogram method both on retrieval rate and retrieval time.
Abstract: Let {X_n, n ≥ 1} be a strictly stationary sequence of random variables that are either associated or negatively associated, and let f(·) be their common density. In this paper, the author shows a central limit theorem for a kernel estimate of f(·) under certain regularity conditions.
Abstract: A kernel density estimator is proposed for the multivariate case when the data are subject to censoring. The asymptotic normality, strong convergence, and asymptotically optimal bandwidth minimizing the mean square error of the estimator are studied.
Abstract: Let (X, Y) be an R^d × R^1 valued random vector, let (X_1, Y_1), …, (X_n, Y_n) be a random sample drawn from (X, Y), and let E|Y| < ∞. The regression function m(x) = E(Y|X = x) for x ∈ R^d is estimated by m_n(x) = [Σ_{i=1}^n Y_i K((x − X_i)/h_n)] / [Σ_{i=1}^n K((x − X_i)/h_n)], where h_n is a positive number depending on n only, and K is a given nonnegative function on R^d. In this paper, we study the L_p convergence rate of the kernel estimate m_n(x) of m(x) under suitable conditions, and improve and extend the results of Wei Lansheng.
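The estimator m_n(x) above is the classical Nadaraya-Watson kernel regression; a direct NumPy transcription for d = 1, with a Gaussian choice of K and a synthetic m(x) = x^2 test function (both assumptions for illustration), is:

```python
import numpy as np

def nw_estimate(x, X, Y, h):
    """Nadaraya-Watson estimate m_n(x): kernel-weighted average of the Y_i."""
    k = np.exp(-0.5 * ((x - X) / h) ** 2)   # K((x - X_i)/h), Gaussian kernel
    return (k * Y).sum() / k.sum()

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, 500)
Y = X ** 2 + rng.normal(0, 0.1, 500)        # true regression m(x) = x^2, plus noise
est = nw_estimate(0.5, X, Y, h=0.2)         # should be close to m(0.5) = 0.25
```

The bandwidth h_n governs the usual bias-variance trade-off that drives the L_p convergence rates studied in the paper.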
Abstract: We prove that the density function of the gradient of a sufficiently smooth function, obtained via a random-variable transformation of a uniformly distributed random variable, is increasingly closely approximated by the normalized power spectrum of a unit-modulus complex exponential of the function as the free parameter tends to zero. The frequencies act as gradient histogram bins. The result is shown using the stationary phase approximation and standard integration techniques, and requires a proper ordering of limits. We highlight a relationship with the well-known characteristic-function approach to density estimation, and detail why our result is distinct from that method. Our framework for computing the joint density of gradients is extremely fast and straightforward to implement, requiring a single Fourier transform operation without explicitly computing the gradients.
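A minimal 1D sketch of the flavor of this result (an illustration under assumed notation, not the paper's algorithm): for F(x) = x^2/2 on [0, 1] the gradient F'(x) = x is uniform on (0, 1), so the normalized power spectrum of exp(iF/τ), for a small assumed scale parameter τ, should spread its mass evenly over the frequency bins that correspond to gradients in (0, 1):

```python
import numpy as np

tau = 1e-3                                   # small free parameter (assumed scale)
x = np.linspace(0, 1, 2 ** 16, endpoint=False)
F = 0.5 * x ** 2                             # smooth test function, F'(x) = x
psi = np.exp(1j * F / tau)                   # unit-modulus complex exponential
spec = np.abs(np.fft.fft(psi)) ** 2          # power spectrum ...
spec /= spec.sum()                           # ... normalized to a probability mass
# stationary phase: DFT bin at frequency f (cycles/unit) picks up gradient 2*pi*f*tau
grads = 2 * np.pi * np.fft.fftfreq(len(x), d=x[1] - x[0]) * tau
```

No derivative of F is ever computed: the single FFT does all the work, which is the speed advantage the abstract emphasizes.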
Abstract: Traditionally, it is widely accepted that measurement error obeys the normal distribution. In this paper, however, a new idea is proposed: the error in digitized data, a major derived data source in GIS, does not obey the normal distribution but rather the p-norm distribution with a determinate parameter. Assuming that the error is random and has the same statistical properties, the probability density functions of the normal distribution, the Laplace distribution, and the p-norm distribution are derived based on the arithmetic-mean axiom, the median axiom, and the p-median axiom respectively, which shows that the normal distribution is only one of these distributions, not the only possibility. Based on this idea, distribution fitness tests such as the skewness and kurtosis coefficient tests, Pearson's chi-square (χ²) test, and the Kolmogorov test are conducted on digitized data. The results show that the error in map digitization obeys the p-norm distribution with a parameter close to 1.60. A least p-norm estimation and the least squares estimation of digitized data are further analyzed, showing that the least p-norm adjustment is better than the least squares adjustment for digitized-data processing in GIS.
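The p-norm (generalized Gaussian) density and the least p-norm location estimate can be sketched as follows; the density normalization is the standard generalized-Gaussian form, and the IRLS solver is one common way (an assumption, not necessarily the paper's adjustment procedure) to minimize the sum of p-th-power residuals:

```python
import numpy as np
from math import gamma

def pnorm_pdf(x, mu=0.0, sigma=1.0, p=1.6):
    """p-norm error density: p=2 gives the normal, p=1 the Laplace distribution."""
    c = p ** (1 - 1 / p) / (2 * sigma * gamma(1 / p))
    return c * np.exp(-np.abs(x - mu) ** p / (p * sigma ** p))

def least_pnorm_location(x, p=1.6, iters=100):
    """IRLS for argmin_mu sum |x_i - mu|^p (a least p-norm adjustment)."""
    mu = np.median(x)
    for _ in range(iters):
        r = np.abs(x - mu) + 1e-12       # residual magnitudes (regularized)
        w = r ** (p - 2)                 # IRLS weights
        mu = (w * x).sum() / w.sum()
    return mu
```

With p between 1 and 2, the weights down-weight large residuals relative to least squares, which is why a p near 1.60 tolerates the heavier-than-normal tails found in digitized map data.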
Funding: Supported by the Natural Science Foundation of the Henan Provincial Commission of Education
Abstract: In this paper, we study the strong uniform consistency of kernel estimates, with random window widths, of a density function and its derivatives, under the condition that the sequence {X_n} of samples consists of identically distributed Φ-mixing random variables.
Abstract: The application of frequency-distribution statistics to data provides an objective means to assess the nature of the data distribution and the viability of numerical models used to visualize and interpret the data. Two commonly used tools are kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KDX to facilitate the visualization of data and the use of these numerical tools.
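KDX itself is a Java application; the following Python snippet only sketches the second tool it bundles, the inverse-variance weighted mean with its reduced chi-squared (MSWD). The numeric values are made-up illustrative measurements:

```python
import numpy as np

def weighted_mean_mswd(values, sigmas):
    """Inverse-variance weighted mean, its standard error, and reduced chi-squared."""
    w = 1.0 / sigmas ** 2
    mean = (w * values).sum() / w.sum()
    se = np.sqrt(1.0 / w.sum())
    # reduced chi-squared (MSWD): ~1 means scatter is consistent with stated errors
    mswd = ((values - mean) ** 2 * w).sum() / (len(values) - 1)
    return mean, se, mswd

vals = np.array([10.1, 9.9, 10.0, 10.2])   # hypothetical measurements
sigs = np.array([0.1, 0.1, 0.1, 0.1])      # their 1-sigma uncertainties
mean, se, mswd = weighted_mean_mswd(vals, sigs)
```

An MSWD well above 1 signals scatter beyond the stated uncertainties, which is exactly the model-viability check the abstract describes.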
Funding: Projects (LQ16E080012, LY14F030012) supported by the Zhejiang Provincial Natural Science Foundation, China; Project (61573317) supported by the National Natural Science Foundation of China; Project (2015001) supported by the Open Fund for a Key-Key Discipline of Zhejiang University of Technology, China
Abstract: Accurate estimation of road traffic states can support decision making for travelers and traffic managers. In this work, an algorithm based on kernel k-nearest-neighbor (KNN) matching of road-traffic spatial characteristics is presented to estimate road traffic states. Firstly, representative road-traffic state data were extracted to establish the reference sequences of road traffic running characteristics (RSRTRC). Secondly, the spatial road-traffic state data sequence was selected and a kernel function was constructed, with which the spatial road-traffic data sequence could be mapped into a high-dimensional feature space. Thirdly, the reference and current spatial road-traffic data sequences were extracted and the Euclidean distances between them in the feature space were obtained. Finally, the road traffic states were estimated as weighted averages of the k selected road traffic states corresponding to the nearest Euclidean distances. Several typical links in Beijing were adopted for case studies. The experimental results show that the accuracy of this algorithm in estimating speed and volume is 95.27% and 91.32% respectively, which proves that this road-traffic state estimation approach based on kernel-KNN matching of spatial characteristics is feasible and achieves high accuracy.
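The distance computation in the second and third steps can be sketched with the kernel trick: for a kernel k, the squared feature-space distance between sequences a and b is k(a,a) + k(b,b) − 2k(a,b), which for an RBF kernel (an assumption here; this abstract does not fix the kernel) reduces to 2 − 2k(a,b). A toy sketch with made-up reference sequences of link speeds:

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    """RBF kernel between two road-traffic data sequences."""
    return np.exp(-gamma * ((a - b) ** 2).sum())

def kernel_knn_estimate(query, refs, states, k=3, gamma=0.5):
    """Feature-space distances via the kernel trick, then a weighted k-NN average."""
    d2 = np.array([2.0 - 2.0 * rbf(query, r, gamma) for r in refs])  # ||phi(q)-phi(r)||^2
    idx = np.argsort(d2)[:k]                   # k nearest reference sequences
    w = 1.0 / (d2[idx] + 1e-9)                 # closer neighbours weigh more
    return (w * states[idx]).sum() / w.sum()

# hypothetical two-sample reference sequences and their associated link speeds (km/h)
refs = np.array([[60., 58.], [30., 32.], [59., 61.], [31., 29.], [45., 44.]])
states = np.array([60., 31., 60., 30., 45.])
speed = kernel_knn_estimate(np.array([59., 60.]), refs, states)
```

Because the query resembles the free-flow reference sequences, the weighted average lands near their speeds rather than the congested ones.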
Funding: Zhou's research was partially supported by the NNSF of China (10471140, 10571169); Wu's research was partially supported by the NNSF of China (0571170)
Abstract: A kernel-type estimator of the quantile function Q(p) = inf{t : F(t) ≥ p}, 0 ≤ p ≤ 1, is proposed based on the kernel smoother when the data are subject to random truncation. Bahadur-type representations of the kernel smooth estimator are established, and from these representations the authors show that the estimator is strongly consistent, asymptotically normal, and weakly convergent.
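Ignoring the truncation mechanism that is the paper's real subject, the kernel-smoothing idea itself, replacing the jumpy empirical quantile with a kernel-weighted average of order statistics, can be sketched as follows (the Gaussian weights and plotting positions are illustrative assumptions):

```python
import numpy as np

def kernel_quantile(x, p, h=0.05):
    """Kernel-smoothed sample quantile: a weighted average of order statistics,
    with Gaussian weights centred at p over the plotting positions (i - 0.5)/n."""
    xs = np.sort(x)
    n = len(xs)
    pos = (np.arange(1, n + 1) - 0.5) / n      # plotting positions in (0, 1)
    w = np.exp(-0.5 * ((pos - p) / h) ** 2)    # kernel weights centred at p
    return (w * xs).sum() / w.sum()

rng = np.random.default_rng(3)
x = rng.normal(0, 1, 2000)
med = kernel_quantile(x, 0.5)                  # should be close to 0
```

Averaging over neighbouring order statistics smooths out the step discontinuities of the empirical quantile function, which is what makes Bahadur-type representations tractable for the kernel version.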
Abstract: In this paper, two kernel density estimators are introduced and investigated. To reduce bias, we subtract an estimated bias term from the ordinary kernel density estimator. The second proposed density estimator is a geometric extrapolation of the first bias-reduced estimator. Theoretical properties such as bias, variance, and mean squared error are investigated for both estimators. To observe their finite-sample performance, a Monte Carlo simulation study based on small to moderately large samples is presented.
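One concrete way to realize the first estimator is to subtract the leading Taylor bias term (h²/2)·f″, with f″ itself estimated by differentiating the Gaussian-kernel KDE; for the second, one standard geometric extrapolation form is f_h^{4/3}·f_{2h}^{-1/3}. Both exact forms are assumptions here, the paper's own constructions may differ:

```python
import numpy as np

SQ2PI = np.sqrt(2 * np.pi)

def kde(x, data, h):
    """Ordinary Gaussian-kernel density estimator on the points x."""
    u = (x[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).mean(1) / (h * SQ2PI)

def kde_second_deriv(x, data, h):
    """Second derivative of the Gaussian-kernel KDE (phi''(u) = (u^2 - 1) phi(u))."""
    u = (x[:, None] - data[None, :]) / h
    return ((u ** 2 - 1) * np.exp(-0.5 * u ** 2)).mean(1) / (h ** 3 * SQ2PI)

def bias_reduced_kde(x, data, h):
    """Subtract the estimated leading bias term (h^2/2) f'' from the ordinary KDE."""
    return kde(x, data, h) - 0.5 * h ** 2 * kde_second_deriv(x, data, h)

def geometric_extrapolation_kde(x, data, h):
    """Geometric extrapolation f_h^(4/3) * f_(2h)^(-1/3) (assumed form)."""
    f1 = np.maximum(kde(x, data, h), 1e-12)
    f2 = np.maximum(kde(x, data, 2 * h), 1e-12)
    return f1 ** (4 / 3) * f2 ** (-1 / 3)

rng = np.random.default_rng(4)
data = rng.normal(0, 1, 2000)
x0 = np.array([0.0])                 # evaluate at the mode, true density 1/sqrt(2*pi)
plain = kde(x0, data, 0.5)[0]
corrected = bias_reduced_kde(x0, data, 0.5)[0]
geo = geometric_extrapolation_kde(x0, data, 0.5)[0]
```

At the mode f″ < 0, so the plain KDE underestimates the peak and the correction adds mass back, which is the additive route to bias reduction; the geometric version multiplies estimates at two bandwidths instead, and a simulation like the paper's Monte Carlo study would compare their finite-sample mean squared errors.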