This paper takes the geographic coordinates of homestays in Hebei Province and the reviews of those homestays on Ctrip as its research data, and studies the homestays from two perspectives: geographic space and sentiment-based satisfaction. First, spatial analysis methods including nearest neighbor analysis and kernel density analysis are used to characterize the spatial distribution of homestays in Hebei Province. Next, an LDA-LSTM model is applied to the review text: the LDA topic extraction model, Word2Vec word vectorization, and the PageRank algorithm are combined to carry out a second round of mining of homestay topic words. Finally, an LSTM neural network model is used to compute the satisfaction score for each topic, and the factors that affect guest satisfaction are analyzed in detail.
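As a rough illustration of the nearest neighbor analysis mentioned in the abstract above, the following minimal sketch computes the average nearest neighbor ratio (observed mean nearest-neighbor distance divided by the value expected under complete spatial randomness). It assumes planar coordinates and a known study-area size; the function name `nearest_neighbor_ratio` and the toy grid are hypothetical, and real longitude/latitude data would first need to be projected. It is not the authors' implementation.

```python
import numpy as np

def nearest_neighbor_ratio(points, area):
    """Average nearest neighbor ratio for 2-D points in a region of given area.

    Values below 1 suggest clustering, values above 1 suggest dispersion.
    Assumes planar (projected) coordinates, not raw longitude/latitude.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances; mask the diagonal so that a point
    # is never counted as its own nearest neighbor.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    observed = dist.min(axis=1).mean()
    # Expected mean nearest-neighbor distance for a random pattern
    # with the same point density n / area.
    expected = 0.5 / np.sqrt(n / area)
    return observed / expected

# Toy data (hypothetical): a regular 10 x 10 grid on the unit square,
# and the same grid squeezed into one corner of the unit square.
g = (np.arange(10) + 0.5) / 10.0
grid = np.array([(x, y) for x in g for y in g])
r_grid = nearest_neighbor_ratio(grid, area=1.0)             # dispersed pattern
r_clustered = nearest_neighbor_ratio(grid * 0.1, area=1.0)  # clustered pattern
print(r_grid, r_clustered)
```

The pairwise distance matrix keeps the sketch short but needs memory quadratic in the number of points; a spatial index would be the usual choice for a large data set.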
Classification is one of the fundamental problems in data mining, machine learning, and related fields. However, most classification methods address only vector-valued samples and pay little attention to set-valued data samples, which are common in practice. This paper proposes an unsupervised clustering algorithm based on the Wasserstein distance (Wk-means): an entropy-regularized optimal transport model measures the distance between set-valued data points, and this distance is combined with the idea of clustering to design a Wk-means method for set-valued data. To verify its effectiveness, experiments are first conducted on several public data sets. The results confirm that Wk-means performs well on set-valued data with many samples, categories, and features, and statistical tests show that it differs significantly from the other algorithms compared. The method is then applied to the Fuyang River water quality data set, where the results likewise show that Wk-means partitions water quality categories more accurately and runs more efficiently than traditional clustering algorithms. Wk-means thus performs well in classifying set-valued water quality data and can provide valuable decision support for environmental monitoring and management.
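To make the distance used in the abstract above more concrete, here is a minimal sketch of an entropy-regularized optimal transport cost between two set-valued samples, solved with Sinkhorn iterations. The abstract does not say which solver, cost function, or weights the authors use, so the squared Euclidean cost, the uniform weights, the parameters `eps` and `n_iter`, and the function name `sinkhorn_cost` are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_cost(X, Y, eps=0.1, n_iter=200):
    """Entropy-regularized OT cost between point sets X (n x d) and Y (m x d).

    Assumptions: uniform weights on the points of each set and a squared
    Euclidean ground cost. Returns the transport cost <P, C> of the
    regularized plan P.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    a = np.full(len(X), 1.0 / len(X))   # uniform weights on X
    b = np.full(len(Y), 1.0 / len(Y))   # uniform weights on Y
    # Ground cost matrix and the corresponding Gibbs kernel.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        # Alternately rescale so the plan matches each marginal.
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]     # regularized transport plan
    return float((P * C).sum())

# Toy set-valued samples (hypothetical): A and B are drawn around the
# same location, while F is drawn around a shifted location.
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(20, 2))
B = rng.normal(scale=0.1, size=(20, 2))
F = rng.normal(scale=0.1, size=(20, 2)) + np.array([1.0, 1.0])
cost_near = sinkhorn_cost(A, B)
cost_far = sinkhorn_cost(A, F)
print(cost_near, cost_far)
```

A k-means style loop could then assign each set-valued sample to the cluster representative with the smallest such cost. For costs that are large relative to `eps`, the kernel can underflow, and a log-domain variant of the iteration would be safer.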
Clifford-valued functions annihilated by the Dirac operator are called regular functions; they generalize holomorphic functions to the non-commutative setting of higher-dimensional spaces. Biregular functions are regular functions of two variables. The growth of regular functions is one of the important problems in Clifford analysis. This paper investigates the growth of biregular functions on the unit ball. Drawing on Wiman-Valiron theory and using the Taylor series of biregular functions, the growth order of biregular functions is studied and a generalized Lindelöf-Pringsheim theorem is obtained, which relates the growth order to the Taylor series.
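For orientation, the classical one-variable Lindelöf-Pringsheim theorem that the abstract above generalizes can be written as follows. This is the standard statement for entire functions of one complex variable, not the paper's result; the biregular version on the unit ball is not given in the abstract and is not reproduced here. The symbols $a_n$, $M(r)$, and $\rho$ are notation introduced for this illustration.

```latex
% Classical statement for an entire function f(z) = \sum_{n \ge 0} a_n z^n,
% with maximum modulus M(r) = \max_{|z| = r} |f(z)|.
% Terms with a_n = 0 are taken to contribute 0 to the second limsup.
\[
  \rho
  \;=\; \limsup_{r \to \infty} \frac{\log \log M(r)}{\log r}
  \;=\; \limsup_{n \to \infty} \frac{n \log n}{-\log |a_n|}
\]
```

The first expression defines the growth order through the maximum modulus; the second recovers it from the Taylor coefficients alone, which is the kind of relation the abstract describes.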