Abstract
With the growth of big data, time series data are increasing sharply in both volume and dimensionality. Extreme points play an important role in time series analysis and prediction, for example by reducing the amount of data that has to be analyzed. However, traditional extreme point extraction methods tend to miss extreme points or extract them incorrectly. To address this, the paper first normalizes the time series data and then, based on the special properties of extreme points, improves the extraction algorithm for three cases: crossover intervals, data with an obvious trend, and data without an obvious trend. It analyzes the advantages of the improved equal-length-interval extraction algorithm and compares, through experiments, the extraction results of the optimized trend-adaptive algorithm. The results show that the algorithm adapts to time series with an obvious trend, without an obvious trend, and with abrupt local changes. Experiments on stock market indices, exchange rates, and GDP data sets indicate that the algorithm has a degree of general applicability. By building both the full extreme point sequence and a threshold-compressed extreme point sequence, and by measuring the extreme point compression rate and loss rate, the experiments improve the scalability and extensibility of the algorithm, so that it can better meet the needs of processing and studying different types of time series data.
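The pipeline summarized in the abstract (normalize, extract extreme points, compress with a threshold, report a compression rate) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the crossover-interval or trend-adaptive rules, so the sketch uses plain min-max normalization and simple turning-point detection. The function names, the amplitude-based threshold rule, and the compression-rate formula are all assumptions made for illustration.

```python
def normalize(series):
    """Min-max scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def extreme_points(series):
    """Indices of both endpoints plus every turning point.

    Flat runs are skipped when comparing directions, so a plateau peak
    such as 1, 2, 2, 1 is still reported (at the last index of the plateau).
    """
    n = len(series)
    if n <= 2:
        return list(range(n))
    idx = [0]
    prev_dir = 0  # +1 rising, -1 falling, 0 not yet known
    for i in range(1, n):
        diff = series[i] - series[i - 1]
        direction = (diff > 0) - (diff < 0)
        if direction == 0:
            continue
        if prev_dir != 0 and direction != prev_dir:
            idx.append(i - 1)  # direction flipped: i - 1 is a turning point
        prev_dir = direction
    idx.append(n - 1)
    return idx


def compress(series, idx, threshold):
    """Assumed rule: keep an interior extreme point only if it differs from
    the previously kept point by at least `threshold`; always keep endpoints."""
    if len(idx) < 2:
        return list(idx)
    kept = [idx[0]]
    for i in idx[1:-1]:
        if abs(series[i] - series[kept[-1]]) >= threshold:
            kept.append(i)
    kept.append(idx[-1])
    return kept


def compression_rate(n_original, n_kept):
    """Assumed definition: fraction of the original points that were discarded."""
    return 1.0 - n_kept / n_original


if __name__ == "__main__":
    raw = [3, 5, 4, 4.2, 4.1, 8, 2, 2, 6]
    scaled = normalize(raw)
    full = extreme_points(scaled)
    reduced = compress(scaled, full, threshold=0.1)
    print(full, reduced, compression_rate(len(raw), len(reduced)))
```

Raising `threshold` discards more small fluctuations and raises the compression rate, at the cost of more lost detail. That trade-off is what a loss-rate measure would quantify, but the abstract does not define the loss rate, so it is left out of the sketch.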
Authors
卢民荣
郑建宁
Lu Minrong; Zheng Jianning (School of Accounting, Fujian Jiangxia University, Fuzhou 350108, China; Finance and Accounting Research Center, Fujian Jiangxia University, Fuzhou 350108, China; Fujian Yili Electric Power Technology Co., Ltd., Fuzhou 350003, China)
Source
《统计与决策》
CSSCI
PKU Core Journal (北大核心)
2021, No. 20, pp. 39-43 (5 pages)
Statistics & Decision
Funding
Major Project of the Fujian Provincial Social Science Fund (FJ2019JDZ053, FJ2020JDZ068, FJ2020JDZ070)
Fujian Provincial Finance-Funded Research Project (2021-11).
Keywords
time series
extreme point
data compression
pretreatment