An Improved Algorithm for Session Identification in Data Preparation of Web Usage Mining
Abstract: This paper discusses data preprocessing for mining Web users' access patterns and proposes an improved algorithm for session identification in that preprocessing stage. The method constructs sessions using three factors: (1) a session duration threshold set from prior knowledge; (2) a threshold on the time interval between adjacent page visits, derived from the statistical distribution of page access times; and (3) page importance, determined from page content and site structure. Experimental results show that, compared with traditional session identification based on a single method, the proposed approach identifies sessions accurately and is more reasonable and effective.
Source: Science Mosaic (《科技广场》), 2008, No. 7, pp. 85-87 (3 pages)
Keywords: web usage mining; web log data mining; data preparation; session identification; threshold; website structure
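
The abstract names three factors but does not say how the paper combines them, so the following is only a minimal sketch of one plausible combination, not the authors' algorithm. The function name `split_sessions`, the default thresholds (1800 s and 600 s), the importance scores in the range 0 to 1, and the rule that an important page tolerates a longer idle gap are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, float]  # (url, timestamp in seconds) for a single user


def split_sessions(
    hits: List[Hit],
    session_limit: float = 1800.0,   # assumed total-session threshold (factor 1)
    gap_limit: float = 600.0,        # assumed adjacent-page gap threshold (factor 2)
    importance: Optional[Dict[str, float]] = None,  # assumed page scores (factor 3)
    boost: float = 2.0,              # assumed gap multiplier for important pages
) -> List[List[Hit]]:
    """Split one user's time-ordered hits into sessions.

    A new session starts when either the elapsed time since the session's
    first hit exceeds session_limit, or the gap since the previous hit
    exceeds the allowed gap. The allowed gap is widened for pages whose
    importance score is at least 0.5 (a hypothetical rule).
    """
    scores = importance or {}
    sessions: List[List[Hit]] = []
    current: List[Hit] = []
    for url, ts in hits:
        if current:
            first_ts = current[0][1]
            prev_url, prev_ts = current[-1]
            important = scores.get(prev_url, 0.0) >= 0.5
            allowed_gap = gap_limit * (boost if important else 1.0)
            too_long = ts - first_ts > session_limit
            too_idle = ts - prev_ts > allowed_gap
            if too_long or too_idle:
                sessions.append(current)
                current = []
        current.append((url, ts))
    if current:
        sessions.append(current)
    return sessions
```

In this sketch, a user who lingers on a page marked as important is not cut off as early as one who idles on a navigation page. In practice the gap threshold would come from the observed distribution of page access times, as the abstract describes, rather than from a fixed default.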
